Hi Tom,

Thank you for starting the discussion.

> Given all the flap about txid, this surely mustn't go in without public
> review first ;-). So, here is a submission from Sergey Karpov to fill
> in the lack of any working code examples for user-written tsearch
> parsers and dictionaries.
>
> I will be mostly off-line for the next day or so and don't have time to
> work on this more now, but here are a few comments:
>
> * It seems a bit odd to put multiple independent contrib modules under a
> single subfolder. I'd be inclined to drop the ts_pack layer and just
> make the dictionaries and parser be top-level contrib modules.

Yes, I understand your position, as well as Magnus' complaints. However,
giving every piece of code its own contrib module is not the best
solution, as the majority of it is no more than examples. dict_regex, on
the contrary, is an add-on that is very useful in some situations (and we
actually use it in our projects). Also, its requirements differ from
those of the rest of the dictionaries, see below.

So, what about the following layout:

- contrib/ts_examples - a single module that contains all the example
  stuff in a single folder, to be built together
- contrib/dict_regex - a separate contrib module

> * Depending on PCRE, when we have an at-least-equally-good regex engine
> built in, is silly. It's an unnecessary dependency and to the (minor)
> extent that the regex syntax is different, we'd have to document the
> discrepancies.

The built-in regex engine does not seem to support the one feature that
is critical to dict_regex operation: it cannot report a "partial match"
when matching fails solely because the input string ended prematurely
(i.e. when matching might still succeed after more data is appended to
the string). If it is possible to achieve this behaviour with the
built-in engine, please point me in the right direction.

> * dict_regex is not nearly up to speed on encoding or locale issues.
> I didn't look at the other ones too closely, they may or may not need
> similar adjustments.
>
> * Allowing config files to be read from anywhere is not acceptable.
> We have dealt with this in the core code and the contrib examples
> *must* follow the same rules.

Is it necessary to require this behaviour from each contrib module? They
are not core code, and they usually solve application-level tasks. Is it
optimal to store application config files in the postgres tree? Also,
these dictionaries need some example config files at regression test
time, and those configs are of no use to anyone else. Is it good to
pollute the system tree with them?

On the other hand, to prevent reading arbitrary files, we could require a
specific header line that identifies these dictionary configs.

> * The whole "utils" part of dict_regex should probably go away; it
> is reinventing wheels that already exist in the Postgres backend
> environment. Since these are meant to be code examples, they should
> show the best ways of doing things within Postgres.

Yes, you are right. I'll rewrite it using StringInfo (the "official"
string-handling layer, right?).

Sincerely yours,
Sergey Karpov
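
P.S. To make the header-line idea concrete, here is a minimal sketch in
plain C of what such a check might look like. Everything named in it is
hypothetical: the magic string, the function names and the buffer size
are made up for illustration and are not part of the submitted code or of
PostgreSQL. It also uses plain stdio rather than the backend's own file
and error-reporting facilities, which a real module would want to use.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical magic line; an assumption made up for this sketch. */
#define DICT_CONFIG_MAGIC "# tsearch-dictionary-config"

/*
 * Return true if 'line' is exactly the magic string, ignoring any
 * trailing newline characters (LF or CRLF).
 */
bool
is_dict_config_header(const char *line)
{
    size_t magic_len = strlen(DICT_CONFIG_MAGIC);
    size_t len = strlen(line);

    /* Drop trailing line terminators before comparing. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    return len == magic_len &&
        memcmp(line, DICT_CONFIG_MAGIC, magic_len) == 0;
}

/*
 * Open 'path' and accept it only if its first line is the magic header.
 * Returns a stream positioned just after the header, or NULL if the file
 * cannot be opened or does not start with the header.
 */
FILE *
open_dict_config(const char *path)
{
    char  line[256];
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return NULL;

    if (fgets(line, sizeof(line), fp) == NULL ||
        !is_dict_config_header(line))
    {
        /* Not one of our config files: refuse to read any further. */
        fclose(fp);
        return NULL;
    }

    return fp;
}
```

With this, a dictionary would call open_dict_config() on the
user-supplied path and give up if it returns NULL, so a file such as
/etc/passwd is rejected after its first line. Note that this only limits
what the dictionary will parse; it does not restrict where the files
live, which is the rule the core code enforces.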