On Mon, 2008-06-02 at 19:39 +0400, Teodor Sigaev wrote:
>
> > I have attached a patch for phrase search with respect to the cvs head.
> > Basically it takes a phrase (text) and a TSVector. It checks whether the
> > relative positions of the lexemes in the phrase are the same as their
> > positions in the TSVector.
>
> Ideally, phrase search should be implemented as a new operator in tsquery,
> say #, with an optional distance. So the tsquery 'foo #2 bar' means: find all
> texts where 'bar' is placed no farther than two words from 'foo'. The
> complexity lies in complex boolean expressions ( 'foo #1 ( bar1 & bar2 )' )
> and in languages such as Norwegian or German. German has compound words,
> like footballbar, and they have several variants of splitting, so the result
> of to_tsquery('foo # footballbar') will be
> 'foo # ( ( football & bar ) | ( foot & ball & bar ) )',
> where the variants are connected with an OR operation.

This is far more complicated than I thought.

> Of course, phrase search should be able to use indexes.

I can probably look into how to use the index. Any pointers on this?

> > If the configuration for text search is "simple", then this will produce
> > exact phrase search. Otherwise the stopwords in a phrase will be ignored
> > and the words in a phrase will only be matched with the stemmed lexeme.
>
> Your solution can't be used as is, because the user should use tsquery too
> in order to use an index:
>
> column @@ to_tsquery('phrase search') AND is_phrase_present('phrase search',
> column)
>
> The first clause will be used for the index scan and will quickly find the
> candidates.

Yes, this is exactly how I am using it in my application. Do you think this
will solve a lot of common cases, or should we try to get phrase search to:

1. Use index
2. Support arbitrary distance between lexemes
3. Support complex boolean queries

-Sushant.

> > For my application I am using this as a separate shared object. I do not
> > know how to expose this function from the core. Can someone explain how
> > to do this?
>
> Look at the contrib/ directory in pgsql's source code - make a contrib module
> from your patch. As an example, look at the adminpack module - it's rather
> simple.
>
> Comments on your code:
> 1)
> +#ifdef PG_MODULE_MAGIC
> +PG_MODULE_MAGIC;
> +#endif
>
> That isn't needed for files compiled into core; it's only needed for modules.
>
> 2)
> use only /**/ comments, do not use // (C++ style) comments
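
For reference, the position check discussed above (are the relative positions of the lexemes in the phrase the same as their positions in the TSVector?) can be sketched roughly as below. This is a minimal standalone illustration under stated assumptions, not the attached patch: PhraseLexeme, has_position and phrase_present are hypothetical names, and plain sorted int arrays stand in for the position data kept in a TSVector. A dropped stopword is assumed to show up simply as a gap in the phrase offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one phrase lexeme; not a PostgreSQL structure. */
typedef struct
{
	int			offset;		/* position of the lexeme inside the phrase */
	const int  *pos;		/* sorted positions of the lexeme in the document */
	size_t		npos;		/* number of entries in pos */
} PhraseLexeme;

/* Binary search for target in a sorted array of positions. */
static bool
has_position(const int *pos, size_t npos, int target)
{
	size_t		lo = 0;
	size_t		hi = npos;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (pos[mid] == target)
			return true;
		if (pos[mid] < target)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}

/*
 * Return true if some occurrence of the first lexeme can be extended so that
 * every other lexeme sits at the same relative offset as in the phrase.
 */
static bool
phrase_present(const PhraseLexeme *lex, size_t nlex)
{
	size_t		i;
	size_t		j;

	assert(lex != NULL || nlex == 0);

	if (nlex == 0)
		return false;

	for (i = 0; i < lex[0].npos; i++)
	{
		/* document position that phrase offset 0 would map to */
		int			base = lex[0].pos[i] - lex[0].offset;

		for (j = 1; j < nlex; j++)
		{
			if (!has_position(lex[j].pos, lex[j].npos, base + lex[j].offset))
				break;
		}
		if (j == nlex)
			return true;	/* all lexemes lined up */
	}
	return false;
}
```

A distance operator like the proposed 'foo #2 bar' would need the exact-offset lookup replaced by a range test, and that is the part this sketch does not attempt.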