On 17.07.2012 18:01, Curtis Hovey wrote:
> On 07/17/2012 11:38 AM, Abel Deuring wrote:
>> I am working currently on
>> https://bugs.launchpad.net/launchpad/+bug/1020443 (Full text search
>> broken for certain search terms having a "bad combination" of
>> punctuation characters, like "?!.". This a fallout bug from my previous
>> work on another full text search related bug:
>> https://bugs.launchpad.net/launchpad/+bug/29713 )
>>
>> As explained by stub in a comment, the stored procedure ftq() does no
>> longer
>>
>> I see two options to fix this bug:
>>
>> (A) We can either fix the immediate problem (the fix would be quite
>> simple) and keep the feature "treat the characters '&|!' as logical
>> operators in full text searches".
>>
>> (B) Let ftq() simply remove "&|!" from queries.
>
> I like option B because you also get to close
> #69628 Need to advertise "OR"/"|" operator for searches

Right, that's a nice side effect ;)

>
> It also makes it easier to fix this bug
> #660283 Bug search pages should document valid search expressions

Agreed. Documenting the core features should not be too difficult, but
the full text search still has some quirks^Wfeatures, like those
described in bug 29227.

>
> PS. Maybe these bugs are fixable now
> #29227 Full text search only understands whitespace as a word seperator

No, this bug has a different cause. Postgres' text parser can detect a
number of different token types: most words are detected as "asciiword"
or "word" tokens, but the string mentioned in this bug, "/dev/pmu", is
detected as a file/path name and stored as a whole in the FTI. It would
nevertheless be useful to review how LP's search and indexing machinery
is configured. This is not the only problem with tokens that are neither
"word" nor "asciiword": bug 1015511 and bug 1015519 show other issues.

> #56244 Can't search for phrases in bug reports

That would boil down to checking the position of all words in the index.
I believe that this is not supported out of the box by Postgres, but the
FTI already stores the position of words.

> #111956 Cannot search for identifier containing underscores

No, my work on bug 1020443 will not fix this, but I think that a fix for
bug 56244 could be easily extended to fix this one too.
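
To make the position check for bug 56244 a bit more concrete, here is a
minimal Python sketch. It is not how Launchpad or Postgres implements
anything: the function name phrase_positions, the dict-based index and
the sample data are all made up for illustration. The only assumption
taken from the discussion above is that the index records, for every
word, the positions at which it occurs in a document.

```python
from typing import Dict, List, Set


def phrase_positions(index: Dict[str, Set[int]], phrase: List[str]) -> List[int]:
    """Return the start positions where the words of `phrase` occur consecutively.

    `index` is a hypothetical per-document mapping from word to the set of
    positions at which that word occurs.
    """
    if not phrase:
        return []
    first, rest = phrase[0], phrase[1:]
    starts = []
    for start in sorted(index.get(first, ())):
        # Every following word must appear exactly `offset` positions later.
        if all(start + offset in index.get(word, ())
               for offset, word in enumerate(rest, 1)):
            starts.append(start)
    return starts


# Hypothetical positions for one document (illustration only).
doc_index = {"full": {1}, "text": {2, 7}, "search": {3}, "broken": {4}}

print(phrase_positions(doc_index, ["full", "text", "search"]))  # [1]
print(phrase_positions(doc_index, ["text", "broken"]))          # []
```

If an identifier like "foo_bar" ends up indexed as the two adjacent words
"foo" and "bar" (an assumption, not something verified here), the same
check with the phrase ["foo", "bar"] is what an extension for bug 111956
could build on.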