Re: [rt-users] Pg FTS query works, RT search returns 0 results ( was: '.' as delimiter/boundary breaks domain name searches)
>> * An RT 4.2.12 search for the same string returns 0 results[2]. >> I'm 98% certain the ticket *is* indexed though as other queries >> return it[2]. ... > Is the ticket in question a merged ticket? You had a mail from the > same timeframe which dealt with merged ticket FTS, which is a known > bug: https://issues.bestpractical.com/Ticket/Display.html?id=9370 > Sadly, there's not good solution to that bug at this time. I'm losing what little is left of my mind, apparently. I looked last week at the problem ticket and could swear it had no merging in its history. Looking today, I see it *does*. > perl -MRT=-init -le > '$t=RT::Tickets->new(RT->SystemUser);$t->FromSQL("@ARGV");print > $t->BuildSelectQuery' \ > "Content LIKE 'f...@domain.com'" Handy! Used this to generate the SQL. Pasted the SQL into psql for kicks and got the unfortunately now-expected 0 results. Filed this snippet away for later. Thanks Alex. Jeff
Re: [rt-users] Pg FTS query works, RT search returns 0 results ( was: '.' as delimiter/boundary breaks domain name searches)
On Mon, Aug 31, 2015 at 09:39:19AM -0400, Jeff Blaine wrote: > I'm reviving this one time in case anyone has further ideas. > > * PostgreSQL 8.4.20 (RHEL 6.6) with FTS does the right > thing when parsing an email address[1] > > * An RT 4.2.12 search for the same string returns 0 results[2]. > I'm 98% certain the ticket *is* indexed though as other queries > return it[2]. > > * The rt-fulltext-indexer script runs every 10 minutes and has > no errors. Is the ticket in question a merged ticket? You had a mail from the same timeframe which dealt with merged ticket FTS, which is a known bug: https://issues.bestpractical.com/Ticket/Display.html?id=9370 Sadly, there's not good solution to that bug at this time. Otherwise, seeing the query RT is generating, and stripping clauses out of it until it matches, is likely your best bet. You can see the SQL generated from a TicketSQL query by running: perl -MRT=-init -le '$t=RT::Tickets->new(RT->SystemUser);$t->FromSQL("@ARGV");print $t->BuildSelectQuery' \ "Content LIKE 'f...@domain.com'" - Alex
Re: [rt-users] Pg FTS query works, RT search returns 0 results ( was: '.' as delimiter/boundary breaks domain name searches)
On 31 Aug 2015, at 9:39, Jeff Blaine wrote: I'm reviving this one time in case anyone has further ideas. * PostgreSQL 8.4.20 (RHEL 6.6) with FTS does the right thing when parsing an email address[1] I question that. I know that with Pg 9.0 the FTS indexer does arguably "right" things in parsing email addresses, hostnames, and IP addresses BUT that behavior effectively hides the octets inside an IP, the host and domain name elements in a hostname, and the local and domain parts of an email address. For our business (customized hosting & related services) this made the indexing do more harm than good. We have adequate capacity (and suitable config) on our DB server to handle most of the searches we need to do in reasonable time without the indexing, so we disabled it.
[rt-users] Pg FTS query works, RT search returns 0 results ( was: '.' as delimiter/boundary breaks domain name searches)
I'm reviving this one time in case anyone has further ideas. * PostgreSQL 8.4.20 (RHEL 6.6) with FTS does the right thing when parsing an email address[1] * An RT 4.2.12 search for the same string returns 0 results[2]. I'm 98% certain the ticket *is* indexed though as other queries return it[2]. * The rt-fulltext-indexer script runs every 10 minutes and has no errors. Jeff Footnote 1: rt4=# SELECT alias, description, token FROM ts_debug('f...@domain.com'); alias | description | token ---+---+ email | Email address | f...@domain.com (1 row) rt4=# SELECT alias, description, token FROM ts_debug(''); alias | description | token ---+---+ blank | Space symbols | < email | Email address | f...@domain.com blank | Space symbols | > (3 rows) rt4=# Footnote 2: Content LIKE 'f...@domain.com' 0 results Content LIKE 'domain.com' 0 results Content LIKE 'domain' 0 results Content LIKE 'another unique string in the ticket I am trying to hit' 1 result On 10/2/2014 11:08 AM, k...@rice.edu wrote: > On Thu, Oct 02, 2014 at 10:56:56AM -0400, Kevin Falcone wrote: >> On Wed, Oct 01, 2014 at 03:50:43PM -0400, Jeff Blaine wrote: >>> [ Similar, but unrelated to my other message from 10 minutes ago. ] >>> >>> It appears any '.' is interpreted as a word boundary with >>> Pg full-text indexing turned on. >>> >>> Is that known to be true, or am I wrong? >>> >>> This breaks searches for FQDNs names in ticket contents. >>> >>> Searching for 'foobar' will hit foobar.org >>> >>> Searching for 'foobar.org' will not hit 'foobar.org' >> >> What FTS will match/return is dictated by your database and its >> configuration. >> >> Have you reviewed the Postgres full text search documentation for your >> release of Pg? >> >> http://www.postgresql.org/docs/8.4/static/textsearch.html >> >> -kevin > > > Wow! PostgreSQL 8.4, 4 major releases back! I cannot be certain that I > am recalling this correctly, but the default parser in older versions > of PostgreSQL did have that behavior. I do not know when they made the > change to fix it. What do you get when you run: > > rt3=# select plainto_tsquery('rice.edu'); > plainto_tsquery > - > 'rice.edu' > (1 row) > > I seem to recall that in the older version when I saw this issue, it > returned: > > plainto_tsquery > - > 'rice' & 'edu' > > You may be able to make a custom config for your text search using > the definitions from the current release. I just ended up searching > for 'rice' instead of 'rice.edu', for example. > > Regards, > Ken > -- Jeff Blaine kickflop.net PGP/GnuPG Key ID: 0x0C8EDD02