Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Steve Atkins
On Feb 14, 2011, at 12:09 AM, Artur Zając wrote: >> Looks like you've almost re-invented the trigram module: >> http://www.postgresql.org/docs/9.0/static/pgtrgm.html > > I didn't know about this module. > Idea to use three letters strings and use Full Text Search is the same, but > the rest is

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Greg Smith
Gnanakumar wrote: Thanks for your suggestion. Our production server is currently running PostgreSQL v8.2.3. I think pg_trgm contrib module is not available for 8.2 series. You're going to find that most of the useful answers here will not work on 8.2. Full-text search was not fully inte

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Shaun Thomas
On 02/14/2011 12:59 AM, Gnanakumar wrote: QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%' EMAIL column is VARCHAR(256). Honestly? You'd be better off normalizing this column and maybe hiding that fact in a view if your app requires email as a single column. Split it like this:

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Artur Zając
> Looks like you've almost re-invented the trigram module: > http://www.postgresql.org/docs/9.0/static/pgtrgm.html I didn't know about this module. Idea to use three letters strings and use Full Text Search is the same, but the rest is not. Is the idea to use similarity for this problem is real

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Gnanakumar
> The closest you can do is something like Artur's option (or the pg_trgm > module - handy since you are looking at 3-chars and up) to select likely > matches combined with a separate search on '%domain.com%' to confirm > that fact. Thanks for your suggestion. Our production server is currentl

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-14 Thread Artur Zając
>How can we boost performance of queries containing pattern matching >characters? In my case, we're using a percent sign (%) that matches any string of zero or more characters. > > QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%' > > EMAIL column is VARCHAR(256). > > As it is clear

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Richard Huxton
On 14/02/11 07:46, Gnanakumar wrote: If you really need to match all those options, you can't use an index. A substring-matching index would need to have multiple entries per character per value (since it doesn't know what you will search for). The index-size becomes unmanageable very quickly.

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Richard Huxton
On 14/02/11 07:38, Artur Zając wrote: I had almost the same problem. To resolve it, I created my own text search parser (myftscfg) which divides text in column into three letters parts, for example: someem...@domain.com is divided to som, ome,mee,eem,ema,mai,ail,il@, l@d,@do,dom,oma,mai,ain,in.,

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
> If you really need to match all those options, you can't use an index. A > substring-matching index would need to have multiple entries per > character per value (since it doesn't know what you will search for). > The index-size becomes unmanageable very quickly. > That's why I asked what you

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Richard Huxton
On 14/02/11 07:28, Gnanakumar wrote: Is that really what you are after? Or, did you just want to match: us...@domain.com us...@sub.domain.com I understand that because I've (%) at the beginning and end, it's going to match unrelated domains, etc., which as you said rightly, it is wide-r

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
> Is that really what you are after? Or, did you just want to match: >us...@domain.com >us...@sub.domain.com I understand that because I've (%) at the beginning and end, it's going to match unrelated domains, etc., which as you said rightly, it is wide-ranging. But my point here is that h

Re: [PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Richard Huxton
On 14/02/11 06:59, Gnanakumar wrote: How can we boost performance of queries containing pattern matching characters? QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%' As it is clear from the above query, email is matched "partially and case-insensitively", which my application r

[PERFORM] How to boost performance of queries containing pattern matching characters

2011-02-13 Thread Gnanakumar
Hi, How can we boost performance of queries containing pattern matching characters? In my case, we're using a percent sign (%) that matches any string of zero or more characters. QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%' EMAIL column is VARCHAR(256). As it is clear from the