Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Guillaume Smet
On 2/22/07, Oleg Bartunov wrote: You're right, it would be nice. This is what we need for faster ranking in tsearch2, since currently we have to consult the heap to get positional information, which slows down search. We didn't investigate the possibility of keeping additional information with the index, but

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Teodor Sigaev
I think it can be interesting for other flavours of GIN usage. Is there a way to add the number of entries of the considered indexed item to the consistent prototype without adding too much overhead and complexity? We are thinking about adding an extra value, but so far it is only an idea.

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Oleg Bartunov
On Thu, 22 Feb 2007, Guillaume Smet wrote:
> On 2/22/07, Teodor Sigaev <[EMAIL PROTECTED]> wrote:
>> What is the average length of the strings in the table?
>
> test=# SELECT MIN(length(word)), MAX(length(word)), AVG(length(word)) FROM lieu_mots_gin;
>  min | max |        avg
> -----+-----+--------------------
>    1

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Guillaume Smet
On 2/22/07, Teodor Sigaev <[EMAIL PROTECTED]> wrote:
> What is the average length of the strings in the table?

test=# SELECT MIN(length(word)), MAX(length(word)), AVG(length(word)) FROM lieu_mots_gin;
 min | max |        avg
-----+-----+--------------------
   1 |  38 | 7.4615463141373282
(1 row)

I don't

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Teodor Sigaev
I didn't see any improvement in terms of size of the index (14 MB for 642 738 rows in the index in both cases) or speed. Our dictionary table contains 78367 words and its size is 3 MB. Did I miss something? Comparing integers is cheaper than strings. Although it hasn't significant matter for ind

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Guillaume Smet
On 2/22/07, Teodor Sigaev <[EMAIL PROTECTED]> wrote:
> From a previous discussion with Teodor, it would be better to store an
> int in the index instead of a text (it takes less space and is
> faster). I couldn't find any example so if anyone has advice on how to fix
> that, it's welcome (mostly how

Re: [PATCHES] First implementation of GIN for pg_trgm

2007-02-22 Thread Teodor Sigaev
From a previous discussion with Teodor, it would be better to store an int in the index instead of a text (it takes less space and is faster). I couldn't find any example so if anyone has advice on how to fix that, it's welcome (mostly how to pack the trigram into an int instead of a text). Somethi
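
For readers following the question above about packing a trigram into an int, here is one possible sketch. It is not taken from the patch or from the truncated reply. The helper names pack_trigram and unpack_trigram are hypothetical, and the code assumes each trigram is exactly three single-byte characters (multibyte encodings would need a different approach, for example hashing).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the pg_trgm patch): pack the three bytes
 * of a trigram into one 32-bit integer. Assumes a single-byte encoding,
 * so the result always fits in the low 24 bits and is never negative.
 */
static int32_t
pack_trigram(const unsigned char *trgm)
{
    uint32_t packed = ((uint32_t) trgm[0] << 16) |
                      ((uint32_t) trgm[1] << 8) |
                      (uint32_t) trgm[2];

    return (int32_t) packed;
}

/*
 * Hypothetical inverse: recover the three bytes from the packed value.
 * "out" must have room for at least three bytes.
 */
static void
unpack_trigram(int32_t packed, unsigned char *out)
{
    out[0] = (unsigned char) ((packed >> 16) & 0xFF);
    out[1] = (unsigned char) ((packed >> 8) & 0xFF);
    out[2] = (unsigned char) (packed & 0xFF);
}
```

With this layout, integer comparison orders packed values the same way as a bytewise comparison of the original three bytes, so the index keys can be compared as plain integers.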