Looking at the patch, you require that the TIDBitmap fits in work_mem in
non-lossy format. I don't think that's acceptable, it can easily exceed
work_mem if you search for some very common word. Failing to execute a
valid query is not good.
Still, this way is better than nothing. Actually, this approach was chosen to
allow fast merging of (potentially) hundreds of sorted lists of ItemPointers.
Some calculations: with 8MB of work_mem, a TIDBitmap in non-lossy mode can store
at least 200000 pages, which gives us no fewer than 200000 tuples. For a frequent
word that number should be multiplied by 10 or 100, because practically every
tuple will contain it. The practical limit on the number of articles/documents
served by one server is about 10 million.
There are not so many alternatives:
- collect all needed ItemPointers, then sort and unique them
- merge each posting list into the already collected ones
- N-way merge, where N can be very big
- rerun the index scan for every possible combination
All these ways would be much slower, even for not-very-big collections.
I don't think the storage size of tsquery matters much, so whatever is
the best solution in terms of code readability etc.
That was about the tsquery send/recv format, not the storage on disk. We don't
require compatibility of the binary format of the database's files, but I have
some doubts about the send/recv format.
Hmm. match_special_index_operator() already checks that the index's
opfamily is pattern_ops, or text_ops with C-locale. Are you reusing the
same operator families for wildspeed? Doesn't it then also get confused
if you do a "WHERE textcol > 'foo'" query by hand?
No, wildspeed uses the same operator, ~~.
match_special_index_operator() isn't called at all: in the
match_clause_to_indexcol() function, is_indexable_operator() is called before
match_special_index_operator() and returns true.
expand_indexqual_opclause() sees that the operator is OID_TEXT_LIKE_OP and calls
prefix_quals(), which fails because it accepts only a few specific Btree opfamilies.
NOTICE 2: it seems to me that a similar technique could be implemented
for ordinary BTree to eliminate the hack around LIKE support.
LIKE expression. I wonder what the size and performance of that would be
like, in comparison to the proposed GIN solution?
GIN speeds up '%foo%' too, which is impossible for btree. But I don't like the
hack around LIKE support in BTree: it works around the normal operator machinery
instead of going through it.
I'm thinking about adding a new strategy to Btree to directly support prefix
LIKE search. BTree would then scan the index while the compare method, invoked
with a special option, checks for a prefix match.
Teodor Sigaev E-mail: [EMAIL PROTECTED]
Sent via pgsql-patches mailing list (firstname.lastname@example.org)