Looking at the patch, you require that the TIDBitmap fits in work_mem in non-lossy format. I don't think that's acceptable, it can easily exceed work_mem if you search for some very common word. Failing to execute a valid query is not good.
But way is better than nothing. In really, that way was chosen to have fast merge of (potentially) hundreds of sorted lists of ItemPointers. Other ways is much slower.

Some calculations: with 8Mb of mem_work TIDBimap in non-lossy mode can store at least 200000 pages, which gives to us no less than 200000 tuples. For frequent word, that number should multiplied to 10 or 100, because practically every tuple will contain it. Practical limit to number of articles/document served by one servers is about 10 millions.

There are no so many alternatives:
- collect all needed ItemPointers and sort then unique them.
- merge each posting list with already collected ones
- N-way merge, where N can be very big
- Rerun index scan with all possible combinations

All this ways will be much slower even for not very big collections.

I don't think the storage size of tsquery matters much, so whatever is the best solution in terms of code readability etc.
That was about tsqueryesend/recv format? not a storage on disk. We don't require compatibility of binary format of db's files, but I have some doubts about binary dump.

Hmm. match_special_index_operator() already checks that the index's opfamily is pattern_ops, or text_ops with C-locale. Are you reusing the same operator families for wildspeed? Doesn't it then also get confused if you do a "WHERE textcol > 'foo'" query by hand?
No, wildspeed use the same operator ~~
match_special_index_operator() isn't called at all: in match_clause_to_indexcol() function is_indexable_operator() is called before match_special_index_operator() and returns true.

expand_indexqual_opclause() sees that operation is a OID_TEXT_LIKE_OP and calls prefix_quals() which fails because it wishes only several Btree opfamilies.

NOTICE 2: it seems to me, that similar technique could be implemented for ordinary BTree to eliminate hack around LIKE support.
LIKE expression. I wonder what the size and performance of that would be like, in comparison to the proposed GIN solution?

GIN speeds up '%foo%' too - which is impossible for btree. But I don't like a hack around LIKE support in BTree. This support uses outflank ways missing regular one.

I'm thinking about add new strategy to Btree and allow directly support of prefix LIKE search. And BTree will scan index while compare method with option returns true.

Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/

Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:

Reply via email to