Looking at the patch, you require that the TIDBitmap fits in work_mem in
non-lossy format. I don't think that's acceptable, it can easily exceed
work_mem if you search for some very common word. Failing to execute a
valid query is not good.
Still, this way is better than nothing. Actually, this approach was chosen to
allow fast merging of (potentially) hundreds of sorted lists of ItemPointers.
Some calculations: with 8MB of work_mem, a TIDBitmap in non-lossy mode can store
at least 200000 pages, which gives us no fewer than 200000 tuples. For a frequent
word that number should be multiplied by 10 or 100, because practically every
tuple will contain it. The practical limit on the number of articles/documents
served by one server is about 10 million.
There are not so many alternatives:
- collect all needed ItemPointers, then sort and unique them
- merge each posting list into the already collected ones
- N-way merge, where N can be very big
- rerun the index scan for every possible combination
All these ways would be much slower, even for not-very-big collections.
I don't think the storage size of tsquery matters much, so whatever is
the best solution in terms of code readability etc.
That was about the tsquery send/recv format, not the storage on disk. We don't
require compatibility of the binary format of the database's files, but I have
some doubts about the send/recv format.
Hmm. match_special_index_operator() already checks that the index's
opfamily is pattern_ops, or text_ops with C-locale. Are you reusing the
same operator families for wildspeed? Doesn't it then also get confused
if you do a "WHERE textcol > 'foo'" query by hand?
No, wildspeed uses the same operator, ~~.
match_special_index_operator() isn't called at all: in the
match_clause_to_indexcol() function, is_indexable_operator() is called before
match_special_index_operator() and returns true.
expand_indexqual_opclause() sees that the operator is OID_TEXT_LIKE_OP and calls
prefix_quals(), which fails because it accepts only a few specific Btree opfamilies.
NOTICE 2: it seems to me that a similar technique could be implemented
for ordinary BTree to eliminate the hack around LIKE support.
LIKE expression. I wonder what the size and performance of that would be
like, in comparison to the proposed GIN solution?
GIN speeds up '%foo%' too, which is impossible for btree. But I don't like the
hack around LIKE support in BTree: it works around the normal operator machinery
instead of going through it.
I'm thinking about adding a new strategy to Btree to directly support prefix
LIKE search. BTree would then scan the index while the compare method, invoked
with a special option, checks for a prefix match.
Teodor Sigaev E-mail: [EMAIL PROTECTED]
Sent via pgsql-patches mailing list (firstname.lastname@example.org)