We (Oleg and me) would like to present patch implements partial match for GIN
index and two extensions which use this new feature. We hope that after short
review they will be committed to CVS.

This work was sponsored by EnterpriseDB.

http://www.sigaev.ru/misc/partial_match_gin-0.7.gz
Implements partial match for GIN. It extends interface of support function but
keeps backward compatibility. The basic idea is to find first greater or equal
value in index and scan sequentially until support function says stop. For each
matched entry all corresponding ItemPointers are collected in TIDBitmap
structure to effective merge ItemPointers from different entries. Patch
introduces following changes in interface:
 - compare function has third (optional) argument, of boolean type, it points to
   kind of compare: partial or exact match. If argument is equal to 'false',
   function should produce comparing as usual, else function's result is
   treated as:
       = 0  - match
       < 0  - doesn't match but continue scan
       > 0  - stop scan
 - extractQuery function has fourth (optional) argument of bool** type. Function
   is responsible to allocate correct memory for that array with the same size
   as returning array of searching entries. if extractQuery wishs to point
   partial match for some entry it should set corresponding element of bool
   array to true.

If function described above hasn't extra arguments then GIN will not be able to
use partial match.

http://www.sigaev.ru/misc/tsearch_prefix-0.6.gz
Implements prefix search. This was one of the most wanted feature of text search. Lexeme to partial match should be labeled with asterisk:

select count(*) from apod where fti @@ 'star:*';
or even
select count(*) from apod where fti @@ to_tsquery('star:*');

Dictionary may set a normalized lexeme with flag (TSL_PREFIX) to point to its prefix path.

Here there is a unclean issue: now tsquery has new flag to label prefix search and cstring representation has backward compatibility, but external binary hasn't it now. Now, extra byte is used for storage of this flag. In other hand, there 4 unused bits in external binary representation (in byte stores weights of lexeme), so it's possible to use one of them to store this flag. What are opinions?

http://www.sigaev.ru/misc/wildspeed-0.10.tgz
docs: http://mira.sai.msu.su/~megera/pgsql/pgdoc/wildspeed.html
      http://www.sai.msu.su/~megera/wiki/wildspeed
In short, it's a contrib module that speeds up LIKE operation with any kind of
expression, like 'foo%bar' or '%foo%' or even '%foo%bar'. This module is based
on partial match patch of GIN.

NOTICE 1: current index support of LIKE believes that only BTree can speed up LIKE and becomes confused with this module with error 'unexpected opfamily' in
prefix_quals(). For this reason, partial match patch adds small check before
calling expand_indexqual_opclause().

NOTICE 2: it seems to me, that similar technique could be implemented for ordinary BTree to eliminate hack around LIKE support.

--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/


--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Reply via email to