http://www.sigaev.ru/misc/builtin_knngist_core-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_itself-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_proc-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_pg_trgm-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_btree_gist-0.8.gz

New version, synced with CVS HEAD
http://www.sigaev.ru/misc/builtin_knngist_itself-0.8.2.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_btree_gist-0.8.1.gz

AFAICS, these patches include no documentation.  That's pretty much a
fatal flaw for a feature of this magnitude.  At an absolute minimum,
you need to update the system catalog documentation and the
documentation on CREATE / ALTER OPERATOR CLASS.  There might be some
other places that need to be touched, also.
Oleg promised to do that


+               if (opform->oprresult == BOOLOID)
+                       ereport(ERROR,
+
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+                                        errmsg("index ordering
operators must not return boolean")));

My first thought was that this code was there to prevent people from
doing the wrong thing by accident.  But I have a niggling feeling that
you're actually relying on this for the correctness of the system.  I
hope I'm wrong, because I don't think that would be a very good idea.

This play is around do we really want to have support of boolean-distance in GiST? I think no, because it's a strange idea to measure distance in true/false measurement units. I can't imagine such real-life distance definition and never heard about that.

Next, pg_amop_opr_fam_index requires uniqueness of operation in operation family and a lot of places in planner believes in that. Suppose, changing that requires a lot of work which has the single aim to support boolean distance in ORDER BY clause.


The GIST code code use more comments; and perhaps the names of some of
the functions and structures could be chosen to be more descriptive.
I think that what used to be called GISTSearchStack has apparently
been replaced with DataPointer; it's not obvious to me that it's good
to change the name, but if it is I don't think DataPointer is a good
GISTSearchStack is replaced by RBTree (GISTScanOpaqueData->stack), tree's nodes contain a StackElem struct which represents list of pointers at the same distance. Each pointer could be a pointer to the inner index's page or to the
heap's tuple and this struct is a DataPointer.

Note, list of DataPointer in StackElem struct is organized by non-obvious way: we keep pointer to the head of list and pointer to the middle of list. New pointer-to-heap is inserted in the beginning of list, pointers-to-index-page - in the middle. That's done because we would like to:
1) pop pointers-to-heap as fast as possible, before any pointers-to-index-page
2) pop pointers-to-index-page to deep page (which is closer to leaf pages)
   first. That's good for KNN performance and emulates classical first-depth
   search in ordinary search.

> choice.  gistindex_keytest has been replaced (sort of) by
> processIndexTuple, which again seems more generic than what it
> replaced.
Renamed, comments are improved

Minor nit: the word "shoould" is mis-spelled.
fixed

BTW, now consistentFn is able to "manage" tree traversal - even for for ordinary search, GiST will choose child page with minimal distance.

--
Teodor Sigaev                                   E-mail: teo...@sigaev.ru
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to