Re: [HACKERS] knngist questions

Teodor Sigaev Fri, 12 Nov 2010 10:11:32 -0800

1. Is KNNGIST intended to work if there's more than one pathkey?  If
so, how?  Example:


SELECT * FROM tab ORDER BY this_point<->  '(0,0)', this_point<->  '(1,1)'

Why not, if distances from two points to '(0,0)' are equal, it's need to comparedistances to '(1,1)'. Nothing new here, KNN-GiST supports it, that a reason whyStackElem struct has variable-size array for distances.

In practice, it could be used for finding closest restaurant with some fish inits name (pg_trgm has a support of disttance):

SELECT * FROM restaurants ORDER BY MY_COORDS <-> r_coords, 'fish' <-> r_name;

2. I'm concerned by the fact that the new consistent() methods only
return 0 or -1.0 for non-leaf pages.  If we have a query like:

M-m, it's returns 0 or -1.0 only for operation from WHERE clause, for ORDER BYoperation it should return >= 0 value. That's why consistentFn shoulf be able todistinguish source of operation (actually, it's need because now we allowboolean distances, if distance could not be a boolean then operation could notcome from WHERE clause)

SELECT * FROM tab WHERE this_point<->  '(0,0)'

For point, it's incorrect query: non-boolean operations could not present inWHERE clause.

you'd need to have the consistent function for each page return a
minimum and maximum distance to '(0,0)'.  If you have one page that

Only minimum distance which guarantees that in it's sub-tree there is no moreclose point. For R-tree it's rather obvious because keys on inner pages areactually a bounding box of its subtree.

has a minimum distance of 0 and a maximum distance of 100 and another
page which has a minimum distance of 101 and a maximum distance of
200, you don't even need to look at the second page until all the keys
from the first one have been processed.  If there's a third page with

Right, it works so, until I made a big mistake in code. But performance testallows me to believe that I didn't :)

It's read a page (for the start - root page) and puts all keys in binary treeordered by distance and then takes left-most key from binary tree. If that keyis pointer to heap, GiST returns it to postgres, if it's a pointer to index'spage then repeat loop with new page.

tree up front.  If you are already doing something like this somewhere
in the code, please point me in the right direction...

getNextNearest() function: it gets next pointer from tree with a help ofgetNextDataPointer(), and then, depending of pointer type (to index or heappage) returns a pointer or call processPage() which puts new pointers into tree.

May be, not obvious thing that all pointers on the same distance are stored insingle binary tree node (struct StackElem). Each pointer is represented byDataPointer which contains information needed for support concurrency. List ofthat structs is keeped "semi-ordered": pointer to heap are always in thebeginning of list to increase response time and memory requirement.


3. I've been scratching my head over the following bit of code and it
doesn't make any sense to me.  As far as I can tell, this is
effectively comparing the number of columns in the ORDER BY clause to
the number of restriction clauses applicable to the relation being
scanned.  Those two quantities don't seem to have much to do with each
other, so either I'm confused or the code is.  It doesn't seem like it

Oops, it seems to me that's artefact of developing, sorry. I'm not very familiarwith planner/optimizer so that required a lot of my brain and I just missed thatafter another attempt to get it work.

should matter anyway, since I don't think we're planning on any AMs
being both amoptionalkey and amcanorderbyop.


Agree.
--
Teodor Sigaev                                   E-mail: teo...@sigaev.ru
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] knngist questions

Reply via email to