On Thu, Sep 22, 2011 at 2:05 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote:
> Regarding the quadtree, have you compared the performance of that with > Alexander's improved split algorithm? I ran some tests using the test > harness I still had lying around from the fast GiST index build tests: > > testname | time | accesses | indexsize > -------------------------+----**-------------+----------+-----**------ > points unordered auto | 00:03:58.188866 | 378779 | 522 MB > points ordered auto | 00:07:14.362355 | 177534 | 670 MB > points unordered auto | 00:02:59.130176 | 46561 | 532 MB > points ordered auto | 00:04:00.50756 | 45066 | 662 MB > points unordered spgist | 00:03:05.569259 | 78871 | 394 MB > points ordered spgist | 00:01:46.06855 | 422104 | 417 MB > (8 rows) > I assume first two rows to be produced by new linear split algorithm(current) and secound two rows by double sorting split algorithm(my patch). > These tests were with a table with 7500000 random points. In the > ordered-tests, the table is sorted by x,y coordinates. 'time' is the time > used to build the index on it, and 'accesses' is the total number of index > blocks hit by a series of 10000 bounding box queries, measured from > pg_statio_user_indexes.idx_**blks_hit + idx_blks_read. > > The first two tests in the list are with a GiST index on unpatched > PostgreSQL. The next six tests are with Alexander's double-sorting split > patch. The last two tests are with an SP-GiST index. > > It looks like the query performance with GiST using the double-sorting > split is better than SP-GiST, although the SP-GiST index is somewhat > smaller. The ordered case seems pathologically bad, is that some sort of a > worst-case scenario for quadtrees? Comparison of search speed using number of page accesses is quite comprehensive for various GiST indexes. But when we're comparing SP-GiST vs GiST we should take into accoung that they have different CPU/IO ratio. GiST scans whole page which it accesses. SP-GiST can scan only fraction of page because several nodes can be packed into single page. Thereby it would be interesting to compare also CPU load GiST vs. SP-GiST. Also, there is some hope to reduce number of page accesses in SP-GiST by improving clustering algorithm. ------ With best regards, Alexander Korotkov.