On 06.09.2011 20:34, Oleg Bartunov wrote:
Here is the latest spgist patch, which has all planned features as well as
all overhead, introduced by concurrency and recovery, so performance
measurement should be realistic now.

I'm ignoring the text suffix-tree part of this for now, because of the issue with non-C locales that Alexander pointer out.

Regarding the quadtree, have you compared the performance of that with Alexander's improved split algorithm? I ran some tests using the test harness I still had lying around from the fast GiST index build tests:

        testname         |      time       | accesses | indexsize
 points unordered auto   | 00:03:58.188866 |   378779 | 522 MB
 points ordered auto     | 00:07:14.362355 |   177534 | 670 MB
 points unordered auto   | 00:02:59.130176 |    46561 | 532 MB
 points ordered auto     | 00:04:00.50756  |    45066 | 662 MB
 points unordered spgist | 00:03:05.569259 |    78871 | 394 MB
 points ordered spgist   | 00:01:46.06855  |   422104 | 417 MB
(8 rows)

These tests were with a table with 7500000 random points. In the ordered-tests, the table is sorted by x,y coordinates. 'time' is the time used to build the index on it, and 'accesses' is the total number of index blocks hit by a series of 10000 bounding box queries, measured from pg_statio_user_indexes.idx_blks_hit + idx_blks_read.

The first two tests in the list are with a GiST index on unpatched PostgreSQL. The next six tests are with Alexander's double-sorting split patch. The last two tests are with an SP-GiST index.

It looks like the query performance with GiST using the double-sorting split is better than SP-GiST, although the SP-GiST index is somewhat smaller. The ordered case seems pathologically bad, is that some sort of a worst-case scenario for quadtrees?

  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to