Re: [HACKERS] Testing of various opclasses for ranges

2012-07-10 Thread Alexander Korotkov
On Tue, Jul 10, 2012 at 1:38 PM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> I think the ultimate question is, which ones of these should we include in
> core? We cannot drop the existing range_ops opclass, if only because that
> would break pg_upgrade. However, range_ops2 seems superior to it, so I
> think we should make that the default for new indexes.
>

Actually, I'm not fully satisfied with range_ops2. I had expected it could be
recommended for all cases, but it actually builds significantly slower and
sometimes needs to read more pages during a search. Most likely, we will have
to write some recommendations in the docs about which opclass to use in which
case. Additionally, some might think GiST range indexing is becoming tangled.

> For SP-GiST, I don't think we need to include both quad and k-d tree
> implementations. They have quite similar characteristics, so IMHO we should
> just pick one. Which one would you prefer? Is there any difference in terms
> of code complexity between them? Looking at the performance test results,
> quad tree seems to be somewhat slower to build, but is faster to query.
> Based on that, I think we should pick the quad tree, query performance
> seems more important.


Agreed, I think we should stay with the quad tree implementation.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Testing of various opclasses for ranges

2012-07-10 Thread Heikki Linnakangas

On 10.07.2012 02:33, Alexander Korotkov wrote:

> Hackers,
>
> I've tested various opclasses for ranges (including the current in-core one
> and my patches). I've looked into scholarly papers to see which datasets
> they use for testing. The lists below show the kinds of datasets used in
> those papers.


Great! That's a pretty comprehensive suite of datasets.


> I've merged all 3 patches into 1 (see 2d_map_range_indexing.patch). In this
> patch the following opclasses are available for ranges:
> 1) range_ops - the current in-core GiST opclass
> 2) range_ops2 - GiST opclass based on 2d-mapping
> 3) range_ops_quad - SP-GiST quad tree based opclass
> 4) range_ops_kd - SP-GiST k-d tree based opclass


I think the ultimate question is, which ones of these should we include 
in core? We cannot drop the existing range_ops opclass, if only because 
that would break pg_upgrade. However, range_ops2 seems superior to it, 
so I think we should make that the default for new indexes.


For SP-GiST, I don't think we need to include both quad and k-d tree 
implementations. They have quite similar characteristics, so IMHO we 
should just pick one. Which one would you prefer? Is there any 
difference in terms of code complexity between them? Looking at the 
performance test results, quad tree seems to be somewhat slower to 
build, but is faster to query. Based on that, I think we should pick the 
quad tree, query performance seems more important.
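
For readers following along, here is a rough sketch of the 2d-mapping idea
that range_ops2 and the SP-GiST opclasses build on. It is an illustration
only, not code from the patch: the names Range, to_point, overlaps_2d and
quadrant are made up for this example, ranges are assumed to be non-empty
and half-open [lower, upper), and the quadrant numbering is an arbitrary
choice. Each range becomes the point (lower, upper), so an overlap query
turns into a test against a corner region of the plane, and a quad tree
node can route a point by comparing it to a centroid.

```python
from typing import NamedTuple, Tuple


class Range(NamedTuple):
    """Hypothetical non-empty half-open range [lower, upper)."""
    lower: float
    upper: float


def to_point(r: Range) -> Tuple[float, float]:
    # 2d-mapping: the range becomes the point (x=lower, y=upper).
    return (r.lower, r.upper)


def overlaps(a: Range, b: Range) -> bool:
    # Direct definition of overlap for non-empty half-open ranges.
    return a.lower < b.upper and b.lower < a.upper


def overlaps_2d(point: Tuple[float, float], query: Range) -> bool:
    # Same predicate expressed on the mapped point: the matching points
    # lie in the region x < query.upper and y > query.lower.
    x, y = point
    return x < query.upper and y > query.lower


def quadrant(point: Tuple[float, float], centroid: Tuple[float, float]) -> int:
    # Quad-tree style routing: pick one of four quadrants around a
    # centroid. The 1..4 numbering here is an assumption of this sketch.
    x, y = point
    cx, cy = centroid
    if x >= cx and y >= cy:
        return 1
    if x < cx and y >= cy:
        return 2
    if x < cx and y < cy:
        return 3
    return 4


ranges = [Range(1, 5), Range(5, 8), Range(0, 2)]
query = Range(4, 6)
hits = [r for r in ranges if overlaps_2d(to_point(r), query)]
```

A k-d tree would differ only in the routing step: instead of choosing among
four quadrants at once, each node splits on a single coordinate, alternating
between lower and upper bounds.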


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
