On 21.3.2014 08:23, Peter Geoghegan wrote:
> On Thu, Mar 13, 2014 at 3:39 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark <st...@mit.edu> wrote:
>>> It does sound like the main question here is which opclass should
>>> be the default. From the discussion there's a jsonb_hash_ops
>>> which works on all input values but supports fewer operators and
>>> a jsonb_ops which supports more operators but can't handle json
>>> with larger individual elements. Perhaps it's better to make
>>> jsonb_hash_ops the default so at least it's always safe to create
>>> a default gin index?
>> Personally, I don't think it's a good idea to change the default.
> I must admit that I'm coming around to the view that jsonb_hash_ops
> would make a better default. Its performance is superb, and I think
> there's a strong case to be made for that more than making up for it
> not supporting all indexable operators - the existence operators
> just aren't that useful in comparison.
I don't think that's how we should choose the default operator class.
Wouldn't an operator class supporting wider range of functionality be a
better fit, as we don't really know what are the users are going to do?
You might be right that existence operators are used less frequently
than conditions on values, how big the difference is? And do we gain
something by using jsonb_hash_ops by default in the end?
Say an application does one '?' query per 100 '@>' queries. If the
default opclass does not support '?' queries (forcing a seqscan), the
total duration may easily be much higher than with the default opclass.
I like that jsonb_hash_ops produces smaller indexes (~50% compared to
jsonb_ops on the delicious dataset), and that it's faster (2-5x on the
simple queries I've tried). But is that worth the risk?
Keeping jsonb_ops as the default seems better / safer to me.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: