"David E. Wheeler" <da...@justatheory.com> writes: > On May 6, 2014, at 2:20 PM, Bruce Momjian <br...@momjian.us> wrote: >> Well, then, we only have a few days to come up with a name.
> What are the options? We have no proposals as yet. I've been looking at the source code to try to understand the difference between the two opclasses (and BTW I concur with the opinions expressed recently about the poor state of the internal documentation for jsonb). If I've got it straight: jsonb_ops indexes keys and values separately, so for instance "{xyz: 2}" would give rise to GIN entries that are effectively the strings "Kxyz" and "V2". If you're looking for tuples containing "{xyz: 2}" then you would be looking for the AND of those independent index entries, which fortunately GIN is pretty good at computing. But you could also look for just keys or just values. jsonb_hash_ops creates an index entry only for values, but what it stores is a hash of both the value and the key it's stored under. So in this example you'd get a hash combining "xyz" and "2". This means the only type of query you can perform is like "find JSON tuples containing {xyz: 2}". Because jsonb_ops stores the *whole* value, you can do lossless index searches (no recheck needed on the heap tuple), but you also run the risk of long strings failing to fit into an index entry. Since jsonb_ops reduces everything to a hash, there's no possibility of index failure, but all queries are lossy and require recheck. TBH, at this point I'm sort of agreeing with the thought expressed upthread that maybe neither of these should be the default as-is. They seem like rather arbitrary combinations of choices. In particular I wonder why there's not an option to store keys and values separately, but as hashes not as the original strings, so that indexability of everything could be guaranteed. Or a variant of that might be to hash only strings that are too large to fit in an index entry, and force recheck only when searching for a string that needed hashing. I wonder whether the most effective use of time at this point wouldn't be to fix jsonb_ops to do that, rather than arguing about what to rename it to. If it didn't have the failure-for-long-strings problem I doubt anybody would be unhappy about making it the default. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers