Re: [HACKERS] jsonb status

Andrew Dunstan Mon, 17 Mar 2014 10:49:14 -0700


On 03/16/2014 04:10 AM, Peter Geoghegan wrote:

On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <[email protected]> wrote:

I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
finished by the time I am back on deck late tomorrow and that I am able to
commit this on Saturday.

I asked Andrew to hold off on committing this today. It was agreed
that we weren't quite ready, because there were one or two remaining
bugs (since fixed), but also because I felt that it would be useful to
first hear the opinions of more people before proceeding. I think that
we're not that far from having something committed. Obviously I hope
to get this into 9.4, and attach a lot of strategic importance to
having the feature, which is why I made a large effort to help land
it.


Attached patch has a number of notable revisions. Throughout, it has
been possible for anyone to follow our progress here:
https://github.com/feodor/postgres/commits/jsonb_and_hstore

* In general, the file jsonb_support.c (renamed to jsonb_util.c) is
vastly better commented, and has a much clearer structure. This was
not something I did much with in the previous revision, and so it has
been a definite focus of this one.

* Hashing is refactored to not use CRC32 anymore. I felt this was a
questionable method of hashing, both within jsonb_hash(), as well as
the jsonb_hash_ops GIN operator class.

* Dead code elimination.

* I got around to fixing the memory leaks in B-Tree support function one.

* Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
cast. One goal of this effort is to preserve a parallel set of
facilities for the json and jsonb types, and that includes
hstore-related features.

* A fix from Alexander for the jsonb_hash_ops @> operator issue I
complained about during the last submission was merged.

* There is no longer any GiST opclass. That just leaves B-Tree, hash,
GIN (default) and GIN jsonb_hash_ops opclasses.
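
To illustrate the hashing item above: the message does not say what replaced CRC32, so the following is only a sketch of one common alternative, where each element is hashed on its own and folded into a running 32-bit value by rotating and XORing. The function names and the choice of FNV-1a as the per-element hash are assumptions made for this example, not taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def fnv1a32(data: bytes) -> int:
    """32-bit FNV-1a hash (stand-in per-element hash, chosen for this sketch)."""
    h = 0x811C9DC5
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & MASK32
    return h


def combine(acc: int, h: int) -> int:
    """Rotate the accumulator left by one bit, then XOR in the new hash."""
    acc = ((acc << 1) | (acc >> 31)) & MASK32
    return acc ^ h


def hash_items(items) -> int:
    """Fold the hashes of a sequence of strings into one 32-bit value."""
    acc = 0
    for item in items:
        acc = combine(acc, fnv1a32(item.encode("utf-8")))
    return acc
```

The rotation makes the result depend on element order, which a plain XOR of the element hashes would not.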

My outstanding concerns are:

* Have we got things right with GIN indexing, containment semantics,
etc? See my remarks in the patch, by grepping "contain" within
jsonb_util.c. Is the GIN text storage serialization format appropriate
and correct?

* General design concerns. By far the largest source of these is the
file jsonb_util.c.

* Is the on-disk format that we propose to tie Postgres to as good as
it could be?
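
As a reference point for the containment question raised in the list above, here is a rough Python sketch of one plausible reading of nested containment. The rules are simplifying assumptions for discussion (objects match key by key, arrays ignore order and duplicates, scalars compare by equality); they are not lifted from jsonb_util.c and may differ from the patch.

```python
def _same_scalar(a, b):
    """Scalar equality that keeps JSON booleans distinct from numbers."""
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def contains(left, right):
    """Return True if JSON value `left` contains `right` (assumed rules)."""
    if isinstance(right, dict):
        # Every key on the right must exist on the left, with a value
        # that in turn contains the right-hand value.
        return isinstance(left, dict) and all(
            k in left and contains(left[k], v) for k, v in right.items()
        )
    if isinstance(right, list):
        # Every right-hand element must be contained by some left-hand
        # element; order and repetition are ignored.
        return isinstance(left, list) and all(
            any(contains(l, r) for l in left) for r in right
        )
    # Scalars: containment degenerates to equality.
    return _same_scalar(left, right)
```

Under these rules `contains([1, 2, [1, 3]], [[1, 3]])` holds, while `contains([1, 2, [1, 3]], [1, 3])` does not, because 3 is not an element of the outer array. Cases like this are the kind a reviewer of the semantics would want to settle.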

I've been working through all the changes and fixes that Peter and
others have made, and they look pretty good to me. There are a few
mostly cosmetic changes I want to make, but nothing that would be worth
holding up committing this for. I'm fairly keen to get this committed,
get some buildfarm coverage and get more people playing with it and
testing it.

Like Peter, I would like to see more comments from people on the GIN
support, especially.

The one outstanding significant question of substance I have is this:
given the commit 5 days ago of provision for triConsistent functions for
GIN opclasses, should we be adding these to the two GIN opclasses we are
providing, and what should they look like? Again, this isn't an issue
that I think needs to hold up committing what we have now.
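
To make the "what should they look like" question concrete, here is a small Python sketch of the three-valued logic a triConsistent function for a containment query might apply. It is an illustration only: the names `Tri` and `tri_consistent_contains` are invented for this example, and the assumption that a match always needs a recheck (because the index keys are lossy) is mine, not something stated in this thread.

```python
from enum import Enum


class Tri(Enum):
    FALSE = 0
    TRUE = 1
    MAYBE = 2


def tri_consistent_contains(check, lossy=True):
    """Ternary answer for a query that needs every extracted key present.

    `check` holds one Tri per query key: whether that key is known to be
    present, known to be absent, or not yet known for the candidate row.
    """
    # One key known to be absent rules the row out, whatever the rest say.
    if any(c is Tri.FALSE for c in check):
        return Tri.FALSE
    # Unknown keys, or an index that cannot prove a match on its own,
    # leave the answer open until the row itself is rechecked.
    if lossy or any(c is Tri.MAYBE for c in check):
        return Tri.MAYBE
    return Tri.TRUE
```

The useful property is the early `FALSE`: a candidate can be discarded without looking at the remaining keys.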

Regarding Peter's last question, if we're not satisfied with the on-disk
format proposed it would mean throwing the whole effort out and starting
again. The only thing I have thought of as an alternative would be to
store the structure and values separately rather than with values inline
with the structure. That way you could have a hash of values more or
less, which would eliminate redundancy of storage of things like object
field names. But such a structure might well involve at least as much
computational overhead as the current structure. And nobody's been
saying all along "hold on, we can do better than this." So I'm pretty
inclined to go with what we have.
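
For anyone who wants to picture the alternative described above, here is a toy Python sketch of storing structure and values separately. It is purely illustrative: the layout, the `split_structure` and `rebuild` names, and the use of a Python list as the value table are assumptions for this example, and nothing like it has been proposed as an actual on-disk format.

```python
def split_structure(doc):
    """Return (table, structure) for a JSON-like document.

    `table` holds each distinct key or scalar once; `structure` mirrors
    the document's shape but refers to the table by index.
    """
    table = []   # distinct values, in first-seen order
    index = {}   # (type name, value) -> position in table

    def intern(value):
        key = (type(value).__name__, value)
        if key not in index:
            index[key] = len(table)
            table.append(value)
        return index[key]

    def walk(node):
        if isinstance(node, dict):
            return {"obj": [(intern(k), walk(v)) for k, v in node.items()]}
        if isinstance(node, list):
            return {"arr": [walk(v) for v in node]}
        return intern(node)

    return table, walk(doc)


def rebuild(table, node):
    """Inverse of split_structure: resolve indexes back into values."""
    if isinstance(node, dict):
        if "obj" in node:
            return {table[k]: rebuild(table, v) for k, v in node["obj"]}
        return [rebuild(table, v) for v in node["arr"]]
    return table[node]
```

A repeated field name is stored once, which is the saving mentioned above, but every access now goes through an extra table lookup, which is one form the computational overhead could take.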


cheers

andrew






