Kevin Grittner <kgri...@ymail.com> writes:
>> Stephen Frost <sfr...@snowman.net> writes:
>>> Trying to move the header to the end just for the sake of this
>>> doesn't strike me as a good solution as it'll make things quite
>>> a bit more complicated.
> Why is that?  How much harder would it be to add a single offset
> field to the front to point to the part we're shifting to the end?
> It is not all that unusual to put a directory at the end, like in
> the .zip file format.

Yeah, I was wondering that too.  Arguably, directory-at-the-end would
be easier to work with for on-the-fly creation, not that we do any
such thing at the moment.  I think the main thing that's bugging
Stephen is that doing that just to make pglz_compress happy seems like
a kluge (and I have to agree).

Here's a possibly more concrete thing to think about: we may very well
someday want to support JSONB object field or array element extraction
without reading all blocks of a large toasted JSONB value, if the value
is stored external without compression.  We already went to the trouble
of creating analogous logic for substring extraction from a long
uncompressed text or bytea value, so I think this is a plausible future
desire.

With the current format you could imagine grabbing the first TOAST
chunk, and then if you see the header is longer than that you can grab
the remainder of the header without any wasted I/O, and for the
array-subscripting case you'd now have enough info to fetch the element
value from the body of the JSONB without any wasted I/O.  With
directory-at-the-end you'd have to read the first chunk just to get the
directory pointer, and this would most likely not give you any of the
directory proper; but at least you'd know exactly how big the directory
is before you go to read it in.  The former case is probably slightly
better.

However, if you're doing an object key lookup, not an array element
fetch, neither of these formats is really friendly at all, because
each binary-search probe probably requires bringing in one or two toast
chunks from the body of the JSONB value so you can look at the key text.
I'm not sure if there's a way to redesign the format to make that less
painful/expensive --- but certainly, having the key texts scattered
through the JSONB value doesn't seem like a great thing from this
standpoint.

			regards, tom lane
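
As a rough illustration of the I/O argument above (this is not code from
PostgreSQL), the following Python sketch counts which TOAST chunks each
layout would touch for an array-element fetch, plus the worst-case number
of binary-search probes for a key lookup.  The 1996-byte chunk size, the
function names, and all the example sizes are assumptions made up for the
illustration; offsets are treated as absolute byte positions within the
uncompressed, externally stored value.

```python
CHUNK_SIZE = 1996  # assumed chunk payload size; the real value is build-dependent


def chunks_for_range(start, length, chunk_size=CHUNK_SIZE):
    """Return the set of chunk numbers covering bytes [start, start + length)."""
    if length <= 0:
        return set()
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return set(range(first, last + 1))


def header_first_fetch(header_len, elem_off, elem_len, chunk_size=CHUNK_SIZE):
    """Chunks read when the header (directory) is at the front of the value."""
    needed = {0}                                                # first chunk
    needed |= chunks_for_range(0, header_len, chunk_size)       # rest of header
    needed |= chunks_for_range(elem_off, elem_len, chunk_size)  # element value
    return needed


def directory_at_end_fetch(total_len, dir_len, elem_off, elem_len,
                           chunk_size=CHUNK_SIZE):
    """Chunks read when only an offset pointer is at the front and the
    directory occupies the last dir_len bytes of the value."""
    needed = {0}                                                # pointer only
    needed |= chunks_for_range(total_len - dir_len, dir_len, chunk_size)
    needed |= chunks_for_range(elem_off, elem_len, chunk_size)  # element value
    return needed


def worst_case_probes(n_keys):
    """Worst-case comparisons for a binary search over n_keys sorted keys
    (floor(log2(n)) + 1 for n >= 1); each probe may touch 1 or 2 chunks."""
    return n_keys.bit_length()


if __name__ == "__main__":
    front = header_first_fetch(header_len=3000, elem_off=500_000, elem_len=100)
    back = directory_at_end_fetch(total_len=1_000_000, dir_len=3000,
                                  elem_off=500_000, elem_len=100)
    print("header first:     ", sorted(front))
    print("directory at end: ", sorted(back))
    print("probes for 1000 keys:", worst_case_probes(1000))
```

With these made-up sizes the header-first layout touches 3 chunks and
directory-at-the-end touches 5: the extra reads are the first chunk, which
yields only the pointer, and one more chunk because the directory happens
to straddle chunk boundaries.  A 1000-key object needs up to 10 probes,
each of which may land in a different part of the body.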