Josh Berkus <j...@agliodbs.com> writes:
> On 08/26/2014 11:40 AM, Tom Lane wrote:
>> I was hoping you'd get some useful data from that, but so far it seems
>> like a rehash of points made in the on-list thread :-(
> Unfortunately even the outside commentors don't seem to understand that
> storage size *is* related to speed, it's exchanging I/O speed for CPU speed.

Yeah, exactly. Given current hardware trends, data compression is
becoming more of a win not less as time goes on: CPU cycles are cheap
even compared to main memory access, let alone mass storage. So I'm
thinking we want to adopt a compression-friendly data format even if
it measures out as a small loss currently.

I wish it were cache-friendly too, per the upthread tangent about having
to fetch keys from all over the place within a large JSON object.

... and while I was typing that sentence, lightning struck. The existing
arrangement of object subfields with keys and values interleaved is just
plain dumb. We should rearrange that as all the keys in order, then all
the values in the same order. Then the keys are naturally adjacent in
memory and object-key searches become much more cache-friendly: you
probably touch most of the key portion of the object, but none of the
values portion, until you know exactly what part of the latter to fetch.
This approach might complicate the lookup logic marginally but I bet not
very much; and it will be a huge help if we ever want to do smart access
to EXTERNAL (non-compressed) JSON values.
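
For illustration only, here is a minimal sketch of the two arrangements described above, using a flat Python list as a stand-in for the on-disk object. The function names and the list-of-slots model are assumptions made for this example; they are not the actual JSONB representation or any PostgreSQL code.

```python
from bisect import bisect_left


def interleaved(obj):
    """Existing arrangement: key0, value0, key1, value1, ... (keys sorted)."""
    slots = []
    for key in sorted(obj):
        slots.append(key)
        slots.append(obj[key])
    return slots


def keys_then_values(obj):
    """Proposed arrangement: all keys in order, then all values in the same order."""
    keys = sorted(obj)
    return keys + [obj[k] for k in keys]


def lookup_keys_then_values(slots, key):
    """Binary-search the key region only, then fetch exactly one value slot."""
    n = len(slots) // 2                  # first n slots are keys, last n are values
    i = bisect_left(slots, key, 0, n)    # search is confined to the key region
    if i < n and slots[i] == key:
        return slots[n + i]              # value i sits n slots after key i
    return None


obj = {"b": 2, "a": 1, "c": 3}
print(interleaved(obj))
print(keys_then_values(obj))
print(lookup_keys_then_values(keys_then_values(obj), "b"))
```

In the second arrangement the search never reads a slot at index n or above until the matching key is found, which is the property that makes the key portion compact and the value fetch a single targeted access.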

I will go prototype that just to see how much code rearrangement is

			regards, tom lane

Sent via pgsql-hackers mailing list (firstname.lastname@example.org)