On Tue, Sep 16, 2014 at 1:11 PM, Josh Berkus <j...@agliodbs.com> wrote:
>>> Well, I can only judge from the use cases I personally have, none of
>>> which involve more than 100 keys at any level for most rows.  So far
>>> I've seen some people argue hypotetical use cases involving hundreds of
>>> keys per level, but nobody who *actually* has such a use case.
>>
>> I already told you that I did, and that it was the only and only app I
>> had written for JSONB.
>
> Ah, ok, I thought yours was a test case.  Did you check how it performed
> on the two patches at all?  My tests with 185 keys didn't show any
> difference, including for a "last key" case.

No, I didn't test it.   But I think Heikki's test results pretty much
tell us everything there is to see here.  This isn't really that
complicated; I've read a few papers on index compression over the
years and they seem to often use techniques that have the same general
flavor as what Heikki did here, adding complexity in the data format
to gain other advantages.  So I don't think we should be put off.

Basically, I think that if we make a decision to use Tom's patch
rather than Heikki's patch, we're deciding that the initial decision,
by the folks who wrote the original jsonb code, to make array access
less than O(n) was misguided.  While that could be true, I'd prefer to
bet that those folks knew what they were doing.  The only way reason
we're even considering changing it is that the array of lengths
doesn't compress well, and we've got an approach that fixes that
problem while preserving the advantages of fast lookup.  We should
have a darn fine reason to say no to that approach, and "it didn't
benefit my particular use case" is not it.

In practice, I'm not very surprised that the impact doesn't seem too
bad when you're running SQL queries from the client.  There's so much
other overhead, for de-TOASTing and client communication and even just
planner and executor costs, that this gets lost in the noise.  But
think about a PL/pgsql procedure, say, where somebody might loop over
all of the elements in array.  If those operations go from O(1) to
O(n), then the loop goes from O(n) to O(n^2).  I will bet you a
beverage of your choice that somebody will find that behavior within a
year of release and be dismayed by it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to