On Tue, Sep 16, 2014 at 1:11 PM, Josh Berkus <j...@agliodbs.com> wrote: >>> Well, I can only judge from the use cases I personally have, none of >>> which involve more than 100 keys at any level for most rows. So far >>> I've seen some people argue hypotetical use cases involving hundreds of >>> keys per level, but nobody who *actually* has such a use case. >> >> I already told you that I did, and that it was the only and only app I >> had written for JSONB. > > Ah, ok, I thought yours was a test case. Did you check how it performed > on the two patches at all? My tests with 185 keys didn't show any > difference, including for a "last key" case.
No, I didn't test it. But I think Heikki's test results pretty much tell us everything there is to see here. This isn't really that complicated; I've read a few papers on index compression over the years and they seem to often use techniques that have the same general flavor as what Heikki did here, adding complexity in the data format to gain other advantages. So I don't think we should be put off. Basically, I think that if we make a decision to use Tom's patch rather than Heikki's patch, we're deciding that the initial decision, by the folks who wrote the original jsonb code, to make array access less than O(n) was misguided. While that could be true, I'd prefer to bet that those folks knew what they were doing. The only way reason we're even considering changing it is that the array of lengths doesn't compress well, and we've got an approach that fixes that problem while preserving the advantages of fast lookup. We should have a darn fine reason to say no to that approach, and "it didn't benefit my particular use case" is not it. In practice, I'm not very surprised that the impact doesn't seem too bad when you're running SQL queries from the client. There's so much other overhead, for de-TOASTing and client communication and even just planner and executor costs, that this gets lost in the noise. But think about a PL/pgsql procedure, say, where somebody might loop over all of the elements in array. If those operations go from O(1) to O(n), then the loop goes from O(n) to O(n^2). I will bet you a beverage of your choice that somebody will find that behavior within a year of release and be dismayed by it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers