On Thu, Sep 11, 2014 at 10:01 PM, Josh Berkus <j...@agliodbs.com> wrote:
> So, I finally got time to test Tom's latest patch on this. > > TLDR: we want to go with Tom's latest patch and release beta3. > > Figures: > > So I tested HEAD against the latest lengths patch. Per Arthur Silva, I > checked uncompressed times for JSONB against compressed times. This > changed the picture considerably. > > TABLE SIZES > ----------- > > HEAD > > ?column? | pg_size_pretty > ---------------------+---------------- > json text format | 393 MB > jsonb: compressed | 1147 MB > jsonb: uncompressed | 1221 MB > > PATCHED > > ?column? | pg_size_pretty > ---------------------+---------------- > json text format | 394 MB > jsonb: compressed | 525 MB > jsonb: uncompressed | 1200 MB > > > EXTRACTION TIMES > ---------------- > > HEAD > > Q1 (search via GIN index followed by extracting 100,000 values from rows): > > jsonb compressed: 4000 > jsonb uncompressed: 3250 > > > Q2 (seq scan and extract 200,000 values from rows): > > json: 11700 > jsonb compressed: 3150 > jsonb uncompressed: 2700 > > > PATCHED > > Q1: > > jsonb compressed: 6750 > jsonb uncompressed: 3350 > > Q2: > > json: 11796 > jsonb compressed: 4700 > jsonb uncompressed: 2650 > > ---------------------- > > Conclusion: with Tom's patch, compressed JSONB is 55% smaller when > compressed (EXTENDED). Extraction times are 50% to 70% slower, but this > appears to be almost entirely due to decompression overhead. When not > compressing (EXTERNAL), extraction times for patch versions are > statistically the same as HEAD, and file sizes are similar to HEAD. > > USER REACTION > ------------- > > I polled at both PDXpgDay and at FOSS4G, asking some ~~ 80 Postgres > users how they would feel about a compression vs. extraction time > tradeoff. The audience was evenly split. > > However, with the current patch, the user can choose. Users who know > enough for performance tuning can set JSONB columns to EXTERNAL, and the > the same performance as the unpatched version. > > -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > The compression ratio difference is exaggerated in this case but it does support that Tom's patch alleviates the extraction penalty. In my testings with the github archive data the savings <-> performance-penalty was fine, but I'm not confident in those results since there were only 8 top level keys. For comparison, some twitter api objects[1] have 30+ top level keys. If I have time in the next couple of days I'll conduct some testings with the public twitter fire-hose data. [1] https://dev.twitter.com/rest/reference/get/statuses/home_timeline