On Thu, Sep 11, 2014 at 10:01 PM, Josh Berkus <j...@agliodbs.com> wrote:

> So, I finally got time to test Tom's latest patch on this.
>
> TLDR: we want to go with Tom's latest patch and release beta3.
>
> Figures:
>
> So I tested HEAD against the latest lengths patch.  Per Arthur Silva, I
> checked uncompressed times for JSONB against compressed times.  This
> changed the picture considerably.
>
> TABLE SIZES
> -----------
>
> HEAD
>
>       ?column?       | pg_size_pretty
> ---------------------+----------------
>  json text format    | 393 MB
>  jsonb: compressed   | 1147 MB
>  jsonb: uncompressed | 1221 MB
>
> PATCHED
>
>       ?column?       | pg_size_pretty
> ---------------------+----------------
>  json text format    | 394 MB
>  jsonb: compressed   | 525 MB
>  jsonb: uncompressed | 1200 MB
>
>
> EXTRACTION TIMES
> ----------------
>
> HEAD
>
> Q1 (search via GIN index followed by extracting 100,000 values from rows):
>
> jsonb compressed: 4000
> jsonb uncompressed: 3250
>
>
> Q2 (seq scan and extract 200,000 values from rows):
>
> json: 11700
> jsonb compressed: 3150
> jsonb uncompressed: 2700
>
>
> PATCHED
>
> Q1:
>
> jsonb compressed: 6750
> jsonb uncompressed: 3350
>
> Q2:
>
> json: 11796
> jsonb compressed: 4700
> jsonb uncompressed: 2650
>
> ----------------------
>
> Conclusion: with Tom's patch, compressed JSONB is 55% smaller when
> compressed (EXTENDED).  Extraction times are 50% to 70% slower, but this
> appears to be almost entirely due to decompression overhead.  When not
> compressing (EXTERNAL), extraction times for patch versions are
> statistically the same as HEAD, and file sizes are similar to HEAD.
>
> USER REACTION
> -------------
>
> I polled at both PDXpgDay and at FOSS4G, asking some ~~ 80 Postgres
> users how they would feel about a compression vs. extraction time
> tradeoff.  The audience was evenly split.
>
> However, with the current patch, the user can choose.  Users who know
> enough for performance tuning can set JSONB columns to EXTERNAL, and the
> the same performance as the unpatched version.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


The compression ratio difference is exaggerated in this case but it does
support that Tom's patch alleviates the extraction penalty.

In my testings with the github archive data the savings <->
performance-penalty was fine, but I'm not confident in those results since
there were only 8 top level keys.
For comparison, some twitter api objects[1] have 30+ top level keys. If I
have time in the next couple of days I'll conduct some testings with the
public twitter fire-hose data.

[1] https://dev.twitter.com/rest/reference/get/statuses/home_timeline

Reply via email to