What data are you using right now, Josh?

There's the GitHub Archive: http://www.githubarchive.org/
Here's some sample data: https://gist.github.com/igrigorik/2017462

--
Arthur Silva

On Wed, Aug 20, 2014 at 6:09 PM, Josh Berkus <j...@agliodbs.com> wrote:

> On 08/20/2014 08:29 AM, Tom Lane wrote:
> > Josh Berkus <j...@agliodbs.com> writes:
> >> On 08/15/2014 04:19 PM, Tom Lane wrote:
> >>> Personally I'd prefer to go to the all-lengths approach, but a large
> >>> part of that comes from a subjective assessment that the hybrid approach
> >>> is too messy. Others might well disagree.
> >
> >> ... So, that extraction test is about 1% *slower* than the basic Tom Lane
> >> lengths-only patch, and still 80% slower than original JSONB. And it's
> >> the same size as the lengths-only version.
> >
> > Since it's looking like this might be the direction we want to go, I took
> > the time to flesh out my proof-of-concept patch. The attached version
> > takes care of cosmetic issues (like fixing the comments), and includes
> > code to avoid O(N^2) penalties in findJsonbValueFromContainer and
> > JsonbIteratorNext. I'm not sure whether those changes will help
> > noticeably on Josh's test case; for me, they seemed worth making, but
> > they do not bring the code back to full speed parity with the all-offsets
> > version. But as we've been discussing, it seems likely that those costs
> > would be swamped by compression and I/O considerations in most scenarios
> > with large documents; and of course for small documents it hardly matters.
>
> Table sizes and extraction times are unchanged from the prior patch
> based on my workload.
>
> We should be comparing all-lengths vs length-and-offset maybe using
> another workload as well ...
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
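
For anyone following the O(N^2) remark above, here is a toy sketch of where that kind of cost can come from when a container stores only element lengths. It is an illustration in Python, not the jsonb code: the function names are hypothetical, and I am assuming the penalty arises from re-summing the preceding lengths every time an element's offset is needed, which a scan can avoid by carrying a running total.

```python
def offsets_naive(lengths):
    """Offset of each element, re-summing all earlier lengths every time.

    Each lookup is O(i), so computing all N offsets is O(N^2).
    """
    return [sum(lengths[:i]) for i in range(len(lengths))]


def offsets_running(lengths):
    """Same offsets, but carrying a running total across the scan: O(N)."""
    offsets = []
    pos = 0
    for n in lengths:
        offsets.append(pos)  # this element starts where the previous one ended
        pos += n
    return offsets


# Hypothetical element lengths, in bytes, for a four-element container.
lengths = [3, 5, 2, 7]
assert offsets_naive(lengths) == offsets_running(lengths)
```

With stored offsets the lookup is a direct read, which is presumably why the all-offsets layout stays faster on extraction even after this kind of fix.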