> For Vertical Bit-Parallel (VBP), I think the reason why I didn't think it
> would be useful for Parquet is that it is really expensive to produce and
> really expensive to reconstruct values that aren't filtered out.
Julien, this would be a thing I think the list would love to hear from Jignesh
> For Vertical Bit-Parallel (VBP), I think the reason why I didn't think it
> would be useful for Parquet is that it is really expensive to produce and
> really expensive to reconstruct values that aren't filtered out.
Yes - you can see in Figure 12(a) that the aggregation time went up for the
I looked into this a while ago. Assuming that I remember correctly, the
conclusion I came to was that Horizontal Bit-Parallel (HBP) might be
helpful, but the vertical option was probably not appropriate.
HBP would allow Parquet readers to run predicates on multiple values at
once without needing
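
To make "predicates on multiple values at once" concrete, here is a minimal sketch rather than the paper's exact algorithm or anything Parquet implements. It assumes an HBP-style layout in which each k-bit code sits in a (k+1)-bit field with a spare delimiter bit on top, and it uses a generic word-level subtraction trick to evaluate `value < constant` for every field in one pass. The function names (`pack_hbp`, `delimiter_mask`, `less_than`, `result_bits`) are hypothetical and exist only for this illustration.

```python
def pack_hbp(values, k, word_bits=64):
    """Pack k-bit codes into one integer word.

    Each code occupies a (k+1)-bit field; the top bit of every field
    (the delimiter bit) is left as 0. This layout is an assumption
    made for illustration.
    """
    field = k + 1
    assert len(values) <= word_bits // field, "too many values for one word"
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << k), "value does not fit in k bits"
        word |= v << (i * field)
    return word


def delimiter_mask(n, k):
    """Mask with only the delimiter bit set in each of n fields."""
    field = k + 1
    mask = 0
    for i in range(n):
        mask |= 1 << (i * field + k)
    return mask


def less_than(word, constant, n, k):
    """Evaluate value < constant for all n fields at once.

    Setting every delimiter bit and subtracting the packed constant
    leaves a delimiter bit set exactly where value >= constant, and
    no borrow crosses a field boundary. Flipping those bits gives the
    fields where value < constant.
    """
    h = delimiter_mask(n, k)
    c = pack_hbp([constant] * n, k)
    greater_or_equal = ((word | h) - c) & h
    return greater_or_equal ^ h


def result_bits(mask, n, k):
    """Unpack the per-field delimiter bits into a list of booleans."""
    field = k + 1
    return [bool((mask >> (i * field + k)) & 1) for i in range(n)]


values = [1, 5, 3, 7, 0]
word = pack_hbp(values, k=3)
matches = result_bits(less_than(word, 4, n=len(values), k=3), len(values), 3)
```

The point of the sketch is that the comparison costs one OR, one subtract, one AND and one XOR per word, however many values the word holds, and no value is unpacked.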
On 2018/10/08 22:08:16, Julien Le Dem wrote:
> it's a variation of bit packing. right?
I looked into it on
https://github.com/apache/parquet-format/blob/master/Encodings.md and I believe
that the Horizontal Bit-Parallel encoding in the paper is a variant on bit
packing. There are three
If you want (and if you don't already know him) I'm happy to ask Jignesh if
he wants an intro.
I think he would be happy to tell you about it.
On Mon, Oct 8, 2018 at 4:04 PM Jim Apple wrote:
> > That sounds like an interesting possibility. It's not that fresh in my mind
> > but I'd say from
> That sounds like an interesting possibility. It's not that fresh in my mind
> but I'd say from the storage perspective it's a variation of bit packing.
> right?
I'm not familiar with bit packing, so I'd have to look into that. I found the
paper readable enough at the time that I didn't end up
Hi Jim,
I remember chatting with Jignesh Patel about it at the time.
Since his company Locomatix was acquired by Twitter, we had him as an
adviser for some time.
That sounds like an interesting possibility. It's not that fresh in my mind
but I'd say from the storage perspective it's a variation of
The BitWeaving paper from a few years ago demonstrates some large performance
wins in predicate evaluation based partially on reconfiguring the storage
layout:
http://pages.cs.wisc.edu/~jignesh/publ/BitWeaving.pdf
Is it technically possible for Parquet to support "Vertical Bit-Parallel"