Hi Evan,
> Hope everyone is staying safe!
Thanks, you too!
> A fairly substantial amount of CPU is needed for translating from Parquet;
> main memory bandwidth becomes a factor. Thus, it seems speed and
> constraining factors vary widely by application.
I agree performance is going to be applic[...]
Hi Micah,
Hope everyone is staying safe!
> On Mar 16, 2020, at 9:41 PM, Micah Kornfield wrote:
>
> I feel a little uncomfortable with the fact that there isn't a more clearly
> defined dividing line for what belongs in Arrow and what doesn't. I suppose
> this is what discussions like these are for.
>>>> Hey Evan,
>>>>
>>>> Thank you for the interest.
>>>>
>>>> There has been some effort for compressin[...]pressor, such as ZSTD,
>>>> LZ4, etc, is used. It only works well for high-entropy floating-point
>>>> data, somewhere at least as large as >= 15 bits of entropy per element.
>>>> I suppose the encoding might actually also make sense for high-entropy
>>>> integer data but I am not super sure.
>>>> For low-entropy data, the dictionary encoding is good though I suspect
>>>> there can be room for performance improvements.
>>>> This is my final report for the encoding here:
>>>> https://github.com/martinradev/
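Martin's ">= 15 bits of entropy per element" threshold can be checked empirically before picking an encoding. A rough stdlib sketch (illustrative only, not from Martin's report; summing per-byte-plane entropy is a cheap approximation that ignores cross-byte correlation):

```python
import math
import random
import struct
from collections import Counter

def entropy_bits(symbols):
    """Empirical Shannon entropy of a byte string, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def float64_entropy_per_element(values):
    """Rough bits-of-entropy-per-element estimate for a float64 column:
    sum the empirical entropy of each of the 8 byte planes."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return sum(entropy_bits(raw[i::8]) for i in range(8))

# Low-entropy column: two repeated readings -> byte planes are repetitive.
low = [1.5, 2.5] * 512
# High-entropy column: pseudo-random floats -> mantissa bytes look random.
random.seed(0)
high = [random.random() for _ in range(1024)]

print(float64_entropy_per_element(low))   # well under 15 bits per element
print(float64_entropy_per_element(high))  # well over 15 bits per element
```

A column scoring far below the threshold is a candidate for dictionary or run-length style encodings; far above it, a byte shuffle plus a general-purpose compressor is about the best one can do.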
Hi,

On 11/03/2020 at 06:31, Micah Kornfield wrote:
>
> I still think we should be careful on what is added to the spec, in
> particular, we should be focused on encodings that can be used to improve
> computational efficiency rather than just smaller size. Also, it is
> important to note that [...]
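The "computational efficiency rather than just smaller size" point is often made with run-length encoding: aggregates can be computed directly over the runs without materializing the values. A minimal illustrative sketch (not code from the thread):

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(v, sum(1 for _ in g)) for v, g in groupby(values)]

def rle_sum(runs):
    """Sum directly over the runs: O(number of runs) work instead of
    O(number of values) -- the kind of compute win Micah describes."""
    return sum(v * n for v, n in runs)

data = [7, 7, 7, 7, 0, 0, 3, 3, 3]
runs = rle_encode(data)
print(runs)           # [(7, 4), (0, 2), (3, 3)]
print(rle_sum(runs))  # 37, same as sum(data)
```

By contrast, a generic block compressor such as ZSTD shrinks bytes but forces a full decompression pass before any kernel can run, which is the dividing line being argued here.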
>> [...]igation turned out to be quite the same solution as the one in
>> https://github.com/powturbo/Turbo-Transpose.
>>
>> Maybe the points I sent can be helpful.
>>
>> Kind regards,
>>
>> Martin
>>
>> ________________________________
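For readers who don't want to chase the link: the trick shared by Martin's investigation and Turbo-Transpose is byte-plane transposition ahead of a general-purpose compressor. A simplified, illustrative reconstruction (function names are mine, not from either codebase):

```python
import struct

def transpose_bytes(values, width=8):
    """Byte-plane shuffle: byte 0 of every float64 first, then byte 1,
    and so on. Similar bytes (signs, exponents, high mantissa bytes)
    end up adjacent, which tends to help LZ-style compressors."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return b"".join(raw[i::width] for i in range(width))

def untranspose_bytes(blob, width=8):
    """Invert transpose_bytes losslessly."""
    n = len(blob) // width
    raw = bytearray(len(blob))
    for i in range(width):
        raw[i::width] = blob[i * n:(i + 1) * n]
    return list(struct.unpack(f"<{n}d", bytes(raw)))

data = [0.1 * i for i in range(1000)]
blob = transpose_bytes(data)
assert untranspose_bytes(blob) == data  # lossless round trip
```

In the scheme Martin describes, the transposed buffer would then be handed to ZSTD, LZ4, or similar; the shuffle itself compresses nothing.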
________________________________
From: evan_c...@apple.com on behalf of Evan Chan
Sent: Tuesday, March 10, 2020 5:15:48 AM
To: dev@arrow.apache.org
Subject: Summary of RLE and other compression efforts?
Hi folks,
I’m curious about the state of efforts for more compressed encodings in the
Arrow columnar format. I saw discussions previously about RLE, but is there a
place to summarize all of the different efforts that are ongoing to bring more
compressed encodings?
Is there an effort to compre[...]
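For context on what the format already offers: dictionary encoding, which Martin recommends above for low-entropy data, replaces repeated values with small integer indices into a table of distinct values. A toy sketch of the concept (Arrow's actual DictionaryArray stores the index buffer in a narrow integer type; this only shows the idea):

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer index, returning
    (dictionary, indices). Repeated values cost one index each
    instead of a full copy."""
    dictionary = []
    seen = {}
    indices = []
    for v in values:
        if v not in seen:
            seen[v] = len(dictionary)
            dictionary.append(v)
        indices.append(seen[v])
    return dictionary, indices

cities = ["NYC", "SFO", "NYC", "NYC", "SFO"]
d, idx = dictionary_encode(cities)
print(d)    # ['NYC', 'SFO']
print(idx)  # [0, 1, 0, 0, 1]
```

Decoding is a plain gather, `[d[i] for i in idx]`, which is also why many kernels can operate on the indices directly without decoding.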