Are the read times still measured with memory mapping in the uncompressed case? If so, impressive!
Regards

Antoine.

On 01/04/2020 at 16:44, Wes McKinney wrote:
> Several pieces of work got done in the last few days:
>
> * Changing from LZ4 raw to the LZ4 frame format (what is recommended for interoperability)
> * Parallelizing both compression and decompression at the field level
>
> Here are the results (using 8 threads on an 8-core laptop). I disabled the "memory map" feature so that in the uncompressed case all of the data must be read off disk into memory. This helps illustrate the compression/IO tradeoff in wall clock load times.
>
> File size (only LZ4 may be different): https://ibb.co/CP3VQkp
> Read time: https://ibb.co/vz9JZMx
> Write time: https://ibb.co/H7bb68T
>
> In summary, now with multicore compression and decompression, LZ4-compressed files are faster both to read and to write even on a very fast SSD, as are ZSTD-compressed files with a low ZSTD compression level. I didn't notice a major difference between the LZ4 raw and LZ4 frame formats. The reads and writes could be made faster still by pipelining / making concurrent the disk read/write and compression/decompression steps -- the current implementation performs these tasks serially. We can improve this in the near future.
>
> I'll update the Format proposal this week so we can move toward something we can vote on.
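[Editor's note] The field-level parallelism described above can be sketched with a thread pool that compresses each field's buffer independently. This is a toy illustration, not Arrow code: the function names are made up, and zlib stands in for LZ4/ZSTD (whose Python bindings are third-party).

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_buffers_parallel(buffers, n_threads=8):
    """Compress each field's buffer independently so a thread pool can
    work on all fields at once (zlib releases the GIL on large inputs)."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(zlib.compress, buffers))

def decompress_buffers_parallel(buffers, n_threads=8):
    """Mirror of the above for the read path."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(zlib.decompress, buffers))

# One buffer per field, as in a record batch body
fields = [bytes(8192) for _ in range(31)]
compressed = compress_buffers_parallel(fields)
assert decompress_buffers_parallel(compressed) == fields
```

Because each buffer is an independent unit, the same structure also allows the pipelining Wes mentions (overlapping disk I/O with compression) by submitting work as buffers arrive rather than in one batch.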
> I would recommend that we await implementations and integration tests for this before releasing this as stable, in line with prior discussions about adding stuff to the IPC protocol.
>
> On Thu, Mar 26, 2020 at 4:57 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> Here are the results:
>>
>> File size: https://ibb.co/71sBsg3
>> Read time: https://ibb.co/4ZncdF8
>> Write time: https://ibb.co/xhNkRS2
>>
>> Code: https://github.com/wesm/notebooks/blob/master/20190919file_benchmarks/FeatherCompression.ipynb
>> (based on https://github.com/apache/arrow/pull/6694)
>>
>> High level summary:
>>
>> * Chunksize 1024 vs. 64K has relatively limited impact on file sizes.
>>
>> * Wall clock read time is impacted by chunksize, maybe a 30-40% difference between 1K-row chunks and 16K-row chunks. One notable thing is that you can see clearly the overhead associated with IPC reconstruction even when the data is memory mapped. For example, in the Fannie Mae dataset there are 21,661 batches (each batch has 31 fields) when the chunksize is 1024. So a read time of 1.3 seconds indicates ~60 microseconds of overhead for each record batch. When you consider the amount of business logic involved with reconstructing a record batch, 60 microseconds is pretty good. This also shows that every microsecond counts and we need to be carefully tracking microperformance in this critical operation.
>>
>> * Small chunksize results in higher write times for "expensive" codecs like ZSTD with a high compression ratio. For "cheap" codecs like LZ4 it doesn't make as much of a difference.
>>
>> * Note that the LZ4 compressor results in faster wall clock time to disk, presumably because the compression speed is faster than my SSD's write speed.
>>
>> Implementation notes:
>> * There is no parallelization or pipelining of reads or writes.
>> For example, on write, all of the buffers are compressed with a single thread and then compression stops until the write to disk completes. On read, buffers are decompressed serially.
>>
>> On Thu, Mar 26, 2020 at 12:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>
>>> I'll run a grid of batch sizes (from 1024 to 64K or 128K) and let you know the read/write times and compression ratios. Shouldn't take too long.
>>>
>>> On Wed, Mar 25, 2020 at 10:37 PM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>
>>>> Thanks a lot for sharing the good results.
>>>>
>>>> As investigated by Wes, we have an existing zstd library for Java (zstd-jni) [1], and an lz4 library for Java (lz4-java) [2].
>>>> +1 for the 1024 batch size, as it represents an important scenario where the batch fits into the L1 cache (IMO).
>>>>
>>>> Best,
>>>> Liya Fan
>>>>
>>>> [1] https://github.com/luben/zstd-jni
>>>> [2] https://github.com/lz4/lz4-java
>>>>
>>>> On Thu, Mar 26, 2020 at 2:38 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>
>>>>> If it isn't hard, could you run with batch sizes of 1024 or 2048 records? I think there was a question previously raised about whether there is a benefit to smaller buffer sizes.
>>>>>
>>>>> Thanks,
>>>>> Micah
>>>>>
>>>>> On Wed, Mar 25, 2020 at 8:59 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>
>>>>>> On Tue, Mar 24, 2020 at 9:22 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Compression ratios ranging from ~50% with LZ4 and ~75% with ZSTD on the Taxi dataset to ~87% with LZ4 and ~90% with ZSTD on the Fannie Mae dataset. So that's a huge space savings
>>>>>>>
>>>>>>> One more question on this. What was the average row-batch size used? I see in the proposal some buffers might not be compressed; did you use this feature in the test?
>>>>>>
>>>>>> I used a 64K row batch size.
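[Editor's note] The ~60 microsecond per-batch overhead quoted earlier in the thread is just the total read time divided by the batch count:

```python
# Back-of-envelope check of the per-batch IPC reconstruction overhead:
# 21,661 batches read in ~1.3 s implies ~60 microseconds per batch.
n_batches = 21_661
read_time_s = 1.3
overhead_us = read_time_s / n_batches * 1e6
print(round(overhead_us))  # 60
```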
>>>>>> I haven't implemented the optional non-compressed buffers (for cases where there is little space savings), so everything is compressed. I can check different batch sizes if you like.
>>>>>>
>>>>>>> On Mon, Mar 23, 2020 at 4:40 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>
>>>>>>>> hi folks,
>>>>>>>>
>>>>>>>> Sorry it's taken me a little while to produce supporting benchmarks.
>>>>>>>>
>>>>>>>> * I implemented experimental trivial body buffer compression in https://github.com/apache/arrow/pull/6638
>>>>>>>> * I hooked up the Arrow IPC file format with compression as the new Feather V2 format in https://github.com/apache/arrow/pull/6694#issuecomment-602906476
>>>>>>>>
>>>>>>>> I tested a couple of real-world datasets from a prior blog post https://ursalabs.org/blog/2019-10-columnar-perf/ with the ZSTD and LZ4 codecs.
>>>>>>>>
>>>>>>>> The complete results are here: https://github.com/apache/arrow/pull/6694#issuecomment-602906476
>>>>>>>>
>>>>>>>> Summary:
>>>>>>>>
>>>>>>>> * Compression ratios ranging from ~50% with LZ4 and ~75% with ZSTD on the Taxi dataset to ~87% with LZ4 and ~90% with ZSTD on the Fannie Mae dataset. So that's a huge space savings.
>>>>>>>> * Single-threaded decompression times exceeding 2-4 GByte/s with LZ4 and 1.2-3 GByte/s with ZSTD.
>>>>>>>>
>>>>>>>> I would have to do some more engineering to test throughput changes with Flight, but given these results, on slower networking (e.g. 1 Gigabit) my guess is that the compression and decompression overhead is small compared with the time savings due to high compression ratios. If people would like to see these numbers to help make a decision I can take a closer look.
>>>>>>>>
>>>>>>>> As far as what Micah said about having a limited number of compressors: I would be in favor of having just LZ4 and ZSTD.
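[Editor's note] The compression-ratio percentages in the summary above are space savings, i.e. the fraction of bytes eliminated. A minimal check of the definition, with zlib standing in for the LZ4/ZSTD codecs used in the benchmarks:

```python
import zlib

def space_savings(raw: bytes, compressed: bytes) -> float:
    """Savings as quoted in the thread: fraction of bytes eliminated,
    so ~0.90 means the compressed file is one tenth the original size."""
    return 1.0 - len(compressed) / len(raw)

# Highly repetitive data compresses very well, loosely analogous to the
# low-cardinality columns in the benchmark datasets
raw = b"0123456789" * 100_000
savings = space_savings(raw, zlib.compress(raw, level=1))
assert savings > 0.9
```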
>>>>>>>> It seems anecdotally that these outperform Snappy in most real-world scenarios and generally have > 1 GB/s decompression performance. Some Linux distributions (Arch at least) have already started adopting ZSTD over LZMA or GZIP [1].
>>>>>>>>
>>>>>>>> - Wes
>>>>>>>>
>>>>>>>> [1]: https://www.archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/
>>>>>>>>
>>>>>>>> On Fri, Mar 6, 2020 at 8:42 AM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Wes,
>>>>>>>>>
>>>>>>>>> Thanks a lot for the additional information. Looking forward to seeing the good results from your experiments.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Liya Fan
>>>>>>>>>
>>>>>>>>> On Thu, Mar 5, 2020 at 11:42 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I see, thank you.
>>>>>>>>>>
>>>>>>>>>> For such a scenario, implementations would need to define a "UserDefinedCodec" interface to enable codecs to be registered from third-party code, similar to what is done for extension types [1].
>>>>>>>>>>
>>>>>>>>>> I'll update this thread when I get my experimental C++ patch up to see what I'm thinking, at least for the built-in codecs we have like ZSTD.
>>>>>>>>>>
>>>>>>>>>> [1]: https://github.com/apache/arrow/blob/apache-arrow-0.16.0/docs/source/format/Columnar.rst#extension-types
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 5, 2020 at 7:56 AM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Wes,
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot for your further clarification.
>>>>>>>>>>>
>>>>>>>>>>> Some of my preliminary thoughts:
>>>>>>>>>>>
>>>>>>>>>>> 1. We assign a unique GUID to each pair of compression/decompression strategies. The GUID is stored as part of the Message.custom_metadata.
>>>>>>>>>>> When receiving the GUID, the receiver knows which decompression strategy to use.
>>>>>>>>>>>
>>>>>>>>>>> 2. We serialize the decompression strategy and store it in the Message.custom_metadata. The receiver can decompress data after deserializing the strategy.
>>>>>>>>>>>
>>>>>>>>>>> Method 1 is generally used in static strategy scenarios, while method 2 is generally used in dynamic strategy scenarios.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Liya Fan
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 4, 2020 at 11:39 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Okay, I guess my question is how the receiver is going to be able to determine how to "rehydrate" the record batch buffers.
>>>>>>>>>>>>
>>>>>>>>>>>> What I've proposed amounts to the following:
>>>>>>>>>>>>
>>>>>>>>>>>> * UNCOMPRESSED: the current behavior
>>>>>>>>>>>> * ZSTD/LZ4/...: each buffer is compressed and written with an int64 length prefix
>>>>>>>>>>>>
>>>>>>>>>>>> (I'm close to putting up a PR implementing an experimental version of this that uses Message.custom_metadata to transmit the codec, so this will make the implementation details more concrete.)
>>>>>>>>>>>>
>>>>>>>>>>>> So in the USER_DEFINED case, how will the library know how to obtain the uncompressed buffer? Is some additional metadata structure required to provide instructions?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 4, 2020 at 8:05 AM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Wes,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am thinking of adding an option named "USER_DEFINED" (or something similar) to the enum CompressionType in your proposal. IMO, this option should be used primarily in Flight.
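[Editor's note] Liya Fan's "method 1" above (a GUID identifying a registered codec pair) can be sketched as a simple registry keyed by the GUID carried in Message.custom_metadata. None of these names are Arrow APIs; zlib stands in for a user-defined codec.

```python
import zlib

# Hypothetical user-defined codec registry: the GUID announced in the
# message metadata selects the compression/decompression pair.
_codecs = {}

def register_codec(guid, compress, decompress):
    """Third-party code registers its strategy under a unique GUID."""
    _codecs[guid] = (compress, decompress)

def decompress_body(guid, body):
    """The receiver dispatches on the GUID it received with the message."""
    _compress, decompress = _codecs[guid]
    return decompress(body)

register_codec("example-guid-0001", zlib.compress, zlib.decompress)
body = zlib.compress(b"record batch body bytes")
assert decompress_body("example-guid-0001", body) == b"record batch body bytes"
```

This also makes Wes's follow-up question concrete: a receiver that has never seen "example-guid-0001" registered has no way to rehydrate the buffers, which is why some negotiation or additional metadata would be needed.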
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Liya Fan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 4, 2020 at 11:12 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 3, 2020, 8:11 PM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sure. I agree with you that we should not overdo this. I am wondering if we should provide an option to allow users to plug in their customized compression strategies.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you provide a patch showing changes to Message.fbs (or Schema.fbs) that make this idea more concrete?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Liya Fan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Mar 3, 2020 at 9:47 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Mar 3, 2020, 7:36 AM Fan Liya <liya.fa...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am so glad to see this discussion, and I am willing to provide help from the Java side.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In the proposal, I see the support for basic compression strategies (e.g. gzip, snappy). IMO, applying a single basic strategy is not likely to achieve a performance improvement for most scenarios. The optimal compression strategy is often obtained by composing basic strategies and tuning parameters.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I hope we can support such highly customized compression strategies.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think very much beyond trivial one-shot buffer-level compression is probably out of the question for addition to the current "RecordBatch" Flatbuffers type, because the additional metadata would add undesirable bloat (which I would be against). If people have other ideas it would be great to see exactly what you are thinking as far as changes to the protocol files.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'll try to assemble some examples to show the before/after results of applying the simple strategy.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Liya Fan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Mar 3, 2020 at 8:15 PM Antoine Pitrou <anto...@python.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If we want to use an HTTP header, it would be more of an Accept-Encoding header, no?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In any case, we would have to put non-standard values there (e.g. lz4), so I'm not sure how desirable it is to repurpose HTTP headers for that, rather than add some dedicated field to the Flight messages.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Antoine.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 03/03/2020 at 12:52, David Li wrote:
>>>>>>>>>>>>>>>>>>> gRPC supports headers, so for Flight we could send essentially an Accept header and perhaps a Content-Type header.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Mar 2, 2020, 23:15 Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Wes,
>>>>>>>>>>>>>>>>>>>> A few thoughts on this. In general, I think it is a good idea. But before proceeding, I think the following points are worth discussing:
>>>>>>>>>>>>>>>>>>>> 1. Does this actually improve throughput/latency for Flight? (I think you mentioned you would follow up with benchmarks.)
>>>>>>>>>>>>>>>>>>>> 2. I think we should limit the number of supported compression schemes to only 1 or 2. I think the criteria for selection are speed and native implementations available across the widest possible set of languages. As far as I can tell, zstd only has bindings in Java via JNI, but my understanding is it is probably the right type of compression for our use-cases. So I think zstd + potentially 1 more.
>>>>>>>>>>>>>>>>>>>> 3. Commitment from someone on the Java side to implement this.
>>>>>>>>>>>>>>>>>>>> 4.
>>>>>>>>>>>>>>>>>>>> This doesn't need to be coupled with this change per se, but for something like Flight it would be good to have a standard mechanism for negotiating server/client capabilities (e.g. the client doesn't support compression, or only supports a subset).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Micah
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Mar 1, 2020 at 1:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Mar 1, 2020 at 3:14 PM Antoine Pitrou <anto...@python.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 01/03/2020 at 22:01, Wes McKinney wrote:
>>>>>>>>>>>>>>>>>>>>>>> In the context of a "next version of the Feather format" ARROW-5510 (which is consumed only by Python and R at the moment), I have been looking at compressing buffers using fast compressors like ZSTD when writing the RecordBatch bodies. This could be handled privately as an implementation detail of the Feather file, but since ZSTD compression could improve throughput in Flight, for example, I thought I would bring it up for discussion.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I can see two simple compression strategies:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> * Compress the entire message body in one shot, writing the result out with an 8-byte int64 prefix indicating the uncompressed size
>>>>>>>>>>>>>>>>>>>>>>> * Compress each non-zero-length constituent Buffer prior to writing to the body (and using the same uncompressed-length prefix when writing the compressed buffer)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The latter strategy is preferable for scenarios where we may project out only a few fields from a larger record batch (such as reading from a memory-mapped file).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Agreed. It may also allow using different compression strategies for different kinds of buffers (for example a bytestream splitting strategy for floats and doubles, or a delta encoding strategy for integers).
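[Editor's note] The per-buffer framing Wes proposes (uncompressed-length prefix, then compressed bytes) can be sketched as follows. zlib stands in for ZSTD/LZ4, and the little-endian byte order of the int64 prefix is an assumption for illustration, not taken from the strawman itself.

```python
import struct
import zlib

def write_buffer(raw: bytes) -> bytes:
    """Frame one buffer: an 8-byte int64 giving the uncompressed
    length, followed by the compressed bytes."""
    return struct.pack("<q", len(raw)) + zlib.compress(raw)

def read_buffer(framed: bytes) -> bytes:
    """Inverse: read the prefix, decompress the remainder."""
    (uncompressed_len,) = struct.unpack_from("<q", framed, 0)
    raw = zlib.decompress(framed[8:])
    assert len(raw) == uncompressed_len  # sanity-check the prefix
    return raw

buf = b"\x01\x00" * 4096  # e.g. a validity or values buffer
assert read_buffer(write_buffer(buf)) == buf
```

Knowing the uncompressed length up front lets the reader allocate the output buffer exactly once, which matters for the projection use case Wes describes: only the framed buffers of the selected fields need to be touched.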
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If we wanted to allow for different compression to apply to different buffers, I think we will need a new Message type, because this would inflate metadata sizes in a way that is not likely to be acceptable for the current uncompressed use case.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Here is my strawman proposal:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/arrow/compare/master...wesm:compression-strawman
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Implementation could be accomplished by one of the following methods:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> * Setting a field in Message.custom_metadata
>>>>>>>>>>>>>>>>>>>>>>> * Adding a new field to Message
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think it has to be a new field in Message. Making it an ignorable metadata field means non-supporting receivers will decode and interpret the data wrongly.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Antoine.
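[Editor's note] The bytestream-splitting idea Antoine raises a few messages up (a per-buffer-kind strategy for floats and doubles) can be illustrated in miniature. This is a toy sketch with zlib standing in for a real codec, not Arrow code: regrouping byte 0 of every value, then byte 1, and so on turns the nearly-constant high-order bytes of slowly varying floats into long runs that compress far better.

```python
import struct
import zlib

def byte_split(raw: bytes, width: int = 8) -> bytes:
    """Bytestream splitting: regroup all values' byte 0, then all
    values' byte 1, etc. (width = bytes per value, 8 for doubles)."""
    return b"".join(raw[i::width] for i in range(width))

# Slowly varying doubles, like a sensor or price column
values = [1000.0 + i * 0.001 for i in range(10_000)]
raw = struct.pack("<10000d", *values)

plain = zlib.compress(raw)
split = zlib.compress(byte_split(raw))
assert len(split) < len(plain)  # splitting helps on this kind of data
```

This is exactly the kind of buffer-specific strategy that, per the discussion above, would require richer metadata than a single codec field on Message can express.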