[
https://issues.apache.org/jira/browse/ARROW-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975288#comment-15975288
]
Kazuaki Ishizaki commented on ARROW-300:
----------------------------------------
[~wesmckinn] Thank you for your kind and positive comment. I will work on
preparing a proposal (it will take some time, since I also have to prepare a
presentation for GTC).
[~xhochy] IIUC, Parquet is a format for persistent files, while Arrow is an
in-memory format.
What level of proposal do you expect? For example,
* What we want to do (e.g. RLE, Delta-encoding)
* New metadata format to support new compression schemes (new .fbs file)
* Data format for new compression schemes
* Prototype implementation
* others
Also, should the proposal be posted as a separate JIRA entry or as a comment on
this JIRA entry?
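To make the first bullet concrete, here is a minimal Python sketch of the two
schemes named above, run-length encoding (RLE) and delta encoding, applied to a
plain list of integers. The function names are hypothetical and this is not
Arrow code or a proposed buffer layout; it only illustrates the kind of
transformation a proposal would have to specify.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in pairs:
        out.extend([value] * length)
    return out


def delta_encode(values):
    """Keep the first value, then store the difference to each next value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence with a running sum over the deltas."""
    out = []
    total = 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out
```

RLE pays off on columns with long runs of repeated values, while delta encoding
pays off on slowly changing values such as sorted keys or timestamps, because
the stored differences are small.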
> [Format] Add buffer compression option to IPC file format
> ---------------------------------------------------------
>
> Key: ARROW-300
> URL: https://issues.apache.org/jira/browse/ARROW-300
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Format
> Reporter: Wes McKinney
>
> It may be useful, if data is to be sent over the wire, to compress the data
> buffers themselves as they're being written in the file layout.
> I would propose that we keep this extremely simple with a global buffer
> compression setting in the file Footer. Probably only two compressors worth
> supporting out of the box would be zlib (higher compression ratios) and lz4
> (better performance).
> What does everyone think?