[
https://issues.apache.org/jira/browse/ARROW-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458249#comment-17458249
]
Antoine Pitrou commented on ARROW-15074:
----------------------------------------
Hmm, I think we should make the spec more precise about this.
Normally, you don't need to emit multiple frames to support streaming
compression. At least the LZ4 C API allows streaming compression inside a
single frame. Also, emitting multiple frames is probably worse for compression
efficiency (because "Each frame is considered independent" as per the [LZ4
spec|https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md#general-structure-of-lz4-frame-format]).
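
To make the "one frame versus several frames" distinction concrete, here is a minimal sketch, using only the Python standard library, that walks an LZ4-framed buffer and counts the frames it contains. It follows the layout described in the LZ4 frame format document linked above (magic number, frame descriptor, data blocks, EndMark, optional checksums). The names count_lz4_frames and make_stored_frame are hypothetical helpers written for this illustration; they are not part of Arrow or of the LZ4 library. The sketch does not verify any checksum and does not decompress anything, so it is only a structural check, not a validator.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204
SKIPPABLE_MAGIC_MIN = 0x184D2A50
SKIPPABLE_MAGIC_MAX = 0x184D2A5F


def count_lz4_frames(buf: bytes) -> int:
    """Count LZ4 frames in buf by walking the frame structure.

    Hypothetical helper: checksums are skipped, not verified, and
    skippable frames are stepped over without being counted.
    """
    pos = 0
    frames = 0
    while pos < len(buf):
        (magic,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        if SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
            # Skippable frame: 4-byte size followed by user data.
            (size,) = struct.unpack_from("<I", buf, pos)
            pos += 4 + size
            continue
        if magic != LZ4_FRAME_MAGIC:
            raise ValueError(f"unexpected magic number 0x{magic:08X}")
        flg = buf[pos]
        pos += 2              # FLG and BD bytes
        if flg & 0x08:        # C.Size flag: 8-byte content size follows
            pos += 8
        if flg & 0x01:        # DictID flag: 4-byte dictionary ID follows
            pos += 4
        pos += 1              # HC, the header checksum byte (not verified)
        while True:
            (block_size,) = struct.unpack_from("<I", buf, pos)
            pos += 4
            if block_size == 0:            # EndMark closes the frame
                break
            pos += block_size & 0x7FFFFFFF  # high bit marks an uncompressed block
            if flg & 0x10:                  # B.Checksum flag: per-block checksum
                pos += 4
        if flg & 0x04:        # C.Checksum flag: content checksum after EndMark
            pos += 4
        frames += 1
    return frames


def make_stored_frame(payload: bytes) -> bytes:
    """Build a structurally shaped frame holding one uncompressed block.

    Illustration only: the HC byte is a placeholder, not a real checksum,
    so an actual LZ4 decoder would be expected to reject this frame.
    """
    header = struct.pack("<IBB", LZ4_FRAME_MAGIC, 0x60, 0x40) + b"\x00"
    block = struct.pack("<I", 0x80000000 | len(payload)) + payload
    return header + block + struct.pack("<I", 0)


single = make_stored_frame(b"hello")
double = make_stored_frame(b"hello") + make_stored_frame(b"world")
print(count_lz4_frames(single), count_lz4_frames(double))
```

A buffer produced by streaming compression inside a single frame would contain several data blocks between one header and one EndMark, so a walker like this would still report one frame for it. A buffer made by concatenating independently compressed frames reports more than one.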
> [C++] Support multiple frames in LZ4?
> -------------------------------------
>
> Key: ARROW-15074
> URL: https://issues.apache.org/jira/browse/ARROW-15074
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Jorge Leitão
> Priority: Major
> Attachments: b.arrow
>
>
> When reading an Arrow file whose buffers are LZ4-compressed with multiple
> frames, we get
> {code:java}
> OSError: Lz4 compressed input contains more than one frame
> {code}
> Attached is an example of such a file, which can be opened with
> {code:java}
> import pyarrow as pa
> import pyarrow.ipc
> with pa.ipc.open_file("b.arrow") as reader:
>     print(reader.get_batch(0))
> {code}
> that fails with the error above.
> The LZ4 frame format allows multiple concatenated frames, and the Arrow spec
> does not state that a buffer must contain only one frame.