It might be worth opening an issue with the lz4-java library.  Perhaps
the Java implementation doesn't fully support the LZ4 stream protocol?

Antoine, in this case it looks like Joris is applying the compression
and decompression at the file level, NOT the IPC level.

On Thu, Jan 28, 2021 at 10:01 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Le 28/01/2021 à 17:59, Joris Peeters a écrit :
> > From Python, I'm dumping an LZ4-compressed arrow stream to a file, using
> >
> >     with pa.output_stream(path, compression = 'lz4') as fh:
> >         writer = pa.RecordBatchStreamWriter(fh, table.schema)
> >         writer.write_table(table)
> >         writer.close()
> >
> > I then try reading this file from Java, starting with
> >
> >     var is = new LZ4FrameInputStream(new FileInputStream(path.toFile()));
> >
> > using the lz4-java library. That fails, however, with
>
> Well, that sounds expected.  LZ4 compression in the IPC format does not
> work by compressing the whole stream.  Instead, buffers in the stream
> are compressed individually, while metadata is uncompressed.
>
> So, you needn't wrap the stream with LZ4 yourself.  Instead, just let
> the Java implementation of Arrow handle compression.  It *should* work.
>
> Regards
>
> Antoine.
>
