Thanks for the pointer. I didn't realize that was already implemented. I think the 3-byte header you are talking about is for the compressed size (and the "isOriginal" flag). I was asking about adding the uncompressed size as part of the compressor-specific compressed data block (like Snappy has). However, since Apache ORC already implements it, the compressed data format has already been decided.
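
For reference, here is how I understand that 3-byte header, as a rough Python sketch. The function names are mine and not part of ORC. I am assuming the layout described in the ORC format documentation: a little-endian 3-byte value holding (chunk length * 2 + isOriginal). Note that it records only the length of the chunk as stored, not the uncompressed size.

```python
from typing import Tuple

# Hypothetical helper names, for illustration only (not ORC APIs).
# Assumed layout: 3 bytes, little endian, value = (length << 1) | isOriginal.

MAX_CHUNK_LENGTH = (1 << 23) - 1  # 24 bits total, 1 bit taken by the flag


def encode_chunk_header(chunk_length: int, is_original: bool) -> bytes:
    """Pack the stored chunk length and the isOriginal flag into 3 bytes."""
    if not 0 <= chunk_length <= MAX_CHUNK_LENGTH:
        raise ValueError("chunk length does not fit in 23 bits")
    value = (chunk_length << 1) | int(is_original)
    return value.to_bytes(3, "little")


def decode_chunk_header(header: bytes) -> Tuple[int, bool]:
    """Return (chunk_length, is_original) from a 3-byte header."""
    if len(header) != 3:
        raise ValueError("header must be exactly 3 bytes")
    value = int.from_bytes(header, "little")
    return value >> 1, bool(value & 1)
```

With this layout, a chunk that compressed to 100,000 bytes gets the header 0x40 0x0d 0x03, and a 5-byte chunk stored uncompressed gets 0x0b 0x00 0x00.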
On Wed, Feb 7, 2018 at 9:58 PM, Owen O'Malley <[email protected]> wrote:
> In general this is probably better on dev@orc, but this works. ORC-77
> (62fe9504b) implemented the LZ4 codec using airlift. The structure is the
> same as the other codecs and it always uses a 3 byte header (#2).
