chairmank commented on pull request #7789: URL: https://github.com/apache/arrow/pull/7789#issuecomment-660275200

What do we think about making the `Decompress` method of the new Hadoop LZ4 codec fall back to the alternate implementation if it fails to decompress? Then the new Hadoop LZ4 codec could be used unconditionally, without trying to guess the Parquet writer version from the file metadata. There would be a performance cost when attempting to read data pages that were written with the incompatible LZ4 codec, but this may be acceptable because:

* after this change, only a minority of Parquet files will have this incompatible LZ4 compression
* LZ4 tends to error quickly when it tries and fails to read the first sequence (https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md#compressed-block-format)
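To make the proposed control flow concrete, here is a minimal Python sketch of the try-then-fall-back pattern (the real codec lives in Arrow's C++ code; the function names and the `b"HD"` framing check below are purely hypothetical stand-ins, not the actual Hadoop LZ4 frame format):

```python
# Hypothetical stand-in decompressors for illustration only.
def hadoop_framed_decompress(data: bytes) -> bytes:
    # Pretend the Hadoop-framed format starts with a b"HD" prefix;
    # a malformed frame raises quickly, as LZ4 tends to do on a bad
    # first sequence.
    if not data.startswith(b"HD"):
        raise ValueError("not a Hadoop-framed block")
    return data[2:]

def raw_block_decompress(data: bytes) -> bytes:
    # Stand-in for decompressing a raw LZ4 block.
    return data

def decompress_with_fallback(data: bytes) -> bytes:
    """Try the Hadoop-framed path first; on failure, fall back
    to the alternate (raw block) implementation."""
    try:
        return hadoop_framed_decompress(data)
    except ValueError:
        return raw_block_decompress(data)
```

The performance cost mentioned above is the wasted attempt in the `try` branch for files written with the incompatible codec, which is bounded by how quickly the first decompressor rejects malformed input.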
What do we think about making the Decompress method of the new hadoop lz4 codec fall back to alternate implementation if it fails to decompress? Then the new hadoop lz4 codec could be used unconditionally, without trying to guess Parquet writer version from the file metadata. There would be a performance cost when attempting to read data pages that were written with incompatible lz4 codec. But this may be acceptable, because * after this change, only a minority of Parquet files will have this incompatible lz4ccompression * lz4 tends to error quickly when it tries and fails to read the first sequence (https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md#compressed-block-format) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org