[GitHub] [parquet-format] pitrou commented on pull request #168: PARQUET-1996: [Format] Add interoperable LZ4 codec, deprecate existing LZ4 codec

2021-03-11 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-797009554 I've opened https://issues.apache.org/jira/browse/PARQUET-1998 for the C++ implementation. @ggershinsky Can I let you do the same for the Java implementation?

2021-03-11 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-796877818 It does have a couple [other things](https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md), including a magic number and an optional checksum (computed using

2021-03-11 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-796687561 Ok, distilling here the feedback from Yann Collet (the author of LZ4): the frame format is beneficial as it provides a standard encoding for the compressed and uncompressed

2021-03-10 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-795853899 Thanks for the reference. I'll wait a bit in case I get some feedback from the LZ4 mailing list. In any case, what is the procedure for merging an update in the

2021-03-10 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-795390108 I've also sent [a message](https://groups.google.com/g/lz4c/c/d6V12JKr5Fw) to the LZ4 mailing list to gather their feedback on this proposal.

2021-03-10 Thread GitBox
pitrou commented on pull request #168: URL: https://github.com/apache/parquet-format/pull/168#issuecomment-795136194

> What do you think about adding links to the codec specifications directly instead of to the C/C++ implementations?

I was not aware that those even existed. I'll add