On 06/07/2020 at 17:57, Steve Kim wrote:
> The Parquet format specification is ambiguous about the exact details of
> LZ4 compression. However, the *de facto* reference implementation in Java
> (parquet-mr) uses the Hadoop LZ4 codec.
>
> I think that it is important for Parquet c++ to have compatibility and
> feature parity with parquet-mr when possible. I prefer to change the
> LZ4 implementation in Parquet c++ to match the Hadoop LZ4 implementation
> that is used by parquet-mr (
> https://issues.apache.org/jira/browse/PARQUET-1878). I think that this
> change will be quick and easy. I have an intern under my supervision who is
> available to work on it full time, starting immediately. Please let me know
> if we ought to proceed.

Would that keep compatibility with existing files produced by Parquet C++?

Regards

Antoine.