Martin Loncaric created PARQUET-2132:
----------------------------------------

             Summary: Support Quantile Compression q_compress column codec
                 Key: PARQUET-2132
                 URL: https://issues.apache.org/jira/browse/PARQUET-2132
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-cpp, parquet-format, parquet-mr
            Reporter: Martin Loncaric


Quantile Compression (https://github.com/mwlon/quantile-compression) is a
recent but stable compression algorithm for numerical sequences. On average, it
achieves a compression ratio at least 35% higher than the next-best codec
(zstd) at the same compression time, and its decompression speed is close to
that of zstd. Adding q_compress as a column codec for all numerical columns
could substantially reduce the size of most Parquet files.

q_compress is implemented in Rust, which has good interop with C++ and can run
on the JVM via JNI (e.g. https://github.com/pancake-db/pancake-scala-client).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
