felipecrv commented on PR #35: URL: https://github.com/apache/arrow-experiments/pull/35#issuecomment-2345011710
I added the option to dictionary-encode a column in the compression example. Results are interesting. From not dictionary-encoded to sharing the same dictionary of 60 strings in the `ticker` column:

```
                       plain   dict   delta
output.arrows           941M   803M   -138M
output.arrows.gz        344M   247M    -97M
output.arrows.zstd      336M   205M   -131M
output.arrows.brotli     35M    39M     +4M
```

Interestingly, brotli compresses better when the data is not dictionary encoded.
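For readers unfamiliar with the technique, here is a minimal sketch of what dictionary-encoding a string column means and how one might compare sizes before and after compression. It is not the code from the PR: the ticker names, the row count, and the helper functions (`dictionary_encode`, `plain_bytes`, `dictionary_bytes`) are hypothetical, the byte layout only loosely imitates Arrow's offsets-plus-data string layout, and it uses `zlib` from the standard library alone, so it says nothing about zstd or brotli.

```python
import random
import struct
import zlib


def dictionary_encode(values):
    """Return (dictionary, indices) such that dictionary[indices[i]] == values[i]."""
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original column from the dictionary and the indices."""
    return [dictionary[i] for i in indices]


def plain_bytes(values):
    """Serialize strings as int32 offsets followed by the concatenated UTF-8 data."""
    encoded = [value.encode("utf-8") for value in values]
    offsets = [0]
    for item in encoded:
        offsets.append(offsets[-1] + len(item))
    return struct.pack(f"<{len(offsets)}i", *offsets) + b"".join(encoded)


def dictionary_bytes(dictionary, indices):
    """Serialize the dictionary once, then one int32 index per row."""
    return plain_bytes(dictionary) + struct.pack(f"<{len(indices)}i", *indices)


# Hypothetical data: 60 made-up ticker symbols repeated over 60,000 rows.
rng = random.Random(0)
tickers = [f"TICK{i:02d}" for i in range(60)]
column = [tickers[i % len(tickers)] for i in range(60_000)]
rng.shuffle(column)

dictionary, indices = dictionary_encode(column)
plain = plain_bytes(column)
encoded = dictionary_bytes(dictionary, indices)

# Print raw and zlib-compressed sizes for both representations.
for label, payload in (("plain", plain), ("dictionary", encoded)):
    print(label, len(payload), len(zlib.compress(payload)))
```

The data here is synthetic and covers a single column, so the printed sizes are not expected to match the table above, where the files also contain other columns.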
