Hi Andrew, I'm Naohiro, the person Julien has been in touch with. I was planning to attend the sync yesterday but unfortunately missed it because of the time zone difference (I'm in Japan).
Thanks for kicking off this discussion. I'm definitely interested in contributing.

To start with, I'm currently working on a POC in parquet-java to evaluate ALP. While ALP and floating-point compression are my main focus at the moment, I'm also interested in exploring other encoding strategies that could benefit Parquet.

I'm also drafting a proposal in Google Docs, and I'll share the link once it's ready.

I'd love to hear if others are working on similar efforts, especially around floating-point compression, so that we can avoid duplication and potentially collaborate.

On 2025/10/01 18:11:51 Andrew Lamb wrote:
> I would like to start a discussion to help organize and rally anyone
> interested in adding new encodings to Parquet.
>
> I am pretty sure there are many people interested in adding new encodings,
> but there are only a few mentions on the mailing list, such as pcode [1]
> and FSST/ALP/FastLanes [2]. Prateek mentioned on the sync call today
> that he is working on evaluating some potential encodings and hopes to have
> some information to share soon, and Julien mentioned he had spoken to
> someone else who might be doing something similar.
>
> Now that Julien has defined a process to extend the spec[3] I think the
> steps are much clearer.
>
> So, I would like to invite anyone interested in adding new encodings to
> respond and let us know if you are willing to help evaluate new encodings
> and prototype integrations into Parquet implementations?
>
> Andrew
>
>
> [1]: https://lists.apache.org/thread/bdmfcj4g6y1ccd3mfgrp7d43d73s6zf6
> [2]: https://lists.apache.org/thread/s3o9jk0hr942pv6ono4ymnvvj6pfdsdw
> [3]: https://github.com/apache/parquet-format/blob/master/proposals/README.md
>
