Jun 11, 2025 | Apache Parquet Community Sync <https://www.google.com/calendar/event?eid=MmZvYnM1cXRoOWQ2aHVwbWRjcTF1azZpdmFfMjAyNTA1MjhUMTcwMDAwWiBqdWxpZW4ubGVkZW1AbQ>
Attendees: Apache Parquet Community Sync <apache-parquet-community-s...@googlegroups.com>
- Micah Kornfield: Databricks
- Talat Uyarer: Google
- Martin Prammer: CMU, how to coordinate on the VariantType work
- Aditya Bhatnagar: CMU
- Neil Chao: Snowflake
- Prateek Gaur: Snowflake
- Martin Loncaric: Jane Street, encodings
- Rok Mihevc: G-Research
- Sandeep Gottimukkala:

Agenda:
- Update on Variants in Rust
- Moderator for next meeting

Notes:
- Is there a centralized location to discuss design goals for VariantType?
  - There is the spec in parquet-format, but it is not really helpful for coordination
  - Lots of people are working on variant (Arrow, Parquet)
    - Databricks, Snowflake, CMU
  - Shared framework for variant builder APIs across frameworks?
    - Rust draft
  - More focused forum for coordinating variant - Micah
    - Maybe arrow rust has a Slack/Discord?
    - We could consider creating a Parquet Discord
    - Use more emails / go fishing in email
  - The only really central thing is the spec for the variant type
  - No big conflicts have happened yet, but there is a lot of email juggling
  - Java has variant mostly implemented; C++ and Rust are coming along quickly
- Encodings
  - Parquet new features <https://docs.google.com/document/d/1qGDnOyoNyPvcN4FCRhbZGAvp0SfewlWo-WVsai5IKUo/edit?tab=t.0>
    - Micah prepared a doc on the process for improving Parquet
  - Datasets - we might want to drop some or add more based on community feedback
  - Multiple implementations - lots of pros and cons
    - Concerns about JVM performance. We should benchmark
  - Pluggable encodings
    - Wasm - could be downloaded from a trusted source or embedded in the file, but would be a security risk and/or increase file size
    - Custom encoding - Parquet somehow determines which library to use at runtime. Would need to build an extension mechanism into each applicable language
  - Building out benchmarks
    - Martin P has some stuff already built?
    - Martin L has a benchmark tool that encompasses Parquet as well (though it doesn’t measure every facet Parquet would want)
- Extension types (how would these work in the context of DecFloat?)