Attendees and topics:
- Micah: Databricks, follow up on guidelines for new encodings
- Andrew Lamb: InfluxData, Rust Parquet maintainer, discuss Variant binary
- Kenny: HyParquet, JS
- Dewey: Wherobots. C++ geometry, Feng Java impl. Update on Geo
- Adam: G-Research OSS, ParquetSharp. Work with Rok, key management tool API to arrow-rs
- Talat: Google BigQuery
- Raul: QuantStack, Arrow Parquet C++
- Rok: KMS question for encryption, Variant
- Fokko: DB
- Gabor: Dremio, Variant, parquet-java/avro module CVE
- Dan: Databricks, Variant, geo types (Iceberg dependency)
- Martin: CMU, Variant in Rust
- Jiaying: CMU, Rust Arrow, Parquet
- Aihua: Snowflake, Variant
- Gene: Databricks, Variant
- Steve Loughran

Notes
- parquet-java/avro
  - We need a proper way to limit risk with reflection
  - Need Avro expertise
  - Option to remove the functionality or force opt-in via a system property
  - dev-list: https://lists.apache.org/thread/c91s61tqkbbrc7xj180xh2rx89yx8pfk
  - Avro GitHub issues:
    - https://github.com/apache/parquet-java/issues/3194
    - https://github.com/apache/parquet-java/issues/3195
- Encryption KMS
  - KMS metadata format is not officially standardised, but it is used in parquet-java and C++/PyArrow
  - Current spec: https://parquet.apache.org/docs/file-format/data-pages/encryption/#43-key-metadata
  - PR to add KMS API to arrow-rs: https://github.com/apache/arrow-rs/pull/7387
  - Possibly maintained externally to start with rather than in arrow-rs
- Update on geometry types
  - 2 PRs being reviewed
  - All inconsistencies between Java and C++ are resolved now
    - Ex: Null
  - Canonical way to represent totally empty things
  - Need Arrow in and out of that
- Variant
  - Rust: https://github.com/apache/arrow-rs/pull/7404
    - See epic https://github.com/apache/arrow-rs/issues/6736
  - Discussion on how to fail early when we have an unknown version of the Variant spec
  - Testing Binary compatibility [Andrew]
    - Made a PR with example Binary variants: https://github.com/apache/parquet-testing/issues/75
  - Existing implementations:
    - Spark can read/write Variant
    - Iceberg implementation to read Variant Binary (Java)
    - Go: https://github.com/apache/arrow-go/pull/344/files
    - Logical type: https://github.com/apache/parquet-java/pull/3072

Action items
- [Gabor and Steve] to follow up on the list on restricting more Avro deserialization.
- Follow up on adding files generated by different implementations of the Variant spec.

On Tue, Apr 15, 2025 at 4:07 PM Julien Le Dem <jul...@apache.org> wrote:

> The next Parquet sync is tomorrow Apr 16th at 10am PT - 1pm ET - 7pm CET
> To join the invite:
> https://calendar.app.google/rCLANWLz1xg69mTL7
> Please contact me to be added to the recurring invite. (every two weeks)
> Everybody is welcome, bring your topic or just listen in.
>
> (Some more details on how the meeting is run:
> https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
>
> Best
> Julien