Hi everyone,

Here’s my draft for the September Iceberg board report. Let me know if you’d like to add anything!
I know that JB wanted to add conference talks last time, but I’m not aware of any that have happened this quarter. If you’ve given a talk recently, please let me know!

Ryan

Description:

Apache Iceberg is a table format for huge analytic datasets that is designed for high performance and ease of use.

Project Status:

Current project status: Ongoing
Issues for the board: none

Membership Data:

Apache Iceberg was founded 2020-05-19 (3 years ago)
There are currently 24 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:
- No new PMC members. Last addition was Szehon Ho on 2023-04-20.
- No new committers. Last addition was Amogh Jahagirdar on 2023-04-25.

Project Activity:

Releases:
- PyIceberg 0.4.0 was released on 2023-07-23
- 1.3.1 was released on 2023-07-25

Java:
- Preparing for a 1.4.0 release in Sept/Oct
- Added dependency bundles for AWS, GCP, and Azure
- Added Azure FileIO implementation
- Added API for multi-table commits
- Performance optimizations for delete file scan planning
- Spark: Implemented adaptive split sizing
- Spark: Implemented function pushdown in v2 expressions
- Flink: Added bucketing only key-by strategy
- Build: Updated to Gradle version catalog
- Making progress on the reference implementation of common views
- Continuing work on table encryption

Python:
- 0.5.0 rc1 vote is under way
- Added support for serverless environments
- Implemented schema evolution
- Moved to Pydantic v2
- Added support for positional deletes
- Substantially improved Avro read performance
- Added conversion from Parquet to Iceberg schemas
- Added support for FSSpec and HDFS data
- Added SQL filter parsing

Rust:
- Created a repository for the Rust implementation, iceberg-rust
- 25 PRs merged
- Implemented base table metadata (e.g., types, transforms)
- Implemented visitors for working with nested structures
- Added Avro/Iceberg schema conversion
- Added build tooling

Go:
- Created a repository for the Go implementation, iceberg-go
- Added schema and types

Community Health:

The largest development in the community is the addition of the Rust and Go repositories, which is shown in the increase in code contributors this quarter. The new implementations will also lead to new committers and PMC members. The community has had good discussions about how to manage contributions, both to build confidence in the implementations and to help new contributors become familiar with the way the Apache community operates, along with ASF requirements like license documentation.

Two community metrics show decreases. Dev list traffic tends to vary because of how the community uses the dev list, that is, mostly for large design discussions. The number of issues closed was also lower than normal and is not expected to fluctuate. We will take a look and see what the difference is.

--
Ryan Blue
Tabular