Hi everyone,

Here’s my draft for the September Iceberg board report. Let me know if
you’d like to add anything!

I know that JB wanted to add conference talks last time, but I’m not aware
of any that have happened this quarter. If you’ve given a talk recently,
please let me know!

Ryan

Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

Project Status:

Current project status: Ongoing
Issues for the board: none

Membership Data:

Apache Iceberg was founded 2020-05-19 (3 years ago)
There are currently 24 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:

   - No new PMC members. Last addition was Szehon Ho on 2023-04-20.
   - No new committers. Last addition was Amogh Jahagirdar on 2023-04-25.

Project Activity:

Releases:

   - PyIceberg 0.4.0 was released on 2023-07-23
   - 1.3.1 was released on 2023-07-25

Java:

   - Preparing for a 1.4.0 release in Sept/Oct
   - Added dependency bundles for AWS, GCP, and Azure
   - Added Azure FileIO implementation
   - Added API for multi-table commits
   - Performance optimizations for delete file scan planning
   - Spark: Implemented adaptive split sizing
   - Spark: Implemented function pushdown in v2 expressions
   - Flink: Added bucketing only key-by strategy
   - Build: Updated to Gradle version catalog
   - Making progress on the reference implementation of common views
   - Continuing work on table encryption

Python:

   - 0.5.0 rc1 vote is under way
   - Added support for serverless environments
   - Implemented schema evolution
   - Moved to Pydantic v2
   - Added support for positional deletes
   - Substantially improved Avro read performance
   - Added conversion from Parquet to Iceberg schemas
   - Added support for FSSpec and HDFS data
   - Added SQL filter parsing

Rust:

   - Created a repository for the Rust implementation, iceberg-rust
   - 25 PRs merged
   - Implemented base table metadata (e.g., types, transforms)
   - Implemented visitors for working with nested structures
   - Added Avro/Iceberg schema conversion
   - Added build tooling

Go:

   - Created a repository for the Go implementation, iceberg-go
   - Added schema and types

Community Health:

The largest development in the community is the addition of the Rust and Go
repositories, which is reflected in the increase in code contributors this
quarter. The new implementations will also lead to new committers and PMC
members. The community has had good discussions about how to manage
contributions, both to build confidence in the implementations and to help
new contributors become familiar with the way the Apache community operates,
including ASF requirements like license documentation.

Two community metrics show decreases. Dev list traffic tends to vary because
of how the community uses the dev list, which is mostly for large design
discussions. The number of issues closed was also lower than normal and is
not expected to fluctuate. We will take a look and find out what caused the
difference.
-- 
Ryan Blue
Tabular
