Hey Iceberg Nation,

Everyone is welcome to attend syncs. Subscribe to this calendar <https://calendar.google.com/calendar/embed?src=3905d492f1b450ba0712f2ae6afa76eb757f13d85220cc03aa4527885adc5629%40group.calendar.google.com&ctz=Asia%2FShanghai> to receive a notification.

Note: This meeting note is backdated as I forgot to post it here earlier.

2023-07-19 (Meeting Recording <https://www.youtube.com/watch?v=BJwgLWrCIHI> ⭕ )

Highlights
- PyIceberg 0.4.0 is out
- Python Avro reads are 18% faster
- Python concurrency updated for AWS Lambda
- Added Avro writes to Python
- Fixed Spark deleteWhere with WAP branch
- Added registerTable to REST catalog
- FLIP-27 Flink source switched to JSON parser for FileScanTask

Releases
- Please vote on 1.3.1
- Java 1.4.0
  - Targeting August for RC
  - Anton volunteered to RM
  - Distributed planning
  - Row-level operation updates: MoR schema pruning, etc.
  - Dynamic pruning stretch goal, mainly targeting MoR
- Python 0.5.0

Discussion
- View API issues (https://github.com/apache/iceberg/pull/7992)
- Should Projections take in schema vs spec?
  - Are there issues evaluating filters with time travel because we use the wrong schema?
  - Came up while looking at this issue: https://github.com/apache/iceberg/issues/7774
- Gradle version catalog support
- Applying spotless for Scala code
- Add Golang Iceberg to Repo?

AI-generated chapter summaries:

0:00 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=0s> Chapter 1
The team discussed updates and progress on both the Python and Java sides, including new features, performance improvements, and upcoming releases. They also talked about the View API and the need to deprecate and move certain interfaces.

10:40 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=640s> Chapter 2
The team discussed the issue of generated classes appearing in the API package and decided to break those classes and improve the generation process in the future.
They also discussed the problem of projections binding expressions to the schema and agreed that passing the schema to the projections would be a better solution.

21:37 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=1297s> Chapter 3
Eduard raised awareness about updating the dependency versioning plugin and ensuring compatibility with Dependabot. Anton expressed concerns about applying Spotless to Scala code due to differences with Spark, but agreed to revisit the topic once Spark 3.5 is released. Matt proposed a Golang implementation of Iceberg and discussed the possibility of integrating it into the main repository, with separate versioning and considerations for release scripts and CI.

31:52 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=1912s> Chapter 4
Matt and Steven discussed the process of moving the code into the foundation, including licensing and practical issues. They decided to start with small PRs to get more eyes on the code and build understanding, with Jacob offering to assist.

42:10 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=2530s> Chapter 5
Matt and Rusty discussed the need for a common representation of tasks in Arrow and the desire to create a Substrait plan for Iceberg scans with pushdown and deletes. They aimed to simplify the integration of different languages and make querying Iceberg tables more efficient.

51:29 <https://www.youtube.com/watch?v=BJwgLWrCIHI&t=3089s> Chapter 6
Matt, Fokko, and others discussed the benefits of representing plans as Substrait plans and the need for correct column projection in Arrow. They also mentioned the possibility of opening an issue to coordinate on implementing Iceberg column resolution in C++.