Hi all,

Today Iceberg has five language implementations (Java, Python, Rust, Go, C++), each in its own repository. We naturally see divergence in how they interpret the spec, even though we take great care to write it as expressively as possible.

To improve on this, I'd like to propose a shared fixture repository, modeled on arrow-testing and parquet-testing. The repository would focus on hosting fixtures for the edge cases where that divergence shows up. These fixtures would:

- act as additive checks alongside the subprojects' existing test frameworks,
- give the community a single place to anchor spec discussions about literal values, and
- make it cheaper for subprojects to validate their interpretation of the spec against a known set of values.

As a proof of concept, I seeded a fork with existing, known issues; integrating the tests also surfaced new ones. Here is what that looks like in my forks:

- iceberg-testing (proposed fixture repository): https://github.com/sungwy/iceberg-testing
- pyiceberg against it: https://github.com/sungwy/iceberg-python/pull/1
- iceberg-rust against it: https://github.com/sungwy/iceberg-rust/pull/2

Let me know your thoughts. For those interested, here is a detailed doc [1] where we can discuss the specifics of the project scope and layout.

Thanks,
Sung

[1] https://docs.google.com/document/d/1diwxjG24IMW9jSkkyG8fet1YFDorDU1PNOFJ6tSWEWQ/edit?tab=t.0
