hcrosse opened a new issue, #800: URL: https://github.com/apache/iceberg-go/issues/800
### Feature Request / Improvement

[`GetWriteProperties()`](https://github.com/apache/iceberg-go/blob/main/table/internal/parquet_files.go#L203) hardcodes `parquet.WithDataPageVersion(parquet.DataPageV2)` with no way to override it. This causes issues with consumers that don't fully support DataPage V2 (e.g. Snowflake). iceberg-java supports configuring this via [`WriteBuilder.writerVersion()`](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java#L268), but iceberg-go has no equivalent.

I checked a few Iceberg library implementations, and iceberg-go is the only one I found that defaults to DataPage V2:

- [iceberg-java hardcodes](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java#L171) `WriterVersion.PARQUET_1_0`, which produces V1 pages
- pyiceberg doesn't set `data_page_version`, so [PyArrow defaults to V1](https://github.com/apache/iceberg-python/blob/main/pyiceberg/io/pyarrow.py)
- iceberg-rust doesn't set it, so [arrow-rs defaults to V1](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/writer/file_writer/parquet_writer.rs)

I propose that we add a way to configure the DataPage version, similar to iceberg-java's `WriteBuilder.writerVersion()`. I'm happy to discuss the best place to expose this in the API. We should still keep the current default (V2) for backward compatibility.

I'm happy to submit a PR for this if the approach is approved.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
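
To make the proposal above concrete, here is one possible shape for the override, purely as a sketch and not the existing iceberg-go API. It assumes a hypothetical table property key, `write.parquet.data-page-version`, and a local `DataPageVersion` type standing in for the Parquet library's enum so the example stays self-contained. An absent property keeps the current default (V2).

```go
package main

import (
	"fmt"
	"strings"
)

// DataPageVersion is a local stand-in for the Parquet library's data page
// version enum; it exists only to keep this sketch self-contained.
type DataPageVersion int

const (
	DataPageV1 DataPageVersion = iota
	DataPageV2
)

func (v DataPageVersion) String() string {
	switch v {
	case DataPageV1:
		return "V1"
	case DataPageV2:
		return "V2"
	}
	return "unknown"
}

// dataPageVersionKey is a hypothetical property name chosen for illustration;
// it is not an existing Iceberg table property.
const dataPageVersionKey = "write.parquet.data-page-version"

// dataPageVersionFromProps resolves the data page version from table
// properties. A missing key falls back to V2, preserving today's behavior.
func dataPageVersionFromProps(props map[string]string) (DataPageVersion, error) {
	raw, ok := props[dataPageVersionKey]
	if !ok {
		return DataPageV2, nil
	}
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "1", "v1":
		return DataPageV1, nil
	case "2", "v2":
		return DataPageV2, nil
	}
	return DataPageV2, fmt.Errorf("invalid %s: %q (want v1 or v2)", dataPageVersionKey, raw)
}

func main() {
	// A table that opts in to V1 pages for consumers that need them.
	v, err := dataPageVersionFromProps(map[string]string{dataPageVersionKey: "v1"})
	fmt.Println(v, err)
}
```

In this sketch, `GetWriteProperties()` would pass the resolved value to `parquet.WithDataPageVersion(...)` instead of the hardcoded `parquet.DataPageV2`. Whether a table property, a writer option, or both is the right place to expose this is open for discussion.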
