hcrosse opened a new issue, #800:
URL: https://github.com/apache/iceberg-go/issues/800

   ### Feature Request / Improvement
   
   
   [`GetWriteProperties()`](https://github.com/apache/iceberg-go/blob/main/table/internal/parquet_files.go#L203)
   hardcodes `parquet.WithDataPageVersion(parquet.DataPageV2)` with no way to
   override it. This causes issues with consumers that don't fully support
   DataPage V2 (e.g. Snowflake).
   
   iceberg-java supports configuring this via
   [`WriteBuilder.writerVersion()`](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java#L268),
   but iceberg-go has no equivalent.
   
   I checked a few Iceberg library implementations, and iceberg-go is the only
   one I found that defaults to DataPage V2:
   - [iceberg-java defaults to](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java#L171)
     `WriterVersion.PARQUET_1_0`, which produces V1 pages
   - pyiceberg doesn't set `data_page_version`, so
     [PyArrow defaults to V1](https://github.com/apache/iceberg-python/blob/main/pyiceberg/io/pyarrow.py)
   - iceberg-rust doesn't set it, so
     [arrow-rs defaults to V1](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/writer/file_writer/parquet_writer.rs)
   
   I propose that we add a way to configure the DataPage version, similar to
   iceberg-java's `WriteBuilder.writerVersion()`. I'm happy to discuss the best
   place to expose this in the API. We should still keep the current default (V2)
   for backward compatibility.
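
   To make the proposal concrete, here is a minimal sketch of one possible shape: reading the version from table properties and falling back to V2 when nothing is set. The property key `write.parquet.data-page-version`, the helper `dataPageVersionFromProps`, and the local `DataPageVersion` type are all hypothetical; the type only stands in for the arrow-go one so the snippet has no dependencies. The real change would pass the resolved value to `parquet.WithDataPageVersion(...)` inside `GetWriteProperties()`.

```go
package main

import (
	"fmt"
	"strings"
)

// DataPageVersion is a local stand-in for the arrow-go parquet type,
// used here only so the sketch compiles without external dependencies.
type DataPageVersion int

const (
	DataPageV1 DataPageVersion = iota
	DataPageV2
)

// dataPageVersionKey is a hypothetical property name, not an existing
// iceberg-go or Iceberg spec property.
const dataPageVersionKey = "write.parquet.data-page-version"

// dataPageVersionFromProps resolves the data page version from table
// properties. A missing key keeps the current default (V2); an
// unrecognized value is reported as an error rather than silently ignored.
func dataPageVersionFromProps(props map[string]string) (DataPageVersion, error) {
	raw, ok := props[dataPageVersionKey]
	if !ok {
		return DataPageV2, nil
	}
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "1", "v1":
		return DataPageV1, nil
	case "2", "v2":
		return DataPageV2, nil
	default:
		return DataPageV2, fmt.Errorf("invalid %s: %q", dataPageVersionKey, raw)
	}
}

func main() {
	// A table that must be readable by a V1-only consumer opts in explicitly.
	v, err := dataPageVersionFromProps(map[string]string{dataPageVersionKey: "v1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("use V1 pages:", v == DataPageV1)
}
```

   A table property is only one option; a functional option on the write path would work as well. Either way, existing tables that set nothing keep writing V2 pages.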
   
   I'm happy to submit a PR for this if the approach is approved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
