dbtsai commented on code in PR #8299:
URL: https://github.com/apache/iceberg/pull/8299#discussion_r1292030532
##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -143,6 +143,7 @@ private TableProperties() {}
public static final String PARQUET_COMPRESSION =
"write.parquet.compression-codec";
public static final String DELETE_PARQUET_COMPRESSION =
"write.delete.parquet.compression-codec";
public static final String PARQUET_COMPRESSION_DEFAULT = "gzip";
+ public static final String PARQUET_COMPRESSION_NEW_TABLE_DEFAULT = "zstd";
Review Comment:
Then we don't need `PARQUET_COMPRESSION_NEW_TABLE_DEFAULT`.
##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -149,6 +149,10 @@ public BaseMetastoreCatalogTableBuilder(TableIdentifier identifier, Schema schem
this.identifier = identifier;
this.schema = schema;
this.tableProperties.putAll(tableDefaultProperties());
+ // Explicitly set ZSTD for new tables
Review Comment:
Maybe `// Explicitly set Parquet compression codec for new tables`
##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -149,6 +149,10 @@ public BaseMetastoreCatalogTableBuilder(TableIdentifier identifier, Schema schem
this.identifier = identifier;
this.schema = schema;
this.tableProperties.putAll(tableDefaultProperties());
+ // Explicitly set ZSTD for new tables
+ this.tableProperties.put(
+ TableProperties.PARQUET_COMPRESSION,
+ TableProperties.PARQUET_COMPRESSION_NEW_TABLE_DEFAULT);
Review Comment:
```java
this.tableProperties.put(
    TableProperties.PARQUET_COMPRESSION,
    TableProperties.PARQUET_COMPRESSION_DEFAULT);
```
That way we don't need to introduce a new constant,
`PARQUET_COMPRESSION_NEW_TABLE_DEFAULT`.
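A minimal sketch of this suggestion, under the assumption that `PARQUET_COMPRESSION_DEFAULT` itself is changed to `"zstd"` (the comment does not spell this out). `NewTableDefaultsSketch` and `newTableProperties` are hypothetical stand-ins for the builder constructor, not Iceberg classes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the table builder logic; not Iceberg's actual classes.
class NewTableDefaultsSketch {
  static final String PARQUET_COMPRESSION = "write.parquet.compression-codec";

  // Assumption: the existing default constant is switched from "gzip" to "zstd",
  // so no separate NEW_TABLE_DEFAULT constant is needed.
  static final String PARQUET_COMPRESSION_DEFAULT = "zstd";

  // Mirrors the constructor in the hunk: copy the catalog-level defaults first,
  // then set the Parquet compression codec explicitly for the new table.
  static Map<String, String> newTableProperties(Map<String, String> catalogDefaults) {
    Map<String, String> props = new HashMap<>(catalogDefaults);
    // put() after the copy overwrites any codec that came from the defaults.
    props.put(PARQUET_COMPRESSION, PARQUET_COMPRESSION_DEFAULT);
    return props;
  }
}
```

As in the hunk, the `put` runs after the defaults are copied, so it replaces a codec supplied through the catalog defaults.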
##########
core/src/main/java/org/apache/iceberg/SnapshotProducer.java:
##########
@@ -381,6 +382,15 @@ public void commit() {
update.setBranchSnapshot(newSnapshot, targetBranch);
}
+ // Explicitly set GZIP for existing tables if unset
+ if (base.properties().get(TableProperties.PARQUET_COMPRESSION) == null) {
+ Map<String, String> newProperties = Maps.newHashMap(base.properties());
+ newProperties.put(
+ TableProperties.PARQUET_COMPRESSION,
+ TableProperties.PARQUET_COMPRESSION_DEFAULT);
Review Comment:
```java
if (base.properties().get(TableProperties.PARQUET_COMPRESSION) == null) {
  Map<String, String> newProperties = Maps.newHashMap(base.properties());
  newProperties.put(TableProperties.PARQUET_COMPRESSION, "gzip");
```
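A self-contained sketch of the suggested check, with plain `java.util` maps standing in for `base.properties()` and `Maps.newHashMap`. `ExistingTableCodecSketch` and `pinLegacyCodec` are hypothetical names; how the result is committed back to the table metadata is not shown in the hunk and is left out here:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating the check; not part of Iceberg.
class ExistingTableCodecSketch {
  static final String PARQUET_COMPRESSION = "write.parquet.compression-codec";

  // Returns a copy with the codec pinned to "gzip" when the property is unset;
  // returns the input map unchanged when a codec is already configured.
  static Map<String, String> pinLegacyCodec(Map<String, String> baseProperties) {
    if (baseProperties.get(PARQUET_COMPRESSION) != null) {
      return baseProperties;
    }
    Map<String, String> newProperties = new HashMap<>(baseProperties);
    // Literal "gzip" rather than a shared default constant, as in the suggestion.
    newProperties.put(PARQUET_COMPRESSION, "gzip");
    return newProperties;
  }
}
```

Presumably the point of the literal is that existing tables stay on `gzip` regardless of what `PARQUET_COMPRESSION_DEFAULT` is set to.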
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]