jackye1995 commented on a change in pull request #3181:
URL: https://github.com/apache/iceberg/pull/3181#discussion_r719007937
##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -107,6 +107,18 @@ private TableProperties() {
public static final String DELETE_PARQUET_COMPRESSION_LEVEL =
"write.delete.parquet.compression-level";
public static final String PARQUET_COMPRESSION_LEVEL_DEFAULT = null;
+ public static final String PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT =
+ "write.parquet.row-group-check-min-record-count";
+ public static final String DELETE_PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT =
+ "write.delete.parquet.row-group-check-min-record-count";
+ public static final String PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT_DEFAULT
= "100";
+
+ public static final String PARQUET_ROW_GROUP_CHECK_MAX_RECORD_COUNT =
+ "write.parquet.row-group-check-max-record-count";
+ public static final String DELETE_PARQUET_ROW_GROUP_CHECK_MAX_RECORD_COUNT =
+ "write.delete.parquet.row-group-check-max-record-count";
+ public static final String PARQUET_ROW_GROUP_CHECK_MAX_RECORD_COUNT_DEFAULT
= "10000";
Review comment:
this can be integer
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java
##########
@@ -316,7 +331,13 @@ static Context dataContext(Map<String, String> config) {
String compressionLevel =
config.getOrDefault(PARQUET_COMPRESSION_LEVEL,
PARQUET_COMPRESSION_LEVEL_DEFAULT);
- return new Context(rowGroupSize, pageSize, dictionaryPageSize, codec,
compressionLevel);
+ int rowGroupCheckMinRecordCount = Integer.parseInt(config.getOrDefault(
+ PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT,
PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT_DEFAULT));
+ int rowGroupCheckMaxRecordCount = Integer.parseInt(config.getOrDefault(
+ PARQUET_ROW_GROUP_CHECK_MAX_RECORD_COUNT,
PARQUET_ROW_GROUP_CHECK_MAX_RECORD_COUNT_DEFAULT));
+
Review comment:
I think we need to add some basic validations such as the config value
is positive, max is greater than min.
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java
##########
@@ -316,7 +331,13 @@ static Context dataContext(Map<String, String> config) {
String compressionLevel =
config.getOrDefault(PARQUET_COMPRESSION_LEVEL,
PARQUET_COMPRESSION_LEVEL_DEFAULT);
- return new Context(rowGroupSize, pageSize, dictionaryPageSize, codec,
compressionLevel);
+ int rowGroupCheckMinRecordCount = Integer.parseInt(config.getOrDefault(
Review comment:
can use `PropertyUtil.propertyAsInt`
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java
##########
@@ -46,7 +46,7 @@
class ParquetWriter<T> implements FileAppender<T>, Closeable {
- private static DynConstructors.Ctor<PageWriteStore> pageStoreCtorParquet =
DynConstructors
+ private static final DynConstructors.Ctor<PageWriteStore>
pageStoreCtorParquet = DynConstructors
Review comment:
Thanks for working on that! I think we are accumulating more and more of
these warnings, better to fix them as much as possible along the way.
##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -107,6 +107,18 @@ private TableProperties() {
public static final String DELETE_PARQUET_COMPRESSION_LEVEL =
"write.delete.parquet.compression-level";
public static final String PARQUET_COMPRESSION_LEVEL_DEFAULT = null;
+ public static final String PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT =
+ "write.parquet.row-group-check-min-record-count";
+ public static final String DELETE_PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT =
+ "write.delete.parquet.row-group-check-min-record-count";
+ public static final String PARQUET_ROW_GROUP_CHECK_MIN_RECORD_COUNT_DEFAULT
= "100";
Review comment:
this can be integer
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]