aihuaxu commented on code in PR #15596:
URL: https://github.com/apache/iceberg/pull/15596#discussion_r3213274601


##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/FlinkWriteConf.java:
##########
@@ -262,4 +262,22 @@ public Duration tableRefreshInterval() {
         .flinkConfig(FlinkWriteOptions.TABLE_REFRESH_INTERVAL)
         .parseOptional();
   }
+
+  public boolean parquetShredVariants() {
+    return confParser
+        .booleanConf()
+        .option(FlinkWriteOptions.PARQUET_SHRED_VARIANTS.key())
+        .tableProperty(TableProperties.PARQUET_SHRED_VARIANTS)
+        .defaultValue(TableProperties.PARQUET_SHRED_VARIANTS_DEFAULT)
+        .parse();
+  }
+
+  public int variantInferenceBufferSize() {

Review Comment:
   Should this be Parquet-specific as well?



##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/data/FlinkFormatModels.java:
##########
@@ -33,7 +34,9 @@ public static void register() {
             RowType.class,
             FlinkParquetWriters::buildWriter,
             (icebergSchema, fileSchema, engineSchema, idToConstant) ->
-                FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant)));
+                FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
+            new FlinkVariantShreddingAnalyzer(),
+            (row, rowType) -> new RowDataSerializer(rowType).copy(row)));

Review Comment:
   +1. We should be able to reuse the RowDataSerializer so that we don't need to create a new instance for every row.
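
A minimal, stdlib-only sketch of the reuse idea, not the PR's code. `CountingSerializer` is a hypothetical stand-in for `RowDataSerializer`, and a plain `String` stands in for `RowType`; the sketch assumes the row type is usable as a map key. The copier builds one serializer per row type and reuses it for later rows. Whether a shared serializer instance is safe across threads in the real writer would need to be checked separately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

// Hypothetical stand-in for a serializer that is comparatively costly to construct.
final class CountingSerializer {
  // Counts constructions so the effect of caching is observable.
  static final AtomicInteger CREATED = new AtomicInteger();

  CountingSerializer(String rowType) {
    CREATED.incrementAndGet();
  }

  // Returns a defensive copy of the row.
  int[] copy(int[] row) {
    return row.clone();
  }
}

// Copier with the same (row, rowType) -> row shape as the lambda in the diff,
// but creating at most one serializer per distinct row type.
final class CachingCopier implements BiFunction<int[], String, int[]> {
  private final Map<String, CountingSerializer> cache = new ConcurrentHashMap<>();

  @Override
  public int[] apply(int[] row, String rowType) {
    return cache.computeIfAbsent(rowType, CountingSerializer::new).copy(row);
  }

  public static void main(String[] args) {
    CachingCopier copier = new CachingCopier();
    int[] row = {1, 2, 3};
    for (int i = 0; i < 1000; i++) {
      copier.apply(row, "row<int,int,int>");
    }
    System.out.println("serializers created: " + CountingSerializer.CREATED.get());
  }
}
```

With the per-row `new RowDataSerializer(rowType)` pattern, the construction count would grow with the number of rows; with a cache it grows only with the number of distinct row types.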



##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/FlinkWriteOptions.java:
##########
@@ -105,4 +105,10 @@ private FlinkWriteOptions() {}
   //  specify the uidSuffix to be used for the underlying IcebergSink
   public static final ConfigOption<String> UID_SUFFIX =
       ConfigOptions.key("uid-suffix").stringType().defaultValue("");
+
+  public static final ConfigOption<Boolean> PARQUET_SHRED_VARIANTS =
+      ConfigOptions.key("parquet-shred-variants").booleanType().defaultValue(false);
+
+  public static final ConfigOption<Integer> VARIANT_INFERENCE_BUFFER_SIZE =
+      ConfigOptions.key("variant-inference-buffer-size").intType().defaultValue(10);

Review Comment:
   Maybe default to 100 to align with the value of TableProperties.PARQUET_VARIANT_BUFFER_SIZE_DEFAULT?
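
To illustrate why aligned defaults matter, here is a stdlib-only sketch of the usual lookup order (write option, then table property, then default). It does not reproduce Iceberg's conf parser; `BufferSizeResolver` and the table property key `example.table.property` are hypothetical, and only the option key `variant-inference-buffer-size` and the values 10 and 100 come from the diff and the comment above.

```java
import java.util.Map;

// Hypothetical resolver: write option wins, then table property, then a single shared default.
final class BufferSizeResolver {
  static int resolve(
      Map<String, String> writeOptions,
      Map<String, String> tableProperties,
      String optionKey,
      String propertyKey,
      int defaultValue) {
    String value = writeOptions.get(optionKey);
    if (value == null) {
      value = tableProperties.get(propertyKey);
    }
    return value == null ? defaultValue : Integer.parseInt(value);
  }

  public static void main(String[] args) {
    String optionKey = "variant-inference-buffer-size";
    String propertyKey = "example.table.property"; // hypothetical key, for illustration only
    // Nothing set anywhere: the fallback default decides, so it should be the same in both places.
    System.out.println(resolve(Map.of(), Map.of(), optionKey, propertyKey, 100));
    // An explicit write option overrides the table property.
    System.out.println(
        resolve(Map.of(optionKey, "10"), Map.of(propertyKey, "50"), optionKey, propertyKey, 100));
  }
}
```

If the option declared a default of 10 while the table property default stayed at 100, the effective buffer size could depend on which default a given code path happens to consult, which is what aligning the two avoids.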



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]