yihua commented on code in PR #8128:
URL: https://github.com/apache/hudi/pull/8128#discussion_r1133478509


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java:
##########
@@ -69,6 +69,10 @@ public class HoodieIndexConfig extends HoodieConfig {
 
   private static final Logger LOG = 
LogManager.getLogger(HoodieIndexConfig.class);
 
+  public static final boolean DEFAULT_BLOOM_INDEX_UPDATE_PARTITION_PATH = true;
+  public static final boolean DEFAULT_SIMPLE_INDEX_UPDATE_PARTITION_PATH = 
true;
+
+

Review Comment:
   We don't need the variables and we can assume that the default behavior 
reflects these.



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -288,9 +288,6 @@ private HoodieWriteConfig createMetadataWriteConfig(
         .withCompactionConfig(HoodieCompactionConfig.newBuilder()
             .withInlineCompaction(false)
             
.withMaxNumDeltaCommitsBeforeCompaction(writeConfig.getMetadataCompactDeltaCommitMax())
-            // by default, the HFile does not keep the metadata fields, set up 
as false
-            // to always use the metadata of the new record.
-            .withPreserveCommitMetadata(false)

Review Comment:
   Have you audited the code path to see there's no behavior change for 
metadata writing?



##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieClientOnCopyOnWriteStorage.java:
##########
@@ -199,13 +199,8 @@ private static Stream<Arguments> 
populateMetaFieldsParams() {
     return Arrays.stream(new Boolean[][] {{true}, {false}}).map(Arguments::of);
   }
 
-  private static Stream<Arguments> 
populateMetaFieldsAndPreserveMetadataParams() {
-    return Arrays.stream(new Boolean[][] {
-        {true, true},
-        {false, true},
-        {true, false},
-        {false, false}
-    }).map(Arguments::of);
+  private static Stream<Arguments> populateMetaFields() {

Review Comment:
   nit: rename to `populateMetaFieldsParams`



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java:
##########
@@ -48,6 +48,8 @@
         + "which optimizes the storage layout for better query performance by 
sorting and sizing data files.")
 public class HoodieClusteringConfig extends HoodieConfig {
 
+  public static final boolean DEFAULT_PRESERVE_COMMIT_METADATA = true;

Review Comment:
   We don't need the variable and we can assume that the default behavior 
reflects this.  Similar for the compaction.



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##########
@@ -297,21 +295,15 @@ private FlinkOptions() {
       .booleanType()
       .defaultValue(false)// default read as batch
       .withDescription("Whether to skip compaction instants and avoid reading 
compacted base files for streaming read to improve read performance.\n"
-          + "There are two cases that this option can be used to avoid reading 
duplicates:\n"
-          + "1) you are definitely sure that the consumer reads [faster 
than/completes before] any compaction instants "
-          + "when " + HoodieCompactionConfig.PRESERVE_COMMIT_METADATA.key() + 
" is set to false.\n"
-          + "2) changelog mode is enabled, this option is a solution to keep 
data integrity");
+          + "This option can be used to avoid reading duplicates when 
changelog mode is enabled, it is a solution to keep data integrity\n");

Review Comment:
   @danny0405 compaction always preserves the commit metadata by default and 
now we make it the only behavior to avoid corruption.  I assume this is ok for 
Flink too.  Could you confirm it?



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringOperator.java:
##########
@@ -133,7 +133,7 @@ public class ClusteringOperator extends 
TableStreamOperator<ClusteringCommitEven
 
   public ClusteringOperator(Configuration conf, RowType rowType) {
     this.conf = conf;
-    this.preserveHoodieMetadata = 
conf.getBoolean(HoodieClusteringConfig.PRESERVE_COMMIT_METADATA.key(), 
HoodieClusteringConfig.PRESERVE_COMMIT_METADATA.defaultValue());
+    this.preserveHoodieMetadata = 
HoodieClusteringConfig.DEFAULT_PRESERVE_COMMIT_METADATA;

Review Comment:
   nit: we don't need this and we can simplify `this.rowType` below.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to