nsivabalan commented on code in PR #13649:
URL: https://github.com/apache/hudi/pull/13649#discussion_r2244034204


##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -811,6 +819,10 @@ public int getFileCacheMaxSizeMB() {
     return getInt(METADATA_FILE_CACHE_MAX_SIZE_MB);
   }
 
+  public int getWriteStatusCoalesceParallelism() {

Review Comment:
   Once we fix the naming, let's also rename the getters to match.



##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -554,6 +554,14 @@ public final class HoodieMetadataConfig extends HoodieConfig {
           + "bloom filter row for the files in the metadata table. Only applies if the filter "
           + "type (" + BLOOM_FILTER_TYPE.key() + " ) is BloomFilterTypeCode.DYNAMIC_V0.");
 
+  public static final ConfigProperty<Integer> WRITE_STATUS_COALESCE_PARALLELISM = ConfigProperty
+      .key(METADATA_PREFIX + ".write.status.coalesce.parallelism")
+      .defaultValue(0)
+      .markAdvanced()
+      .sinceVersion("1.1.0")
+      .withDocumentation("When set to a positive number, this config reduces the number of "
+          + "write status partitions to the set number when the current number of partitions is higher.");

Review Comment:
   We can keep the documentation generic, for example:
   `When set to a positive number, metadata table record preparation stages honor the set value for the number of tasks. Otherwise, the number of write statuses from data table writes is used for metadata table record preparation.`
   



##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java:
##########
@@ -167,6 +167,11 @@ public HoodieWriteMetadata<HoodieData<WriteStatus>> execute(HoodieData<HoodieRec
 
   @Override
   public HoodieWriteMetadata<HoodieData<WriteStatus>> execute(HoodieData<HoodieRecord<T>> inputRecords, Option<HoodieTimer> sourceReadAndIndexTimer) {
+    int coalesceParallelism = config.getMetadataConfig().getWriteStatusCoalesceParallelism();

Review Comment:
   We can just name this `parallelism` and guard the coalesce on it:
   ```
   if (table.isMetadataTable() && parallelism > 0) {
     inputRecords = inputRecords.coalesce(parallelism); 
   } 
   ```
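
To make the intended semantics concrete, here is a minimal, self-contained sketch of the guard suggested above. It does not use any Hudi classes: `CoalesceSketch` and `targetPartitions` are hypothetical names introduced only for illustration, and the `Math.min` encodes the assumption (taken from the config documentation in this PR) that coalescing only ever reduces the partition count and never increases it.

```java
public class CoalesceSketch {

  /**
   * Hypothetical helper, not part of Hudi: returns the partition count the
   * input would end up with after the guarded coalesce.
   *
   * @param isMetadataTable   whether the write targets the metadata table
   * @param parallelism       configured value; 0 or negative means "not set"
   * @param currentPartitions partition count of the incoming records
   */
  static int targetPartitions(boolean isMetadataTable, int parallelism, int currentPartitions) {
    if (isMetadataTable && parallelism > 0) {
      // Assumption: coalesce can shrink the partition count but not grow it.
      return Math.min(parallelism, currentPartitions);
    }
    // Config unset, or not a metadata table write: leave the input untouched.
    return currentPartitions;
  }

  public static void main(String[] args) {
    System.out.println(targetPartitions(true, 4, 10));   // 4
    System.out.println(targetPartitions(true, 0, 10));   // 10
    System.out.println(targetPartitions(false, 4, 10));  // 10
    System.out.println(targetPartitions(true, 20, 10));  // 10
  }
}
```

With the default value of 0 the guard is a no-op, so existing pipelines would keep their current partitioning unless the config is set explicitly.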


