nsivabalan commented on code in PR #13649:
URL: https://github.com/apache/hudi/pull/13649#discussion_r2244034204
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -811,6 +819,10 @@ public int getFileCacheMaxSizeMB() {
return getInt(METADATA_FILE_CACHE_MAX_SIZE_MB);
}
+ public int getWriteStatusCoalesceParallelism() {
Review Comment:
Once we fix the naming, let's also rename the getters to match.
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -554,6 +554,14 @@ public final class HoodieMetadataConfig extends HoodieConfig {
+ "bloom filter row for the files in the metadata table. Only applies if the filter "
+ "type (" + BLOOM_FILTER_TYPE.key() + " ) is BloomFilterTypeCode.DYNAMIC_V0.");
+ public static final ConfigProperty<Integer> WRITE_STATUS_COALESCE_PARALLELISM = ConfigProperty
+ .key(METADATA_PREFIX + ".write.status.coalesce.parallelism")
+ .defaultValue(0)
+ .markAdvanced()
+ .sinceVersion("1.1.0")
+ .withDocumentation("When set to a positive number, this config reduces the number of "
+ + "write status partitions to the set number when the current number of partitions is higher.");
Review Comment:
We can keep it generic:
`When set to a positive number, metadata table record preparation stages honor the set value for the number of tasks. Otherwise, the number of write statuses from data table writes is used for metadata table record preparation.`
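To make the suggested wording concrete, here is a minimal sketch of the fallback it describes. The class `MetadataParallelismSketch` and the method `resolveParallelism` are hypothetical names used only for illustration; they are not part of Hudi, and the actual change may resolve the value differently.

```java
/**
 * Illustrative only: models the documented behavior of the config,
 * not the actual Hudi implementation.
 */
public final class MetadataParallelismSketch {

  private MetadataParallelismSketch() {
  }

  /**
   * Hypothetical helper. Returns the configured value when it is positive;
   * otherwise falls back to the number of write statuses produced by the
   * data table write.
   */
  public static int resolveParallelism(int configuredParallelism, int writeStatusCount) {
    return configuredParallelism > 0 ? configuredParallelism : writeStatusCount;
  }
}
```

With the default of `0`, the sketch returns the write status count unchanged, which matches the "if not" branch of the proposed documentation.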
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java:
##########
@@ -167,6 +167,11 @@ public HoodieWriteMetadata<HoodieData<WriteStatus>> execute(HoodieData<HoodieRec
@Override
public HoodieWriteMetadata<HoodieData<WriteStatus>> execute(HoodieData<HoodieRecord<T>> inputRecords, Option<HoodieTimer> sourceReadAndIndexTimer) {
+ int coalesceParallelism = config.getMetadataConfig().getWriteStatusCoalesceParallelism();
Review Comment:
We can just name this `parallelism` and write:
```
if (table.isMetadataTable() && parallelism > 0) {
  inputRecords = inputRecords.coalesce(parallelism);
}
```
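For anyone who wants to try the guard outside Spark, below is a rough standalone sketch. It models partitions as plain Java lists, and `CoalesceGuardSketch`, `coalesce`, and `maybeCoalesce` are hypothetical names, not Hudi or Spark APIs. It assumes that coalescing only ever reduces the partition count (as Spark's non-shuffle coalesce does) and that a non-positive value means "leave the input alone".

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the guard above; not Hudi code. */
public final class CoalesceGuardSketch {

  private CoalesceGuardSketch() {
  }

  /** Merges contiguous partitions so that at most {@code target} remain. */
  static <T> List<List<T>> coalesce(List<List<T>> partitions, int target) {
    if (target <= 0) {
      throw new IllegalArgumentException("target must be positive");
    }
    if (target >= partitions.size()) {
      // Already at or below the target: nothing to merge.
      return partitions;
    }
    List<List<T>> merged = new ArrayList<>();
    for (int i = 0; i < target; i++) {
      merged.add(new ArrayList<>());
    }
    for (int i = 0; i < partitions.size(); i++) {
      // Map source partition i onto one of the target buckets, keeping neighbors together.
      merged.get(i * target / partitions.size()).addAll(partitions.get(i));
    }
    return merged;
  }

  /** Applies coalescing only for the metadata table and only when parallelism is positive. */
  static <T> List<List<T>> maybeCoalesce(List<List<T>> partitions, boolean isMetadataTable, int parallelism) {
    if (isMetadataTable && parallelism > 0) {
      return coalesce(partitions, parallelism);
    }
    return partitions;
  }
}
```

The point of the sketch is only that the default value of `0` and any data table write pass through untouched, so the guard is a no-op unless the config is explicitly set.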