[
https://issues.apache.org/jira/browse/HUDI-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-4924:
----------------------------
Description:
This causes some tests, such as
testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to
finish than expected. For this particular test, reducing the upsert shuffle
parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s
locally.
The actual root cause of the problem is that the dedup parallelism is taken
directly from the Hudi write shuffle parallelism, without auto-tuning.
was:
This causes some tests, such as
testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to
finish than expected. For this particular test, reducing the upsert shuffle
parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s
locally.
The actual root cause of the problem is that
> Dedup parallelism is not auto tuned based on input
> --------------------------------------------------
>
> Key: HUDI-4924
> URL: https://issues.apache.org/jira/browse/HUDI-4924
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Assignee: Ethan Guo
> Priority: Major
> Fix For: 0.12.1
>
>
> This causes some tests, such as
> testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to
> finish than expected. For this particular test, reducing the upsert shuffle
> parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s
> locally.
>
> The actual root cause of the problem is that the dedup parallelism is taken
> directly from the Hudi write shuffle parallelism, without auto-tuning.
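
To illustrate the root cause described above, here is a minimal sketch of what
auto-tuning the dedup parallelism could look like: the configured shuffle
parallelism is treated as an upper bound and capped by the number of input
partitions. This is not the actual Hudi fix. The class name, method name, and
the capping rule itself are assumptions made for illustration only.

```java
final class DedupParallelismSketch {
    private DedupParallelismSketch() {}

    /**
     * Hypothetical helper (not a Hudi API): picks a dedup parallelism that
     * never exceeds the number of input partitions.
     *
     * @param configuredShuffleParallelism the write shuffle parallelism from
     *                                     the config, e.g. the default of 200
     * @param inputPartitions              number of partitions in the input
     */
    static int tuneDedupParallelism(int configuredShuffleParallelism, int inputPartitions) {
        if (configuredShuffleParallelism <= 0) {
            throw new IllegalArgumentException("configuredShuffleParallelism must be positive");
        }
        // Keep at least one partition, even if the input reports zero.
        int inputBound = Math.max(1, inputPartitions);
        // Assumed rule: a small input should not fan out to the full
        // configured parallelism during dedup.
        return Math.min(configuredShuffleParallelism, inputBound);
    }

    public static void main(String[] args) {
        // Small test input: 2 partitions with the default parallelism of 200.
        System.out.println(tuneDedupParallelism(200, 2));    // prints 2
        // Large input: the configured value remains the upper bound.
        System.out.println(tuneDedupParallelism(200, 1000)); // prints 200
    }
}
```

Under this assumed rule, a test with a tiny input would dedup with 2
partitions instead of 200 without having to override the shuffle parallelism
by hand.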