[ 
https://issues.apache.org/jira/browse/HUDI-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4924:
----------------------------
    Description: 
This causes some tests, such as 
testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to 
finish than expected.  For this particular test, reducing the upsert shuffle 
parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s 
locally.

 

The actual root cause of the problem is that the dedup parallelism is taken 
directly from the Hudi write shuffle parallelism, without auto-tuning.

  was:
This causes some tests, such as 
testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to 
finish than expected.  For this particular test, reducing the upsert shuffle 
parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s 
locally.

 

The actual root cause of the problem is that 


> Dedup parallelism is not auto tuned based on input
> --------------------------------------------------
>
>                 Key: HUDI-4924
>                 URL: https://issues.apache.org/jira/browse/HUDI-4924
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Major
>             Fix For: 0.12.1
>
>
> This causes some tests, such as 
> testToWriteWithoutParametersIncludedInHoodieTableConfig, to take longer to 
> finish than expected.  For this particular test, reducing the upsert shuffle 
> parallelism from the default of 200 to 2 makes it finish in 15s instead of 90s 
> locally.
>  
> The actual root cause of the problem is that the dedup parallelism is taken 
> directly from the Hudi write shuffle parallelism, without auto-tuning.
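
To make the root cause above concrete, here is a minimal sketch of what
"auto-tuning based on input" could mean for the dedup step: cap the
configured shuffle parallelism at the number of input partitions. This is
an illustration under stated assumptions, not the fix that went into Hudi;
the function name and both parameter names are hypothetical.

```python
def tuned_dedup_parallelism(configured_shuffle_parallelism: int,
                            input_partitions: int) -> int:
    """Hypothetical helper, not a Hudi API.

    Returns a dedup parallelism that never exceeds the number of input
    partitions, so a tiny input is not shuffled into many mostly empty tasks.
    """
    if configured_shuffle_parallelism < 1 or input_partitions < 1:
        raise ValueError("both values must be positive")
    # Assumption: the input partition count is a reasonable proxy for input size.
    return min(configured_shuffle_parallelism, input_partitions)


# A small test input with 2 partitions and the default shuffle parallelism of 200.
small_input = tuned_dedup_parallelism(200, 2)
# A larger input where the configured value is the limiting factor.
large_input = tuned_dedup_parallelism(200, 1000)
```

With a cap like this, a two-partition test input would use a dedup
parallelism of 2 rather than 200, which matches the manual workaround
described in the issue.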



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
