[ 
https://issues.apache.org/jira/browse/HUDI-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sherhomhuang closed HUDI-4280.
------------------------------
    Resolution: Fixed

This improvement was made in HUDI-4101.

> Support more parallelisms in flink when writing data to less bucket num but 
> more than one partition path.
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-4280
>                 URL: https://issues.apache.org/jira/browse/HUDI-4280
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: flink
>            Reporter: sherhomhuang
>            Assignee: sherhomhuang
>            Priority: Major
>             Fix For: 0.12.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Support more parallelism in Flink when writing data to a table with a small 
> bucket number but more than one partition path.
> *Existing shortcoming:*
>     Suppose a table is configured with only _*N*_ buckets but has a large 
> amount of historical data spread across _*M*_ partition paths (_*M >> N*_). 
> When importing the historical data, the write speed is limited because, with 
> the algorithm in class _BucketIndexPartitioner_, the write parallelism cannot 
> exceed _*N*_.
> *Improvement:*
>     Optimize the partitioner so that up to _*M * N*_ parallel writers can be 
> used when importing into a table with _*N*_ buckets.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
