[
https://issues.apache.org/jira/browse/HUDI-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sherhomhuang closed HUDI-4280.
------------------------------
Resolution: Fixed
This improvement was implemented in HUDI-4101.
> Support higher parallelism in Flink when writing data to a table with few buckets
> but more than one partition path.
> --------------------------------------------------------------------------------------------------------
>
> Key: HUDI-4280
> URL: https://issues.apache.org/jira/browse/HUDI-4280
> Project: Apache Hudi
> Issue Type: Improvement
> Components: flink
> Reporter: sherhomhuang
> Assignee: sherhomhuang
> Priority: Major
> Fix For: 0.12.0
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Support higher parallelism in Flink when writing data to a table with few buckets
> but more than one partition path.
> *Existing shortcoming:*
> Suppose a table is configured with only _*N*_ buckets, but it has a large
> amount of historical data spread across *_M_* partition paths ({_}*M >> N*{_}).
> When importing the historical data, the write speed is limited because the
> parallelism cannot be set greater than _*N*_, due to the algorithm in
> class {_}BucketIndexPartitioner{_}.
> {*}Improvement{*}:
> Optimize the partitioner so that it supports up to _*M * N*_ parallel
> writers when importing into a table with _*N*_ buckets.
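To illustrate the description above, here is a minimal sketch in Python. It is
not the actual BucketIndexPartitioner code. It assumes that the existing
partitioner routes a record by bucket id alone, and that the improved one also
offsets the bucket id by a hash of the partition path. The function names, the
CRC32 hash and the sample partition paths are all hypothetical.

```python
import zlib


def bucket_only_task(partition_path: str, bucket_id: int, parallelism: int) -> int:
    """Assumed existing behaviour: only the bucket id picks the writer task."""
    return bucket_id % parallelism


def partition_aware_task(partition_path: str, bucket_id: int, parallelism: int) -> int:
    """Assumed improved behaviour: shift the bucket id by a partition-path hash.

    CRC32 is used only because it is deterministic across Python processes;
    it stands in for whatever hash the real partitioner uses.
    """
    offset = zlib.crc32(partition_path.encode("utf-8")) % parallelism
    return (offset + bucket_id) % parallelism


def tasks_used(route, partition_paths, num_buckets, parallelism):
    """Return the set of writer tasks that receive at least one (partition, bucket)."""
    return {
        route(path, bucket, parallelism)
        for path in partition_paths
        for bucket in range(num_buckets)
    }


if __name__ == "__main__":
    # Hypothetical workload: N = 4 buckets, M = 30 partition paths, 16 writer tasks.
    paths = [f"dt=2022-01-{day:02d}" for day in range(1, 31)]
    print("bucket only:    ", sorted(tasks_used(bucket_only_task, paths, 4, 16)))
    print("partition aware:", sorted(tasks_used(partition_aware_task, paths, 4, 16)))
```

Under these assumptions, the bucket-only routing can never keep more than N
writer tasks busy, however high the parallelism is set. The partition-aware
routing spreads the M * N (partition, bucket) pairs over more tasks, while each
pair still always maps to the same task.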
--
This message was sent by Atlassian Jira
(v8.20.7#820007)