sherhomhuang created HUDI-4280:
----------------------------------
Summary: Support higher write parallelism in Flink when a table has
few buckets but more than one partition path.
Key: HUDI-4280
URL: https://issues.apache.org/jira/browse/HUDI-4280
Project: Apache Hudi
Issue Type: Improvement
Components: flink
Reporter: sherhomhuang
Assignee: sherhomhuang
Fix For: 0.12.0
Support higher write parallelism in Flink when writing to a table that has a
small bucket number but more than one partition path.
*Existing shortcoming:*
Suppose a table is configured with a bucket number of only _*N*_, but holds a
large amount of historical data spread across _*M*_ partition paths
(_*M >> N*_). When importing that historical data, write speed is limited,
because the write parallelism cannot effectively exceed _*N*_ under the
algorithm in class _BucketIndexPartitioner_.
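To illustrate the shortcoming, here is a minimal sketch, assuming the partitioner routes each record by its bucket id alone. The class BucketOnlyRouting and its methods are hypothetical and simplified; they are not the Hudi source, and the key hashing is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for bucket-only routing; not the Hudi source.
final class BucketOnlyRouting {

    // Assumed bucket assignment: non-negative hash of the record key modulo bucketNum.
    static int bucketId(String recordKey, int bucketNum) {
        return (recordKey.hashCode() & Integer.MAX_VALUE) % bucketNum;
    }

    // Routes by bucket id alone; the partition path is deliberately ignored.
    static int subtask(String partitionPath, String recordKey, int bucketNum, int parallelism) {
        return bucketId(recordKey, bucketNum) % parallelism;
    }

    // Counts how many distinct subtasks receive data for a synthetic workload.
    static int distinctSubtasks(int partitionPaths, int keysPerPath, int bucketNum, int parallelism) {
        Set<Integer> used = new HashSet<>();
        for (int p = 0; p < partitionPaths; p++) {
            for (int k = 0; k < keysPerPath; k++) {
                used.add(subtask("dt=" + p, "key-" + k, bucketNum, parallelism));
            }
        }
        return used.size();
    }
}
```

Under these assumptions, BucketOnlyRouting.distinctSubtasks(100, 1000, 4, 32) returns 4: with 4 buckets, only 4 of the 32 subtasks ever receive records, no matter how many partition paths exist.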
*Improvement:*
Optimize the partitioner method to support a parallelism of up to _*M * N*_
when importing into a table with _*N*_ buckets.
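One possible way to do this, shown as a sketch only, is to offset the bucket id by a hash of the partition path before taking the modulo. The class PathAwareRouting, its methods, and the hashing scheme are hypothetical illustrations and not necessarily what the actual change implements.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of path-aware routing; names and hashing are assumptions.
final class PathAwareRouting {

    // Assumed bucket assignment: non-negative hash of the record key modulo bucketNum.
    static int bucketId(String recordKey, int bucketNum) {
        return (recordKey.hashCode() & Integer.MAX_VALUE) % bucketNum;
    }

    // Shifts the bucket id by an offset derived from the partition path, so that
    // the same bucket in different partition paths can land on different subtasks.
    // All records of one (partition path, bucket) pair still go to one subtask.
    static int subtask(String partitionPath, String recordKey, int bucketNum, int parallelism) {
        int pathOffset = (partitionPath.hashCode() & Integer.MAX_VALUE) % parallelism;
        return (pathOffset + bucketId(recordKey, bucketNum)) % parallelism;
    }

    // Counts how many distinct subtasks receive data for a synthetic workload.
    static int distinctSubtasks(int partitionPaths, int keysPerPath, int bucketNum, int parallelism) {
        Set<Integer> used = new HashSet<>();
        for (int p = 0; p < partitionPaths; p++) {
            for (int k = 0; k < keysPerPath; k++) {
                used.add(subtask("dt=" + p, "key-" + k, bucketNum, parallelism));
            }
        }
        return used.size();
    }
}
```

With 4 buckets, 100 partition paths and a parallelism of 32, this sketch uses more than 4 subtasks. Since there are only _*M * N*_ distinct (partition path, bucket) pairs, _*M * N*_ is the upper bound on useful parallelism; hash collisions mean a simple offset scheme like this one may not reach it.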
--
This message was sent by Atlassian Jira
(v8.20.7#820007)