sherhomhuang created HUDI-4280:
----------------------------------

             Summary: Support more parallelisms in Flink when writing data to 
less bucket num but more than one partition path.
                 Key: HUDI-4280
                 URL: https://issues.apache.org/jira/browse/HUDI-4280
             Project: Apache Hudi
          Issue Type: Improvement
          Components: flink
            Reporter: sherhomhuang
            Assignee: sherhomhuang
             Fix For: 0.12.0


Support a higher write parallelism in Flink when a table has a small bucket 
number but more than one partition path.

*Existing shortcoming:*

     Suppose a table is configured with only _*N*_ buckets but holds a large 
amount of historical data spread across _*M*_ partition paths (_*M >> N*_). When 
importing that historical data, write speed is limited, because the algorithm 
in class _BucketIndexPartitioner_ does not allow the parallelism to exceed 
_*N*_.

*Improvement:*
    Optimize the partitioner's algorithm to support a parallelism of up to 
_*M * N*_ when importing into a table with _*N*_ buckets.
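
One possible shape of such a partitioner is sketched below. It is only an illustration, not the actual patch: the class and method names are hypothetical, the "bucket only" routing is an assumption about the current behaviour, and the bucket id is taken as an already computed input. The idea is to offset the bucket id by a hash of the partition path, so that different partition paths can land on different subtasks while every record of the same (partition path, bucket) pair still goes to one subtask.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Hudi code base.
final class BucketPartitionSketch {

    // Assumed baseline: route by bucket id only.
    // With N buckets, at most N distinct subtasks ever receive data.
    static int byBucketOnly(int bucketId, int parallelism) {
        return Math.floorMod(bucketId, parallelism);
    }

    // Sketch of the improvement: shift the bucket id by a non-negative hash
    // of the partition path. The mapping is deterministic, so one
    // (partition path, bucket) pair is always written by the same subtask.
    static int byPartitionAndBucket(String partitionPath, int bucketId, int parallelism) {
        int partitionOffset = (partitionPath.hashCode() & Integer.MAX_VALUE) % parallelism;
        return Math.floorMod(partitionOffset + bucketId, parallelism);
    }

    // Counts how many distinct subtasks receive data for the given layout.
    static int distinctSubtasks(String[] partitionPaths, int bucketNum,
                                int parallelism, boolean usePartitionPath) {
        Set<Integer> used = new HashSet<>();
        for (String path : partitionPaths) {
            for (int bucket = 0; bucket < bucketNum; bucket++) {
                used.add(usePartitionPath
                        ? byPartitionAndBucket(path, bucket, parallelism)
                        : byBucketOnly(bucket, parallelism));
            }
        }
        return used.size();
    }

    public static void main(String[] args) {
        String[] paths = {"2022-06-01", "2022-06-02", "2022-06-03", "2022-06-04"};
        int bucketNum = 4;     // N
        int parallelism = 16;  // M * N for this example
        System.out.println("bucket only:        "
                + distinctSubtasks(paths, bucketNum, parallelism, false));
        System.out.println("partition + bucket: "
                + distinctSubtasks(paths, bucketNum, parallelism, true));
    }
}
```

How many subtasks are actually used depends on how the partition path hashes spread out, so _*M * N*_ is an upper bound here rather than a guarantee.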



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
