[ 
https://issues.apache.org/jira/browse/HUDI-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shizhi Chen updated HUDI-5068:
------------------------------
    Description: 
*CopyOnWriteInputFormat#createInputSplits* is invoked by 
*org.apache.flink.runtime.executiongraph.ExecutionJobVertex*
in the JobManager to create file input splits synchronously. In batch mode, 
this call accounts for the largest share of job submission time.

This PR optimizes it by creating the input splits asynchronously in a 
thread pool executor.

> Support cow flink batch create fs input split asynchronously
> ------------------------------------------------------------
>
>                 Key: HUDI-5068
>                 URL: https://issues.apache.org/jira/browse/HUDI-5068
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: flink-sql, incremental-query, performance
>            Reporter: Shizhi Chen
>            Assignee: Shizhi Chen
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> *CopyOnWriteInputFormat#createInputSplits* is invoked by 
> *org.apache.flink.runtime.executiongraph.ExecutionJobVertex*
> in the JobManager to create file input splits synchronously. In batch mode, 
> this call accounts for the largest share of job submission time.
> This PR optimizes it by creating the input splits asynchronously in a 
> thread pool executor.
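
The idea described in the issue can be illustrated with a minimal, self-contained sketch. It is not the Hudi implementation: the class AsyncSplitSketch, the Split type, and the methods splitsForFile and createSplitsAsync are hypothetical names chosen for illustration, and file lengths are passed in directly instead of being read from a file system. The sketch only shows the general pattern of submitting per-file split computation to a thread pool and collecting the results in submission order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncSplitSketch {

    /** Hypothetical stand-in for a file input split: one byte range of one file. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;

        public Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Cuts a single file into ranges of at most splitSize bytes.
     * In a real input format this is where the slow file-system
     * calls (status and block-location lookups) would happen.
     */
    public static List<Split> splitsForFile(String path, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            splits.add(new Split(path, start, Math.min(splitSize, fileLength - start)));
        }
        return splits;
    }

    /**
     * Computes the splits of all files on a fixed-size thread pool.
     * One task is submitted per file; results are joined in the
     * iteration order of the input map, so the output order is stable.
     */
    public static List<Split> createSplitsAsync(
            Map<String, Long> fileLengths, long splitSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<List<Split>>> futures = new ArrayList<>();
            for (Map.Entry<String, Long> entry : fileLengths.entrySet()) {
                final String path = entry.getKey();
                final long length = entry.getValue();
                futures.add(CompletableFuture.supplyAsync(
                        () -> splitsForFile(path, length, splitSize), pool));
            }
            List<Split> all = new ArrayList<>();
            for (CompletableFuture<List<Split>> future : futures) {
                all.addAll(future.join()); // waits for this file's task to finish
            }
            return all;
        } finally {
            pool.shutdown(); // no new tasks; already-submitted tasks have completed
        }
    }
}
```

Calling createSplitsAsync with a LinkedHashMap of file paths to lengths returns the same splits a sequential loop would, in the same order. Any speedup depends on the per-file work being dominated by I/O latency, which is an assumption here and not something measured in this sketch.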



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
