[ 
https://issues.apache.org/jira/browse/FLINK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KevinyhZou updated FLINK-25335:
-------------------------------
    Attachment: image-2021-12-16-11-14-36-030.png

> Improvement of task deployment by enabling async source split enumeration
> -------------------------------------------------------------------------
>
>                 Key: FLINK-25335
>                 URL: https://issues.apache.org/jira/browse/FLINK-25335
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.1
>            Reporter: KevinyhZou
>            Priority: Major
>         Attachments: image-2021-12-16-11-14-36-030.png
>
>
> When an OLAP query is submitted through the Flink client to a Flink session
> cluster, the JobMaster starts scheduling and enumerates the Hive source
> splits with `HiveSourceFileEnumerator`, and only then deploys and executes
> the query tasks. If the source table has many partitions and the partition
> files are large, split enumeration takes a long time, which blocks task
> deployment and execution for that entire time, and the job does not appear
> in the dashboard.
>  
> The JobMaster should enumerate the Hive splits asynchronously while it
> deploys and executes the query tasks. Once deployment has finished, the
> source operators fetch splits and read data while split enumeration
> continues.
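
The proposed behavior can be sketched in plain Java. This is only an
illustration built on the JDK, not Flink's scheduler or the Hive connector:
the class `AsyncSplitEnumerationSketch`, the helper `listFiles`, and the
`END_OF_SPLITS` marker are all hypothetical names. It assumes that splits can
be handed to readers through a queue as soon as they are discovered, so
reading does not have to wait for the full enumeration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncSplitEnumerationSketch {
    // Hypothetical marker that tells the reader enumeration has ended.
    private static final String END_OF_SPLITS = "__END__";

    // Hypothetical stand-in for listing the files of one table partition.
    static List<String> listFiles(int partition, int filesPerPartition) {
        List<String> files = new ArrayList<>();
        for (int f = 0; f < filesPerPartition; f++) {
            files.add("partition-" + partition + "/file-" + f);
        }
        return files;
    }

    // Enumerates splits in the background and returns how many the reader consumed.
    static int run(int partitions, int filesPerPartition) throws Exception {
        BlockingQueue<String> splits = new LinkedBlockingQueue<>();
        ExecutorService enumeratorPool = Executors.newSingleThreadExecutor();
        try {
            // Enumeration runs on its own thread and publishes splits as it finds them.
            CompletableFuture<Void> enumeration = CompletableFuture.runAsync(() -> {
                try {
                    for (int p = 0; p < partitions; p++) {
                        splits.addAll(listFiles(p, filesPerPartition));
                    }
                } finally {
                    // Always signal the end so the reader cannot block forever.
                    splits.add(END_OF_SPLITS);
                }
            }, enumeratorPool);

            // The calling thread plays the role of a deployed source reader:
            // it starts pulling splits without waiting for enumeration to finish.
            int read = 0;
            while (true) {
                String split = splits.take();
                if (split.equals(END_OF_SPLITS)) {
                    break;
                }
                read++;
            }
            enumeration.get(); // rethrows if the background enumeration failed
            return read;
        } finally {
            enumeratorPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("splits read: " + run(4, 3));
    }
}
```

In this sketch the reader loop begins immediately after the enumeration task
is submitted, which mirrors the idea that deployment and reading should not
be blocked by a slow listing of partitions and files.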



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
