[
https://issues.apache.org/jira/browse/FLINK-25335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
KevinyhZou updated FLINK-25335:
-------------------------------
Attachment: image-2021-12-16-11-14-36-030.png
> Improvement of task deployment by enabling async source split enumeration
> -------------------------------------------------------------------------
>
> Key: FLINK-25335
> URL: https://issues.apache.org/jira/browse/FLINK-25335
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.12.1
> Reporter: KevinyhZou
> Priority: Major
> Attachments: image-2021-12-16-11-14-36-030.png
>
>
> When an OLAP query is submitted through the Flink client to a Flink session
> cluster, the JobMaster starts scheduling and enumerates the Hive source
> splits with `HiveSourceFileEnumerator`, and only then deploys the query tasks
> and executes them. If the source table has many partitions and the partition
> files are large, split enumeration takes a long time. This blocks task
> deployment and execution for a long time, and the dashboard does not show up
> in the meantime.
>
> The JobMaster should enumerate the Hive splits asynchronously and deploy and
> execute the query tasks in the meantime. Once deployment has finished, the
> source operators fetch splits and read data while split enumeration is still
> in progress.
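The proposal above could be sketched roughly as follows. This is a minimal, standalone illustration in plain Java and not Flink code: the class `AsyncSplitEnumerationSketch`, the methods `enumerate` and `run`, the `NO_MORE_SPLITS` sentinel and the `partition-N` split names are all hypothetical. It only shows the idea of starting enumeration in the background so that the caller (standing in for deployment and the source reader) is not blocked until every split is known.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncSplitEnumerationSketch {
    // Hypothetical sentinel that tells the reader enumeration has finished.
    private static final String NO_MORE_SPLITS = "<no-more-splits>";

    // Stand-in for the slow per-partition file listing (assumption, not the Hive enumerator).
    static void enumerate(int partitions, BlockingQueue<String> splits) {
        try {
            for (int p = 0; p < partitions; p++) {
                Thread.sleep(10); // simulate listing the files of one partition
                splits.add("partition-" + p); // hand the split over as soon as it is known
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            splits.add(NO_MORE_SPLITS); // always unblock the reader
        }
    }

    // Starts enumeration asynchronously, then consumes splits as they arrive.
    static List<String> run(int partitions) throws InterruptedException {
        BlockingQueue<String> splits = new LinkedBlockingQueue<>();

        // Enumeration runs on a background thread; this call returns immediately.
        CompletableFuture<Void> enumeration =
                CompletableFuture.runAsync(() -> enumerate(partitions, splits));

        // Task deployment would happen here, without waiting for enumeration.

        // The "source operator" fetches splits while enumeration is still going on.
        List<String> read = new ArrayList<>();
        String split;
        while (!(split = splits.take()).equals(NO_MORE_SPLITS)) {
            read.add(split);
        }

        enumeration.join(); // surface any failure from the background task
        return read;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3));
    }
}
```

The queue decouples the two sides: the reader blocks only while no split is available yet, rather than until the whole table has been enumerated.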
--
This message was sent by Atlassian Jira
(v8.20.1#820001)