[ 
https://issues.apache.org/jira/browse/FLINK-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huweihua updated FLINK-16069:
-----------------------------
    Description: 
Deploying tasks takes a long time when we submit a high-parallelism job. 
Execution#deploy runs in the main thread, so it blocks the JobMaster from 
processing other akka messages, such as heartbeats. Creating the 
TaskDeploymentDescriptor takes most of the time, so we can move the creation 
into a future.

For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks 
from SCHEDULED to DEPLOYING took more than 1 minute. This caused the 
TaskManager heartbeats to time out, and the job never succeeded.

  was:
The deploy of tasks will took long time when we submit a high parallelism job. 
And Execution#deploy run in mainThread, so it will block JobMaster process 
other akka messages, such as Heartbeat. The creation of 
TaskDeploymentDescriptor take most of time. We can put the creation in future.

For example, A job [source(8000)->sink(8000)], the total 16000 tasks from 
SCHEDULED to DEPLOYING took more than 1mins. This caused the heartbeat of 
TaskManager timeout and job never success.


> Create TaskDeploymentDescriptor in future.
> ------------------------------------------
>
>                 Key: FLINK-16069
>                 URL: https://issues.apache.org/jira/browse/FLINK-16069
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>            Reporter: huweihua
>            Priority: Major
>
> Deploying tasks takes a long time when we submit a high-parallelism job. 
> Execution#deploy runs in the main thread, so it blocks the JobMaster from 
> processing other akka messages, such as heartbeats. Creating the 
> TaskDeploymentDescriptor takes most of the time, so we can move the creation 
> into a future.
> For example, for a job [source(8000)->sink(8000)], moving all 16000 tasks 
> from SCHEDULED to DEPLOYING took more than 1 minute. This caused the 
> TaskManager heartbeats to time out, and the job never succeeded.
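
The proposal quoted above could look roughly like the following. This is only a 
minimal sketch under stated assumptions, not Flink's actual code: the names 
DeploySketch, Descriptor, createDescriptor, deployAll, ioExecutor and 
mainThread are all hypothetical stand-ins. The expensive descriptor creation 
runs on a separate pool, and only the cheap follow-up step is handed back to a 
single-threaded "main thread" executor, which therefore stays free for other 
messages such as heartbeats.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DeploySketch {

    // Hypothetical stand-in for TaskDeploymentDescriptor.
    static final class Descriptor {
        final int taskIndex;

        Descriptor(int taskIndex) {
            this.taskIndex = taskIndex;
        }
    }

    // Hypothetical stand-in for the expensive creation step.
    static Descriptor createDescriptor(int taskIndex) {
        return new Descriptor(taskIndex);
    }

    // Creates each descriptor on ioExecutor, then records the deployment on
    // mainThread. Blocks the caller (not mainThread) until all are done.
    static List<Integer> deployAll(
            int parallelism, ExecutorService ioExecutor, ExecutorService mainThread) {
        List<Integer> deployed = Collections.synchronizedList(new ArrayList<>());
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            final int idx = i;
            futures.add(
                    CompletableFuture
                            // Expensive part: off the main thread.
                            .supplyAsync(() -> createDescriptor(idx), ioExecutor)
                            // Cheap part: back on the main thread.
                            .thenAcceptAsync(d -> deployed.add(d.taskIndex), mainThread));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return deployed;
    }

    public static void main(String[] args) {
        ExecutorService io = Executors.newFixedThreadPool(4);
        ExecutorService main = Executors.newSingleThreadExecutor();
        try {
            List<Integer> deployed = deployAll(16, io, main);
            System.out.println("deployed " + deployed.size() + " tasks");
        } finally {
            io.shutdown();
            main.shutdown();
        }
    }
}
```

Because the creation step runs on a multi-threaded pool, tasks may reach the 
main-thread step in any order; the sketch only guarantees that every task is 
eventually handed over.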



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
