[ 
https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513974#comment-14513974
 ] 

ASF GitHub Bot commented on FLINK-1925:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/622#discussion_r29140453
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java ---
    @@ -345,26 +346,9 @@ public void onComplete(Throwable failure, Object success) throws Throwable {
                                                }
                                        }
                                        else {
    -                                           if (success == null) {
    -                                                   markFailed(new Exception("Failed to deploy the task to slot " + slot + ": TaskOperationResult was null"));
    -                                           }
    -
    -                                           if (success instanceof TaskOperationResult) {
    -                                                   TaskOperationResult result = (TaskOperationResult) success;
    -
    -                                                   if (!result.executionID().equals(attemptId)) {
    -                                                           markFailed(new Exception("Answer execution id does not match the request execution id."));
    -                                                   } else if (result.success()) {
    -                                                           switchToRunning();
    -                                                   } else {
    -                                                           // deployment failed :(
    -                                                           markFailed(new Exception("Failed to deploy the task " +
    -                                                                           getVertexWithAttempt() + " to slot " + slot + ": " + result
    -                                                                           .description()));
    -                                                   }
    -                                           } else {
    +                                           if (!(success instanceof Messages.Acknowledge$)) {
    --- End diff --
    
    I think this line is not parsable in Eclipse (the $ messes with the Java
    parser).
    A workaround is to expose the case object's class and instance via a
    utility method and check against that.
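
    The workaround could look roughly like the sketch below. It is plain Java
    with a stand-in `Messages` class modelling the Scala case object; the
    accessor names `getAcknowledge()` and `getAcknowledgeClass()` are
    hypothetical and not taken from the Flink code base. The point is only
    that the caller never has to write a `$` in Java source.

```java
public class AcknowledgeCheck {

    // Stand-in for the Scala `Messages` object (assumption for this sketch).
    // In the real code base the accessors would be written on the Scala side.
    static final class Messages {
        private Messages() {}

        // Models a Scala case object: exactly one instance exists.
        static final class Acknowledge {
            private static final Acknowledge INSTANCE = new Acknowledge();
            private Acknowledge() {}
            @Override public String toString() { return "Acknowledge"; }
        }

        // Hypothetical utility method: returns the singleton instance.
        static Object getAcknowledge() { return Acknowledge.INSTANCE; }

        // Hypothetical utility method: returns the singleton's class.
        static Class<?> getAcknowledgeClass() { return Acknowledge.class; }
    }

    // Replaces `success instanceof Messages.Acknowledge$` in the caller.
    static boolean isAcknowledge(Object success) {
        return Messages.getAcknowledge().equals(success);
    }

    public static void main(String[] args) {
        System.out.println(isAcknowledge(Messages.getAcknowledge()));
        System.out.println(isAcknowledge("TaskOperationResult"));
        System.out.println(isAcknowledge(null));
    }
}
```

    Comparing against the instance is enough for a singleton; the class
    accessor is only needed where an `isInstance` check is preferred.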


> Split SubmitTask method up into two phases: Receive TDD and instantiation of 
> TDD
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-1925
>                 URL: https://issues.apache.org/jira/browse/FLINK-1925
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>
> A user reported that a job times out while submitting tasks to the 
> TaskManager. The reason is that the JobManager expects a TaskOperationResult 
> response upon submitting a task to the TM. The TM then downloads the required 
> jars from the JM, which blocks the actor thread and can take a very long time 
> if many TMs download from the JM. As a result, the SubmitTask future throws a 
> TimeOutException.
> A possible solution would be for the TM to eagerly acknowledge receipt 
> of the SubmitTask message and execute the task initialization within a 
> future. Upon completion, the future sends an UpdateTaskExecutionState 
> message to the JM, which switches the state of the task from deploying to 
> running. This means that the handler of the SubmitTask future in {{Execution}} 
> won't change the state of the task.
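
The two phases described in the issue can be sketched as follows. This is an illustration only: it uses `java.util.concurrent.CompletableFuture` rather than the actor messages and futures the project actually uses, and `TwoPhaseSubmit`, `TaskState`, `submitTask`, and the `notifyJobManager` callback are hypothetical names standing in for the TaskManager handler and the UpdateTaskExecutionState message.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class TwoPhaseSubmit {

    // Simplified task states for this sketch (assumption, not Flink's enum).
    public enum TaskState { DEPLOYING, RUNNING, FAILED }

    // Phase 1: return the acknowledgement immediately.
    // Phase 2: run the (possibly slow) initialization on `executor` and
    // report the resulting state through `notifyJobManager` when it finishes.
    public static String submitTask(Runnable initialization,
                                    ExecutorService executor,
                                    Consumer<TaskState> notifyJobManager) {
        CompletableFuture
            .runAsync(initialization, executor)
            .handle((ignored, failure) ->
                failure == null ? TaskState.RUNNING : TaskState.FAILED)
            .thenAccept(notifyJobManager);
        return "Acknowledge";
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CompletableFuture<TaskState> reported = new CompletableFuture<>();

        // The empty lambda stands in for downloading jars and setting up the task.
        String ack = submitTask(() -> { }, executor, reported::complete);

        System.out.println(ack + " -> " + reported.join());
        executor.shutdown();
    }
}
```

Because the acknowledgement no longer waits for initialization, the caller's timeout covers only message delivery; the deploying-to-running transition arrives later through the callback.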



