[
https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513981#comment-14513981
]
ASF GitHub Bot commented on FLINK-1925:
---------------------------------------
Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/flink/pull/622#discussion_r29140659
--- Diff:
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
---
@@ -795,108 +795,111 @@ extends Actor with ActorLogMessages with
ActorLogging {
var startRegisteringTask = 0L
var task: Task = null
- // all operations are in a try / catch block to make sure we send a
result upon any failure
- try {
- // check that we are already registered
- if (!isConnected) {
- throw new IllegalStateException("TaskManager is not associated
with a JobManager")
- }
- if (slot < 0 || slot >= numberOfSlots) {
- throw new Exception(s"Target slot ${slot} does not exist on
TaskManager.")
- }
+ if (!isConnected) {
+ sender ! Failure(
+ new IllegalStateException("TaskManager is not associated with a
JobManager.")
+ )
+ } else if (slot < 0 || slot >= numberOfSlots) {
+ sender ! Failure(new Exception(s"Target slot $slot does not exist on
TaskManager."))
+ } else {
+ sender ! Acknowledge
- val userCodeClassLoader = libraryCacheManager match {
- case Some(manager) =>
- if (LOG.isDebugEnabled) {
- startRegisteringTask = System.currentTimeMillis()
- }
+ Future {
+ try {
--- End diff --
Can we pull the code in the future into its own method? Makes it easier to
understand by separating the different parts along the methods.
> Split SubmitTask method up into two phases: Receive TDD and instantiation of
> TDD
> --------------------------------------------------------------------------------
>
> Key: FLINK-1925
> URL: https://issues.apache.org/jira/browse/FLINK-1925
> Project: Flink
> Issue Type: Improvement
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
>
> A user reported that a job times out while submitting tasks to the
> TaskManager. The reason is that the JobManager expects a TaskOperationResult
> response upon submitting a task to the TM. The TM downloads then the required
> jars from the JM which blocks the actor thread and can take a very long time
> if many TMs download from the JM. Due to this, the SubmitTask future throws a
> TimeOutException.
> A possible solution could be that the TM eagerly acknowledges the reception
> of the SubmitTask message and executes the task initialization within a
> future. The future will upon completion send a UpdateTaskExecutionState
> message to the JM which switches the state of the task from deploying to
> running. This means that the handler of SubmitTask future in {{Execution}}
> won't change the state of the task.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)