Mehrdad Nurolahzade created AURORA-1869:
-------------------------------------------

             Summary: Investigate the status update processing overhead
                 Key: AURORA-1869
                 URL: https://issues.apache.org/jira/browse/AURORA-1869
             Project: Aurora
          Issue Type: Task
          Components: Scheduler
            Reporter: Mehrdad Nurolahzade
            Priority: Minor


The rate of task status update events received from Mesos closely tracks the 
rate of JVM threads started by the system 
([graphview|http://192.168.33.7:8081/graphview?query=rate(jvm_threads_started)%0Arate(scheduler_status_update_events)]).
It looks as though a new thread is started every time a status update event is 
processed.
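
One way to test this hypothesis locally is to compare the JVM's started-thread counter before and after a unit of work. The sketch below is only an illustration: {{ThreadStartProbe}}, {{threadsStartedDuring}} and the two stand-in handlers are hypothetical names, not Aurora code, and it assumes that the {{jvm_threads_started}} stat reflects the same counter that {{ThreadMXBean.getTotalStartedThreadCount()}} exposes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadStartProbe {
  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

  /** Returns how many JVM threads were started while {@code work} ran. */
  public static long threadsStartedDuring(Runnable work) {
    long before = THREADS.getTotalStartedThreadCount();
    work.run();
    return THREADS.getTotalStartedThreadCount() - before;
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for a handler that spawns one thread per event.
    Runnable spawning = () -> {
      Thread t = new Thread(() -> { });
      t.start();
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    // Hypothetical stand-in for a handler that does its work on the calling thread.
    Runnable inline = () -> { };

    System.out.println("spawning handler: " + threadsStartedDuring(spawning));
    System.out.println("inline handler:   " + threadsStartedDuring(inline));
  }
}
```

The counter is JVM-wide, so unrelated threads started concurrently are also included; the delta is an upper bound for the work being measured rather than an exact figure.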

{{TaskStatusHandlerImpl}} is a singleton service, so it should not be 
instantiating new threads. Looking at status update reasons/results, the majority 
of status updates are associated with {{RECONCILIATION}} and should result in 
{{NOOP}}. They should therefore have no impact on the internal workers: the 
task state machine should short-circuit and return as soon as it sees that the 
task's reported new state already matches the scheduler's view.

{code:title=TaskStateMachine.updateState()}
if (stateMachine.getState() == taskState) {
  return new TransitionResult(NOOP, ImmutableSet.of());
}
{code}
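
To make the expected behaviour concrete, here is a minimal, self-contained sketch of such a short-circuit. It is not the Aurora implementation: the class, the enum values and the {{SAVE_STATE}} side-effect label are hypothetical and exist only to show that an update which repeats the current state returns {{NOOP}} with no side effects.

```java
import java.util.Collections;
import java.util.Set;

public class ShortCircuitSketch {
  public enum TaskState { PENDING, RUNNING, FINISHED }

  public enum Result { NOOP, SUCCESS }

  /** Outcome of an update: the result plus any follow-up work it requires. */
  public static final class TransitionResult {
    public final Result result;
    public final Set<String> sideEffects;

    TransitionResult(Result result, Set<String> sideEffects) {
      this.result = result;
      this.sideEffects = sideEffects;
    }
  }

  public static final class StateMachine {
    private TaskState state;

    public StateMachine(TaskState initial) {
      this.state = initial;
    }

    public TransitionResult updateState(TaskState reported) {
      // Short-circuit: the reported state matches our view, so nothing to do.
      if (state == reported) {
        return new TransitionResult(Result.NOOP, Collections.emptySet());
      }
      state = reported;
      // Hypothetical side effect standing in for whatever a real transition triggers.
      return new TransitionResult(Result.SUCCESS, Collections.singleton("SAVE_STATE"));
    }
  }

  public static void main(String[] args) {
    StateMachine machine = new StateMachine(TaskState.RUNNING);
    TransitionResult repeat = machine.updateState(TaskState.RUNNING);
    System.out.println(repeat.result + " " + repeat.sideEffects);
    TransitionResult change = machine.updateState(TaskState.FINISHED);
    System.out.println(change.result + " " + change.sideEffects);
  }
}
```

If the real code path behaves like this, a reconciliation update that repeats the known state should never reach any worker, which is what makes the observed thread starts surprising.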

Given the volume of status update events received during reconciliation, this 
overhead should be avoided if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
