Bikas Saha created TEZ-2431:
-------------------------------

             Summary: Recovery of task events (e.g. datamovement events) should 
not depend on ordering of task attempt events
                 Key: TEZ-2431
                 URL: https://issues.apache.org/jira/browse/TEZ-2431
             Project: Apache Tez
          Issue Type: Sub-task
            Reporter: Bikas Saha


Today, task attempt events must pass through VertexImpl before reaching the
task in order to preserve the ordering guarantees that recovery relies on. As
a result, each of these events is routed through the dispatcher twice, which
adds overhead and can cause delays in large jobs. It also bakes in assumptions
about event ordering that make the system fragile. Recovery should work
independently of other system interactions, so that other components can
evolve without being constrained by recovery unless a change affects recovery
logically.
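
To make the double routing concrete, here is a minimal toy model of the two
paths. It is only a sketch: ToyDispatcher, hopsViaVertex and hopsDirect are
hypothetical names made up for illustration and are not Tez's dispatcher or
VertexImpl code. It assumes a single-threaded queue and only counts how many
times the dispatcher handles something.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only: these classes are hypothetical and are not Tez's real dispatcher API.
public class RoutingSketch {

    /** Minimal single-threaded dispatcher that counts how many events it handles. */
    static final class ToyDispatcher {
        private final Queue<Runnable> queue = new ArrayDeque<>();
        private int dispatched = 0;

        void enqueue(Runnable handler) {
            queue.add(handler);
        }

        /** Drains the queue, including events enqueued by handlers while draining. */
        int drain() {
            Runnable next;
            while ((next = queue.poll()) != null) {
                dispatched++;
                next.run();
            }
            return dispatched;
        }
    }

    /** Behaviour as described above: attempt event -> vertex -> task (two hops). */
    static int hopsViaVertex(int events) {
        ToyDispatcher d = new ToyDispatcher();
        for (int i = 0; i < events; i++) {
            // First hop: the vertex-level handler receives the event...
            d.enqueue(() -> {
                // ...and re-enqueues it for the task: second hop.
                d.enqueue(() -> { /* task consumes the event */ });
            });
        }
        return d.drain();
    }

    /** Direction suggested above: the event goes straight to the task (one hop). */
    static int hopsDirect(int events) {
        ToyDispatcher d = new ToyDispatcher();
        for (int i = 0; i < events; i++) {
            d.enqueue(() -> { /* task consumes the event */ });
        }
        return d.drain();
    }

    public static void main(String[] args) {
        System.out.println("via vertex: " + hopsViaVertex(1000));
        System.out.println("direct:     " + hopsDirect(1000));
    }
}
```

In this model the vertex path costs two dispatches per event and the direct
path costs one. The model says nothing about how recovery itself would have to
record events once the ordering guarantee is dropped.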



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
