Kuhu Shukla created TEZ-3917:
--------------------------------

             Summary: Speculative task attempt's DMEs can cause downstream 
fetcher to NPE or duplicate fetch
                 Key: TEZ-3917
                 URL: https://issues.apache.org/jira/browse/TEZ-3917
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.9.1
            Reporter: Kuhu Shukla
            Assignee: Kuhu Shukla


STA0 , STA1
     |
     |
DTA0 , DTA1

In the example above, DTA0 initially fetches from an upstream source task 
that has two attempts, STA0 and STA1, one of which is speculative (say STA1).

There is a race in which the DME (DataMovementEvent) from STA1 reaches DTA0 
and its output is fetched first, after which the fetch from STA0 (the 
successful attempt) is marked as a duplicate. This can happen because STA1 
sends its DME before the AM marks it as killed.

This additional event can also lead to an NPE: a fetcher thread is assigned 
the extra output to fetch while ShuffleScheduler believes it has already 
fetched all the map outputs, because it is not prepared to handle the extra 
events coming in from the speculative attempts.

There are cases where DTA0 NPEs and DTA1 shows duplicate fetches.
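
The race can be modelled with a small sketch. This is not Tez code and not 
the fix for this issue: the class SpeculativeDmeSketch, its methods, and the 
Decision enum are hypothetical names made up for illustration. It assumes 
each DME carries a source task index and an attempt number, and shows one 
possible guard: track inputs per source task rather than per event, so that 
a second attempt's DME is dropped before any fetcher is assigned to it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; not Tez's ShuffleScheduler.
final class SpeculativeDmeSketch {

    enum Decision { FETCH, IGNORE_DUPLICATE }

    private final int expectedSourceTasks;
    // Source task index -> attempt number whose DME was accepted first.
    private final Map<Integer, Integer> acceptedAttempt = new HashMap<>();

    SpeculativeDmeSketch(int expectedSourceTasks) {
        this.expectedSourceTasks = expectedSourceTasks;
    }

    // Called once per incoming DME. Only the first attempt seen for a given
    // source task is scheduled; DMEs from other attempts of the same task
    // are dropped up front, so no fetcher is handed an unexpected output.
    synchronized Decision onDme(int sourceTaskIndex, int attemptNumber) {
        Integer previous = acceptedAttempt.putIfAbsent(sourceTaskIndex, attemptNumber);
        return previous == null ? Decision.FETCH : Decision.IGNORE_DUPLICATE;
    }

    // Completion is judged by distinct source tasks, not by event count.
    synchronized boolean allInputsAccepted() {
        return acceptedAttempt.size() == expectedSourceTasks;
    }

    public static void main(String[] args) {
        // One upstream source task (index 0) with two attempts, as in the diagram.
        SpeculativeDmeSketch dta0 = new SpeculativeDmeSketch(1);
        System.out.println("STA1 DME: " + dta0.onDme(0, 1)); // speculative attempt arrives first
        System.out.println("STA0 DME: " + dta0.onDme(0, 0)); // successful attempt arrives second
        System.out.println("all inputs accepted: " + dta0.allInputsAccepted());
    }
}
```

The sketch accepts whichever attempt's DME arrives first, which mirrors the 
reported behaviour where STA0 is treated as the duplicate. It deliberately 
leaves out what a real scheduler would also need, such as switching to the 
other attempt when the accepted one is later killed or its fetch fails.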



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
