Gopal V created TEZ-3984:
----------------------------
Summary: Shuffle: Out of Band DME event sending causes errors
Key: TEZ-3984
URL: https://issues.apache.org/jira/browse/TEZ-3984
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.9.1, 0.8.4, 0.10.0
Reporter: Gopal V
If a task Input throws an exception, the outputs are also closed in
LogicalIOProcessorRuntimeTask.cleanup().
Cleanup ignores all the events returned by output close. However, if any output
sends an event out of band by calling outputContext.sendEvents(events)
directly, those events can reach the AM before the task failure is reported.
This can cause correctness issues with shuffle: zero-sized events can be sent
out because of the input failure, and downstream tasks may never reattempt a
fetch from the valid attempt.
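
The ordering problem can be illustrated with a small, self-contained toy model.
This is only a sketch: AmChannel, Output, runFailingTask and the event strings
are hypothetical names made up for the illustration, not Tez classes, and the
real cleanup path is more involved. It shows that an event returned from
close() is dropped by cleanup, while an event sent directly from inside close()
is recorded ahead of the failure report.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OutOfBandEventSketch {

    /** Hypothetical stand-in for the path to the AM; records arrival order. */
    static final class AmChannel {
        final List<String> received = new ArrayList<>();

        void send(String event) {
            received.add(event);
        }
    }

    /** Hypothetical output whose close() may send its event directly. */
    static final class Output {
        private final AmChannel channel;
        private final boolean sendOutOfBand;

        Output(AmChannel channel, boolean sendOutOfBand) {
            this.channel = channel;
            this.sendOutOfBand = sendOutOfBand;
        }

        List<String> close() {
            String emptyEvent = "DME(empty)";
            if (sendOutOfBand) {
                // Out of band: the event bypasses the caller entirely.
                channel.send(emptyEvent);
                return Collections.emptyList();
            }
            // In band: the caller decides what to do with the event.
            return Collections.singletonList(emptyEvent);
        }
    }

    /** Simulates a task whose input fails, followed by cleanup. */
    static List<String> runFailingTask(boolean sendOutOfBand) {
        AmChannel am = new AmChannel();
        Output output = new Output(am, sendOutOfBand);
        try {
            throw new IllegalStateException("input failed");
        } catch (IllegalStateException e) {
            output.close();          // cleanup: returned events are ignored
            am.send("TASK_FAILED");  // failure is reported only afterwards
        }
        return am.received;
    }

    public static void main(String[] args) {
        System.out.println(runFailingTask(false)); // [TASK_FAILED]
        System.out.println(runFailingTask(true));  // [DME(empty), TASK_FAILED]
    }
}
```

In the out-of-band case the empty event is already recorded when the failure
arrives, which is the situation described above.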