[ 
https://issues.apache.org/jira/browse/TEZ-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634175#comment-16634175
 ] 

Jason Lowe commented on TEZ-3996:
---------------------------------

bq. we launch an external process in the processor and for various reasons 
can't restart the external process with the new version of DMEs

This limitation sounds like it lowers the fault tolerance in the DAG.  If I 
understand correctly, any retroactive failure of an upstream task forces any 
active downstream task to fail because we cannot update the downstream task 
with a new DME when the upstream task rerun completes.  That means it could 
only take four upstream task attempt reruns, across four different upstream 
tasks, to fail a downstream vertex if the upstream re-runs were spread out 
sufficiently in time and a downstream attempt was relaunched in-between each 
upstream retroactive failure.  So instead of any one upstream task failing four 
times to fail the DAG, it becomes any four attempts _across_ the upstream tasks 
worst-case.

Dropping the DME event seems like the right approach, although I worry a bit 
that this may be an expensive thing to do on the AM side with a large number of 
upstream and downstream tasks.  We may need to refactor how those are tracked 
AM-side.  Another approach which isn't as clean but might scale better is to 
have the AM send over an event when the task attempt is "up to date" with 
events -- in other words, the pending event queue is drained on the AM side and 
it could be a while before more events are sent to the task attempt.  Then a 
downstream task can load up all the events, filtering DMEs that have been 
invalidated by later IFEs, until it receives the special, "up to date" event 
which indicates it's OK to start the processing of any valid DMEs received so 
far.


> Reorder input failed events before data movement events
> -------------------------------------------------------
>
>                 Key: TEZ-3996
>                 URL: https://issues.apache.org/jira/browse/TEZ-3996
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Hitesh Sharma
>            Priority: Minor
>
> We have a custom processor (AbstractLogicalIOProcessor) that waits for 
> DataMovementEvent to arrive and then starts an external process to do some 
> work. When a revocation happens then the processor recieves an 
> InputFailedEvent, which tells it about the failed input, and we fail the 
> processor as it is working on old inputs. When the new inputs are available 
> then Tez restarts the processor and sends the InputFailedEvent along with all 
> the DataMovementEvent which includes the older versions and the new version 
> that was revocated.
> The issue we are seeing is that the events arrive out of order i.e. many 
> times we see the older DataMovementEvent first at which our processor thinks 
> it is good to start. We then receive the InputFailedEvent and the new version 
> of DataMovementEvent, but that's late and the processor fails. This keeps 
> repeating on every subsequent task attempt and the task fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to