[
https://issues.apache.org/jira/browse/TEZ-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651604#comment-14651604
]
Bikas Saha commented on TEZ-2647:
---------------------------------
The patch adds support for tracking the last data movement event dependency for
a task. For now, the tracking is done within the AM, based on the last event the
AM sent to an attempt. This works for now, even with speculation, because
events are sent from the producer task outputs upon task completion. With
pipelining, however, this may break: events from two producer attempts might
reach the consumer task, and only the consumer input can tell which one was the
last that unblocked it. Tests added.
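
As a rough illustration of the tracking described above (not the code in the
patch), the sketch below keeps one "last producer" entry per consumer attempt
on the AM side and lets a report from the input take precedence when one
exists. The class name, method names, and attempt id strings are all
hypothetical and are not Tez APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names come from Tez or from the patch.
final class LastEventTracker {
    // consumer attempt id -> producer attempt whose event the AM sent last
    private final Map<String, String> lastSentByAm = new HashMap<>();
    // consumer attempt id -> producer attempt the input itself reported
    private final Map<String, String> reportedByInput = new HashMap<>();

    /** Record that the AM routed a data movement event to a consumer attempt. */
    void onEventSent(String consumerAttemptId, String producerAttemptId) {
        // A later send overwrites the earlier one, so only one entry is kept.
        lastSentByAm.put(consumerAttemptId, producerAttemptId);
    }

    /** Record the producer attempt that the consumer input says unblocked it. */
    void onInputReport(String consumerAttemptId, String producerAttemptId) {
        reportedByInput.put(consumerAttemptId, producerAttemptId);
    }

    /** Prefer the input's report; otherwise fall back to the AM-side view. */
    Optional<String> lastDependency(String consumerAttemptId) {
        String reported = reportedByInput.get(consumerAttemptId);
        if (reported != null) {
            return Optional.of(reported);
        }
        return Optional.ofNullable(lastSentByAm.get(consumerAttemptId));
    }
}
```

With only onEventSent calls, the answer is whichever producer attempt the AM
sent last. In the pipelining case, where events from two producer attempts
reach the same consumer, a call to onInputReport lets the input override that
guess.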
[~rajesh.balamohan] Please review.
> Add input causality dependency for attempts
> -------------------------------------------
>
> Key: TEZ-2647
> URL: https://issues.apache.org/jira/browse/TEZ-2647
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-2647.1.patch
>
>
> Attempts can have input dependencies on the producer task attempts that
> produced the data being consumed by the attempt.
> DataMovement events capture this dependency. In the interest of space, we
> need to be able to capture the dependency that matters - the one that
> provided the last data for the input to complete.
> For starters, we could:
> 1) have the system track the last data movement event that was sent to an
> attempt
> 2) then have the inputs be able to report the last relevant data movement
> event