[ https://issues.apache.org/jira/browse/TEZ-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuhu Shukla updated TEZ-3803:
-----------------------------
    Attachment: TEZ-3803.004.patch

Revised patch that, for simplicity, waits with a fixed (non-configurable) 
timeout. We could use a different value, or move the timeout into a variable in 
ShuffleUtils, if this approach seems OK. I also changed the test run time. 
Essentially, this patch turns the if block into a while block. I will wait for 
the precommit results before requesting further review.
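
For illustration only, the if-to-while change could look roughly like the sketch 
below. This is not the code from the patch: the class name, the 
ProgressNotifier interface, and the timeoutMillis parameter are all 
hypothetical stand-ins, and the real Tez classes are not used. It only shows 
the general idea of waking up on a timeout and reporting progress on each 
wakeup instead of blocking once.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ProgressWaitSketch {
    /** Hypothetical stand-in for whatever marks the task as still alive. */
    public interface ProgressNotifier {
        void notifyProgress();
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition inputsChanged = lock.newCondition();
    private int remainingInputs;

    public ProgressWaitSketch(int remainingInputs) {
        this.remainingInputs = remainingInputs;
    }

    /** Called (in this sketch) whenever one shuffle input finishes. */
    public void inputCompleted() {
        lock.lock();
        try {
            remainingInputs--;
            inputsChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until all inputs are done. A single "if" plus one wait would
     * stay silent for the whole wait; the "while" wakes up at least every
     * timeoutMillis and reports progress each time.
     * Returns the number of wakeups, purely so the behavior can be observed.
     */
    public int waitForInputs(ProgressNotifier notifier, long timeoutMillis)
            throws InterruptedException {
        int wakeups = 0;
        lock.lock();
        try {
            while (remainingInputs > 0) {
                // Returns on signal, on timeout, or spuriously; the loop
                // condition is re-checked in every case.
                inputsChanged.await(timeoutMillis, TimeUnit.MILLISECONDS);
                notifier.notifyProgress();
                wakeups++;
            }
        } finally {
            lock.unlock();
        }
        return wakeups;
    }
}
```

In a sketch like this, the timeout only needs to be comfortably shorter than 
the interval after which a task with no reported progress gets killed.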

> Tasks can get killed due to insufficient progress while waiting for shuffle 
> inputs to complete
> ----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3803
>                 URL: https://issues.apache.org/jira/browse/TEZ-3803
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Critical
>         Attachments: TEZ-3803.001.patch, TEZ-3803.002.patch, 
> TEZ-3803.003.patch, TEZ-3803.004.patch
>
>
> In a scenario where a downstream task has no slow start and gets started 
> before all its shuffle inputs are done, the task can time out because the 
> wait does not notify progress (set the "progress is being made" bit) as it 
> does in MapReduce.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
