[
https://issues.apache.org/jira/browse/TEZ-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kuhu Shukla updated TEZ-3803:
-----------------------------
Attachment: TEZ-3803.003.patch
Revamped the patch per Jason's comments. I have not added a new config; instead,
the wait interval is derived from the progress timeout config. The wait interval
is always 1/3 of the progress interval, and the thread waits only if the value
is > 0.
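
For readers following along, the idea can be sketched roughly as below. This is not the code from the attached patch: the class, the method names, and the ProgressNotifier callback are hypothetical, and falling back to an untimed wait when the derived interval is not > 0 is an assumption about what "waits only if the value is > 0" means.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only; all names here are hypothetical, not Tez APIs.
public class ShuffleWaitSketch {

    /** Hypothetical stand-in for whatever sets the "progress is being made" bit. */
    public interface ProgressNotifier {
        void notifyProgress();
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition inputsDone = lock.newCondition();
    private boolean done = false;

    /** Derives the wait interval as 1/3 of the progress timeout (no new config). */
    public static long waitIntervalMillis(long progressTimeoutMillis) {
        return progressTimeoutMillis / 3;
    }

    /** Called when all shuffle inputs have completed. */
    public void markDone() {
        lock.lock();
        try {
            done = true;
            inputsDone.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until inputs are done. With a positive interval, wakes up
     * periodically and reports progress so the task is not killed while waiting.
     * Returns false if the waiting thread was interrupted.
     */
    public boolean waitForInputs(long progressTimeoutMillis, ProgressNotifier notifier) {
        long interval = waitIntervalMillis(progressTimeoutMillis);
        lock.lock();
        try {
            while (!done) {
                if (interval > 0) {
                    // Timed wait: wake up at least once per interval and notify.
                    inputsDone.await(interval, TimeUnit.MILLISECONDS);
                    notifier.notifyProgress();
                } else {
                    // Assumption: no usable timeout configured, so wait untimed.
                    inputsDone.await();
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With a progress timeout of 300000 ms, for example, this sketch would wake up every 100000 ms, so progress is reported about three times per timeout window.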
> Tasks can get killed due to insufficient progress while waiting for shuffle
> inputs to complete
> ----------------------------------------------------------------------------------------------
>
> Key: TEZ-3803
> URL: https://issues.apache.org/jira/browse/TEZ-3803
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Priority: Critical
> Attachments: TEZ-3803.001.patch, TEZ-3803.002.patch,
> TEZ-3803.003.patch
>
>
> In a scenario where a downstream task has no slow start and gets started
> before all its shuffle inputs are done, the task can time out because the wait
> does not notify progress (set the "progress is being made" bit) as it does in
> MapReduce.