[ https://issues.apache.org/jira/browse/TEZ-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated TEZ-4139:
------------------------------
Description:
When many downstream task attempts fail to pull data from a source task,
the source task is marked as failed and retried. Currently, the failure
fraction is computed from the unique downstream task attempts that reported
a failure. It should also take node information into account when computing
"failureFraction".
https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java#L1845-L1849
was: When many downstream task attempts fail to pull data from a source
task, the source task is marked as failed and retried. Currently, the failure
fraction is computed from the unique downstream task attempts that reported
a failure. It should also take node information into account when computing
"failureFraction".
> Tez should consider node information for computing failure fraction
> -------------------------------------------------------------------
>
> Key: TEZ-4139
> URL: https://issues.apache.org/jira/browse/TEZ-4139
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: László Bodor
> Priority: Major
>
> When many downstream task attempts fail to pull data from a source task,
> the source task is marked as failed and retried. Currently, the failure
> fraction is computed from the unique downstream task attempts that
> reported a failure. It should also take node information into account when
> computing "failureFraction".
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java#L1845-L1849
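
The difference between counting unique downstream attempts and counting nodes can be illustrated with a small sketch. This is not the code in TaskAttemptImpl.java and not a proposed patch: the class, the FailureReport type, both method names, and the choice of denominators are hypothetical, picked only to show how the two fractions can diverge for the same set of failure reports.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Tez.
public class FailureFractionSketch {

    /** One fetch-failure report sent by a downstream task attempt (assumed shape). */
    public static final class FailureReport {
        final String attemptId; // downstream attempt that could not pull the data
        final String nodeId;    // node that attempt was running on

        public FailureReport(String attemptId, String nodeId) {
            this.attemptId = attemptId;
            this.nodeId = nodeId;
        }
    }

    /** Attempt-based fraction: unique reporting attempts / running downstream tasks. */
    public static float attemptFraction(List<FailureReport> reports, int runningTasks) {
        if (runningTasks <= 0) {
            return 0f; // nothing running, so no meaningful fraction
        }
        Set<String> attempts = new HashSet<>();
        for (FailureReport r : reports) {
            attempts.add(r.attemptId);
        }
        return (float) attempts.size() / runningTasks;
    }

    /** Node-based fraction: unique reporting nodes / nodes hosting downstream tasks. */
    public static float nodeFraction(List<FailureReport> reports, int downstreamNodes) {
        if (downstreamNodes <= 0) {
            return 0f;
        }
        Set<String> nodes = new HashSet<>();
        for (FailureReport r : reports) {
            nodes.add(r.nodeId);
        }
        return (float) nodes.size() / downstreamNodes;
    }
}
```

With four reports from four different attempts that all ran on the same node, ten running downstream tasks, and five downstream nodes, the attempt-based fraction is 4/10 while the node-based fraction is 1/5. How the real implementation should combine or weight the two is left open by the issue description.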