[ https://issues.apache.org/jira/browse/TEZ-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642016#comment-16642016 ]
Kuhu Shukla commented on TEZ-3990:
----------------------------------

Addressed comments by [~jeagles]. Agreed on the issues mentioned with delay calculation and testability. [~jeagles], should I go ahead and create JIRAs for these issues?

> The number of shuffle penalties for a host/inputAttemptIdentifier should be capped
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-3990
>                 URL: https://issues.apache.org/jira/browse/TEZ-3990
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.9.1, 0.10.0
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: TEZ-3990.001.patch, TEZ-3990.002.patch, TEZ-3990.003.patch, TEZ-3990.004.patch
>
>
> In a scenario where the same mapId fetches fail, the penalty code allows
> adding the same Host/InputAttemptIdentifier over and over with revised
> penalty time that grows exponentially. It should at some point drop the
> retrying and report failure to the AM asap to allow the job to rectify the
> upstream output.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
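
For readers following the issue, the capping behaviour the description asks for can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the class name, the `recordFailure` method, the base delay, and the cap of four penalties are all assumptions made up for the example. The idea is that each repeated failure for the same host/input pair doubles the penalty, and once the assumed cap is exceeded the caller stops retrying and reports the failure upstream.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a capped exponential penalty; not taken from the TEZ-3990 patches. */
final class CappedPenaltySketch {
    // Assumed values, chosen only for illustration.
    static final long BASE_DELAY_MS = 100L;
    static final int MAX_PENALTIES = 4;

    // Number of penalties applied so far, keyed by a host/input identifier string.
    private final Map<String, Integer> penaltyCounts = new HashMap<>();

    /**
     * Records one fetch failure for the given key.
     * Returns the next penalty delay in milliseconds, or -1 once the cap is
     * exceeded, meaning the caller should stop retrying and report failure.
     */
    long recordFailure(String hostAndInput) {
        int count = penaltyCounts.merge(hostAndInput, 1, Integer::sum);
        if (count > MAX_PENALTIES) {
            // Forget the key so a later, unrelated failure starts from scratch.
            penaltyCounts.remove(hostAndInput);
            return -1L;
        }
        // Delay doubles with every penalty: base, 2*base, 4*base, ...
        return BASE_DELAY_MS << (count - 1);
    }
}
```

With these assumed constants, four consecutive failures for one key yield delays of 100, 200, 400 and 800 ms, and the fifth returns -1 so the failure can be reported instead of being penalized again.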