Jonathan Eagles created TEZ-3426:
------------------------------------
Summary: Second AM attempt launched for session mode and recovery
disabled for certain cases
Key: TEZ-3426
URL: https://issues.apache.org/jira/browse/TEZ-3426
Project: Apache Tez
Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jason Lowe
Priority: Critical
ApplicationSubmissionContext#setMaxAppAttempts does not fully guarantee that
there will be at most that many attempts. A few exceptional cases are not
counted toward the limit. Tez should protect itself from accidentally starting
a second attempt in session mode and when recovery is disabled, since the
second attempt will always succeed with no work to do.
{code}
@Override
public boolean shouldCountTowardsMaxAttemptRetry() {
  try {
    this.readLock.lock();
    int exitStatus = getAMContainerExitStatus();
    return !(exitStatus == ContainerExitStatus.PREEMPTED
        || exitStatus == ContainerExitStatus.ABORTED
        || exitStatus == ContainerExitStatus.DISKS_FAILED
        || exitStatus == ContainerExitStatus.KILLED_BY_RESOURCEMANAGER);
  } finally {
    this.readLock.unlock();
  }
}
{code}
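A minimal sketch of the kind of self-protection described above, not the actual Tez change. The class name AttemptGuard, the method shouldRejectAttempt, and its parameters are hypothetical and are not Tez or YARN APIs. It assumes the guard applies only when both conditions hold (session mode and recovery disabled); the description does not spell out whether either condition alone should trigger it.

```java
// Hypothetical sketch: AttemptGuard and its parameters are illustrative
// names, not part of Tez or YARN.
final class AttemptGuard {
    private AttemptGuard() {}

    /**
     * Returns true when the AM should refuse to run this attempt.
     * Assumption: attempt numbers start at 1, so anything above 1 is a retry
     * that slipped past the max-attempts limit because the previous exit
     * status was not counted.
     */
    static boolean shouldRejectAttempt(int attemptNumber,
                                       boolean sessionMode,
                                       boolean recoveryEnabled) {
        boolean isRetry = attemptNumber > 1;
        // With recovery disabled, a retried session AM has no work to recover.
        return isRetry && sessionMode && !recoveryEnabled;
    }
}
```

Under these assumptions, an AM could call such a check early in startup and fail fast rather than report a successful attempt that did nothing.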
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)