Jonathan Eagles created TEZ-3426:
------------------------------------

             Summary: Second AM attempt launched for session mode and recovery 
disabled for certain cases
                 Key: TEZ-3426
                 URL: https://issues.apache.org/jira/browse/TEZ-3426
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jonathan Eagles
            Assignee: Jason Lowe
            Priority: Critical


ApplicationSubmissionContext#setMaxAppAttempts does not fully guarantee that 
there will be at most that many attempts. A few exceptional cases are not 
counted toward the limit. Tez should protect itself from accidentally starting 
a second attempt in session mode when recovery is disabled, since the second 
attempt will always succeed with no work to do.

{code}
  @Override
  public boolean shouldCountTowardsMaxAttemptRetry() {
    try {
      this.readLock.lock();
      int exitStatus = getAMContainerExitStatus();
      return !(exitStatus == ContainerExitStatus.PREEMPTED
          || exitStatus == ContainerExitStatus.ABORTED
          || exitStatus == ContainerExitStatus.DISKS_FAILED
          || exitStatus == ContainerExitStatus.KILLED_BY_RESOURCEMANAGER);
    } finally {
      this.readLock.unlock();
    }
  }
{code}
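
To illustrate the kind of protection described above, here is a minimal, 
self-contained sketch. It is not Tez or YARN code: the class 
SecondAttemptGuard, the ExitStatus enum, and both methods are hypothetical 
names introduced for illustration only. It assumes the AM can tell its own 
attempt number, whether it runs in session mode, and whether recovery is 
enabled.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names are real Tez or YARN APIs.
public class SecondAttemptGuard {

    // Simplified stand-in for the exit statuses in the check quoted above.
    enum ExitStatus { PREEMPTED, ABORTED, DISKS_FAILED, KILLED_BY_RESOURCEMANAGER, OTHER }

    // Statuses that the quoted check excludes from the max-attempt count.
    private static final Set<ExitStatus> NOT_COUNTED = EnumSet.of(
            ExitStatus.PREEMPTED,
            ExitStatus.ABORTED,
            ExitStatus.DISKS_FAILED,
            ExitStatus.KILLED_BY_RESOURCEMANAGER);

    // Mirrors the boolean logic of shouldCountTowardsMaxAttemptRetry().
    static boolean countsTowardsMaxAttempts(ExitStatus status) {
        return !NOT_COUNTED.contains(status);
    }

    // Assumed guard: an AM that is not the first attempt, runs in session
    // mode, and has recovery disabled has no work to resume, so it should
    // refuse to run instead of idling and then reporting success.
    static boolean shouldRefuseToRun(int attemptNumber, boolean sessionMode,
                                     boolean recoveryEnabled) {
        return attemptNumber > 1 && sessionMode && !recoveryEnabled;
    }

    public static void main(String[] args) {
        int maxAppAttempts = 1;
        int countedFailures = 0;

        // The first attempt is preempted; that exit is not counted.
        if (countsTowardsMaxAttempts(ExitStatus.PREEMPTED)) {
            countedFailures++;
        }

        // With zero counted failures, a second attempt is still allowed.
        boolean secondAttemptLaunched = countedFailures < maxAppAttempts;
        System.out.println("second attempt launched: " + secondAttemptLaunched);
        System.out.println("second attempt should refuse to run: "
                + shouldRefuseToRun(2, true, false));
    }
}
```

The guard only looks at the attempt number and two flags, so it does not 
depend on which exit status caused the uncounted retry.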



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
