Anubhav Dhoot created YARN-3229:
-----------------------------------

Summary: Incorrect processing of container as LOST on Interruption during NM shutdown
Key: YARN-3229
URL: https://issues.apache.org/jira/browse/YARN-3229
Project: Hadoop YARN
Issue Type: Bug
Reporter: Anubhav Dhoot

YARN-2846 fixed the issue of incorrectly writing to the state store that the process is LOST. But even after that fix, we still process the ContainerExitEvent. If notInterrupted is false in RecoveredContainerLaunch#call, we should skip the following:

{noformat}
if (retCode != 0) {
  LOG.warn("Recovered container exited with a non-zero exit code "
      + retCode);
  this.dispatcher.getEventHandler().handle(new ContainerExitEvent(
      containerId,
      ContainerEventType.CONTAINER_EXITED_WITH_FAILURE, retCode,
      "Container exited with a non-zero exit code " + retCode));
  return retCode;
}
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
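
For illustration only, here is a minimal, self-contained sketch of the guard proposed in the description. It is not the actual NodeManager code: the class RecoveredLaunchSketch, the RecordingDispatcher stand-in, and the method finishRecoveredContainer are hypothetical names; only notInterrupted, retCode, and the CONTAINER_EXITED_WITH_FAILURE event type come from the snippet in the description. Returning retCode unchanged on interruption is also an assumption, and the success path is omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the tail of RecoveredContainerLaunch#call. */
final class RecoveredLaunchSketch {

    /** Hypothetical dispatcher that records events instead of delivering them. */
    static final class RecordingDispatcher {
        final List<String> events = new ArrayList<>();

        void handle(String event) {
            events.add(event);
        }
    }

    /**
     * Reports a non-zero exit code only when the wait for the recovered
     * container was NOT interrupted (for example by NM shutdown).
     */
    static int finishRecoveredContainer(boolean notInterrupted, int retCode,
                                        RecordingDispatcher dispatcher) {
        if (!notInterrupted) {
            // Assumption: on interruption the exit code is not trustworthy,
            // so no container exit event is dispatched at all.
            return retCode;
        }
        if (retCode != 0) {
            // Mirrors the block quoted in the description.
            dispatcher.handle("CONTAINER_EXITED_WITH_FAILURE:" + retCode);
            return retCode;
        }
        // Success path intentionally left out of this sketch.
        return retCode;
    }

    public static void main(String[] args) {
        RecordingDispatcher interrupted = new RecordingDispatcher();
        finishRecoveredContainer(false, 1, interrupted);
        System.out.println("events when interrupted: " + interrupted.events.size());

        RecordingDispatcher normal = new RecordingDispatcher();
        finishRecoveredContainer(true, 1, normal);
        System.out.println("events when not interrupted: " + normal.events.size());
    }
}
```

With this guard, an interrupted wait dispatches no event, while an uninterrupted wait with a non-zero exit code still reports the failure exactly once.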