[jira] [Assigned] (YARN-3229) Incorrect processing of container as LOST on Interruption during NM shutdown
     [ https://issues.apache.org/jira/browse/YARN-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anubhav Dhoot reassigned YARN-3229:
-----------------------------------

    Assignee:     (was: Anubhav Dhoot)

> Incorrect processing of container as LOST on Interruption during NM shutdown
>
>                 Key: YARN-3229
>                 URL: https://issues.apache.org/jira/browse/YARN-3229
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Anubhav Dhoot
>
> YARN-2846 fixed the issue of writing to the state store incorrectly that the
> process is LOST. But even after that we still process the ContainerExitEvent.
> If notInterrupted is false in RecoveredContainerLaunch#call we should skip
> the following
> {noformat}
>     if (retCode != 0) {
>       LOG.warn("Recovered container exited with a non-zero exit code "
>           + retCode);
>       this.dispatcher.getEventHandler().handle(new ContainerExitEvent(
>           containerId,
>           ContainerEventType.CONTAINER_EXITED_WITH_FAILURE, retCode,
>           "Container exited with a non-zero exit code " + retCode));
>       return retCode;
>     }
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-3229) Incorrect processing of container as LOST on Interruption during NM shutdown
     [ https://issues.apache.org/jira/browse/YARN-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anubhav Dhoot reassigned YARN-3229:
-----------------------------------

    Assignee: Anubhav Dhoot

> Incorrect processing of container as LOST on Interruption during NM shutdown
>
>                 Key: YARN-3229
>                 URL: https://issues.apache.org/jira/browse/YARN-3229
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>
> YARN-2846 fixed the issue of writing to the state store incorrectly that the
> process is LOST. But even after that we still process the ContainerExitEvent.
> If notInterrupted is false in RecoveredContainerLaunch#call we should skip
> the following
> {noformat}
>     if (retCode != 0) {
>       LOG.warn("Recovered container exited with a non-zero exit code "
>           + retCode);
>       this.dispatcher.getEventHandler().handle(new ContainerExitEvent(
>           containerId,
>           ContainerEventType.CONTAINER_EXITED_WITH_FAILURE, retCode,
>           "Container exited with a non-zero exit code " + retCode));
>       return retCode;
>     }
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
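
For illustration only, the guard the description asks for could look roughly like the sketch below. It is not the committed patch and does not use the real NodeManager classes: RecoveredLaunchSketch, finishRecoveredLaunch and the List<String> that stands in for the dispatcher's event handler are all hypothetical names, and only the failure branch quoted in the issue is modelled. The idea is simply that when notInterrupted is false the exit code is not trustworthy, so no exit event is raised.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained stand-in for the tail of
// RecoveredContainerLaunch#call; not the actual Hadoop code.
final class RecoveredLaunchSketch {

    // 'events' is an assumed substitute for dispatcher.getEventHandler().
    static int finishRecoveredLaunch(int retCode, boolean notInterrupted,
                                     List<String> events) {
        if (!notInterrupted) {
            // Interrupted (e.g. during NM shutdown): the container's real
            // exit status is unknown, so skip the exit-event handling.
            return retCode;
        }
        if (retCode != 0) {
            // Same branch as in the issue description, with the event
            // reduced to a string for the purpose of this sketch.
            events.add("CONTAINER_EXITED_WITH_FAILURE:" + retCode);
            return retCode;
        }
        return retCode;
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        finishRecoveredLaunch(1, false, events); // interrupted: nothing raised
        System.out.println("after interrupted wait: " + events.size());
        finishRecoveredLaunch(1, true, events);  // genuine failure: one event
        System.out.println("after real failure: " + events.size());
    }
}
```

With this shape, an interrupted wait returns without touching the event handler, while a genuine non-zero exit still produces the failure event as before.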