[jira] [Updated] (YARN-2630) TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
[ https://issues.apache.org/jira/browse/YARN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2630:
--------------------------
    Attachment: YARN-2630.3.patch

Uploaded a patch that renames NodeHeartbeatResponse#getFinishedContainersPulledByAM to getContainersToBeRemovedFromNM. If we later add another channel for removing containers from the NM (not just containers pulled by the AM), the latter name is more semantically correct.

TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
---------------------------------------------------------------------

    Key: YARN-2630
    URL: https://issues.apache.org/jira/browse/YARN-2630
    Project: Hadoop YARN
    Issue Type: Bug
    Reporter: Jian He
    Assignee: Jian He
    Attachments: YARN-2630.1.patch, YARN-2630.2.patch, YARN-2630.3.patch

The problem is that after YARN-1372, in work-preserving AM restart, the re-launched AM will also receive the previously failed AM container. But the DistributedShell logic does not expect this extra completed container.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2630) TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
[ https://issues.apache.org/jira/browse/YARN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2630:
--------------------------
    Description: The problem is that after YARN-1372, in work-preserving AM restart, the re-launched AM will also receive previously failed AM container. But DistributedShell logic is not expecting this extra completed container.

    (was: The problem is that after YARN-1372, the re-launched AM will also receive previously failed AM container. And DistributedShell logic is not expecting this extra completed container.)

TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
---------------------------------------------------------------------

    Key: YARN-2630
    URL: https://issues.apache.org/jira/browse/YARN-2630
    Project: Hadoop YARN
    Issue Type: Bug
    Reporter: Jian He
    Assignee: Jian He
[jira] [Updated] (YARN-2630) TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
[ https://issues.apache.org/jira/browse/YARN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2630:
--------------------------
    Attachment: YARN-2630.1.patch

Uploaded a patch to make RMAppAttempt not return the AM container.

TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
---------------------------------------------------------------------

    Key: YARN-2630
    URL: https://issues.apache.org/jira/browse/YARN-2630
    Project: Hadoop YARN
    Issue Type: Bug
    Reporter: Jian He
    Assignee: Jian He
    Attachments: YARN-2630.1.patch
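To illustrate the idea behind this patch, here is a minimal sketch, not the patch itself. It assumes container IDs are plain strings and uses a hypothetical helper, withoutPreviousAmContainer, that drops the previous attempt's AM container from the completed-container list before the list reaches the re-launched AM. None of these names come from YARN; the real change lives in RMAppAttempt and may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; none of these names are YARN APIs. */
class CompletedContainerFilterSketch {

    /**
     * Returns the completed containers to report to a re-launched AM,
     * leaving out the container that hosted the previous (failed) AM attempt.
     */
    static List<String> withoutPreviousAmContainer(List<String> completed,
                                                   String previousAmContainerId) {
        List<String> result = new ArrayList<>();
        for (String id : completed) {
            if (!id.equals(previousAmContainerId)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up IDs: the first one stands in for the failed AM container.
        List<String> completed = Arrays.asList("am_container", "task_container_1", "task_container_2");
        List<String> reported = withoutPreviousAmContainer(completed, "am_container");
        System.out.println(reported); // prints [task_container_1, task_container_2]
    }
}
```

With a filter like this, an application such as DistributedShell that counts completed containers would see only the task containers it launched, which is the count the test expects.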
[jira] [Updated] (YARN-2630) TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
[ https://issues.apache.org/jira/browse/YARN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-2630:
--------------------------
    Attachment: YARN-2630.2.patch

Fixed test failures.

TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
---------------------------------------------------------------------

    Key: YARN-2630
    URL: https://issues.apache.org/jira/browse/YARN-2630
    Project: Hadoop YARN
    Issue Type: Bug
    Reporter: Jian He
    Assignee: Jian He
    Attachments: YARN-2630.1.patch, YARN-2630.2.patch