[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127294#comment-16127294 ]

Yuqi Wang commented on YARN-2402:
---------------------------------

Thanks, I will do it later.

> NM restart: Container recovery for Windows
> ------------------------------------------
>
>                 Key: YARN-2402
>                 URL: https://issues.apache.org/jira/browse/YARN-2402
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Yuqi Wang
>         Attachments: YARN-2402-v1.patch, YARN-2402-v2.patch
>
>
> We should add container recovery for NM restart on Windows.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127292#comment-16127292 ]

Jason Lowe commented on YARN-2402:
----------------------------------

So it does sound like we need this patch before declaring container recovery on Windows completely working, correct?

Unfortunately I cannot get Jenkins to comment on this since the parent JIRA has been Closed. We can file a separate JIRA for this so we can get a proper Jenkins run on the patch. We can then mark this as a duplicate of the new one.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126783#comment-16126783 ]

Yuqi Wang commented on YARN-2402:
---------------------------------

[~jlowe]
When the NM is down and a container process exits with exit code 666, this patch records 666 in the ExitCodeFile. After the NM recovers, it detects that the container process has exited and then reads the ExitCodeFile to get the exit code.
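The exit-code recovery described in the comment above can be illustrated with a minimal sketch. It is not taken from YARN-2402-v2.patch or from the NodeManager source: the class name `ExitCodeFileSketch`, the method `readExitCode`, and the assumption that the file holds a single decimal integer (possibly followed by a Windows line ending) are all hypothetical and serve only to show the idea of reading back an exit code that was written while the NM was down.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.OptionalInt;

/** Illustrative only: hypothetical names, not the YARN-2402 patch. */
public class ExitCodeFileSketch {

  /**
   * Reads an exit code from a file assumed to have been written when the
   * container process exited. Returns an empty result when the file is
   * missing, empty, or does not contain an integer, so the caller can
   * tell "no exit code recorded" apart from a real value.
   */
  public static OptionalInt readExitCode(Path exitCodeFile) throws IOException {
    final String content;
    try {
      content = new String(Files.readAllBytes(exitCodeFile),
          StandardCharsets.UTF_8).trim();
    } catch (NoSuchFileException e) {
      // The process may still be running, or it died before writing the file.
      return OptionalInt.empty();
    }
    if (content.isEmpty()) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(content));
    } catch (NumberFormatException e) {
      // Partially written or corrupted file: treat as "unknown".
      return OptionalInt.empty();
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("container", ".exitcode");
    try {
      // Simulate a launch script that recorded exit code 666 with CRLF.
      Files.write(file, "666\r\n".getBytes(StandardCharsets.UTF_8));
      OptionalInt code = readExitCode(file);
      System.out.println(code.isPresent()
          ? "recovered exit code " + code.getAsInt()
          : "no exit code recorded");
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

Returning an empty `OptionalInt` rather than a sentinel value is a design choice of this sketch; how the real recovery path reports a missing exit code file is not stated in this thread.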
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126396#comment-16126396 ]

Jason Lowe commented on YARN-2402:
----------------------------------

bq. Container recovery for Windows has been fully verified on Windows.

Excellent news! Curious, how does the container recovery on Windows reconstruct the exit code for containers that completed while the NM was down? We should really dup this JIRA to whatever JIRA added that functionality, but I didn't see any code that handled this on Windows. The prototype patch attached to this JIRA was doing something along those lines, and I didn't see how Windows was properly recovering exit codes for completed containers without something like it.

bq. there is also no unit test for getting exit code from the exitCodeFile for Unix or getting pid from the pidFile for Windows, seems it is trivial to test this simple script.

Feel free to file a JIRA for adding those tests, and I can help review it.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116630#comment-16116630 ]

Yuqi Wang commented on YARN-2402:
---------------------------------

[~djp] [~jlowe]
Can we mark it as RESOLVED now? Container recovery for Windows has been fully verified on Windows.

BTW, I have checked: there is also no unit test for getting the exit code from the exitCodeFile for Unix or getting the pid from the pidFile for Windows; it seems trivial to test this simple script.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093322#comment-15093322 ]

Yuqi Wang commented on YARN-2402:
---------------------------------

Thanks :) I will try.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091917#comment-15091917 ]

Junping Du commented on YARN-2402:
----------------------------------

Thanks for the patch, [~yqwang]!

bq. But if it is needed a unit test, I can add it afterwards.

It would be very nice if you could add a unit test here. :)
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091887#comment-15091887 ]

Junping Du commented on YARN-2402:
----------------------------------

[~yqwang], I have added you to the contributor list and assigned the JIRA to you.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091326#comment-15091326 ]

Yuqi Wang commented on YARN-2402:
---------------------------------

That is very kind of you, thanks :)
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089931#comment-15089931 ]

Hadoop QA commented on YARN-2402:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 8m 33s | trunk passed |
| +1 | compile | 0m 32s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 29s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 11s | trunk passed |
| +1 | mvnsite | 0m 31s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 1m 1s | trunk passed |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 25s | the patch passed |
| +1 | compile | 0m 29s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 29s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 32s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 9s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 23s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 9m 5s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 9m 31s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
| | | 36m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781165/YARN-2402-v2.patch |
| JIRA Issue | YARN-2402 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 86d679c4bc5f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchpro
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089833#comment-15089833 ]

Inigo Goiri commented on YARN-2402:
-----------------------------------

I temporarily assigned the patch to me to trigger Jenkins.
I tried to assign it to Yuqi but I couldn't; if somebody could set it properly, it would be much appreciated.
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088867#comment-15088867 ]

Hadoop QA commented on YARN-2402:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 8m 4s | trunk passed |
| +1 | compile | 0m 25s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 28s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 11s | trunk passed |
| +1 | mvnsite | 0m 28s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 55s | trunk passed |
| +1 | javadoc | 0m 18s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 25s | the patch passed |
| -1 | checkstyle | 0m 11s | Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager (total was 21, now 21). |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 0s | the patch passed |
| +1 | javadoc | 0m 15s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 8m 34s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 9m 6s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
| | | 34m 1s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781152/YARN-2402-v1.patch |
| JIRA Issue | YARN-2402 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b7381b11ec8e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT W
[jira] [Commented] (YARN-2402) NM restart: Container recovery for Windows
[ https://issues.apache.org/jira/browse/YARN-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092185#comment-14092185 ]

Jason Lowe commented on YARN-2402:
----------------------------------

See YARN-1337 for the changes needed to the container executors to handle this on UNIX/Linux.

> NM restart: Container recovery for Windows
> ------------------------------------------
>
>                 Key: YARN-2402
>                 URL: https://issues.apache.org/jira/browse/YARN-2402
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>
> We should add container recovery for NM restart on Windows.

--
This message was sent by Atlassian JIRA
(v6.2#6252)