[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437868#comment-16437868 ]

Billie Rinaldi commented on YARN-8142:
--------------------------------------

Thanks for the review and commit, [~eyang]! I plan to cherry-pick this to branch-3.1 if no one objects.

> yarn service application stops when AM is killed with SIGTERM
> -------------------------------------------------------------
>
>                 Key: YARN-8142
>                 URL: https://issues.apache.org/jira/browse/YARN-8142
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>            Reporter: Yesha Vora
>            Assignee: Billie Rinaldi
>            Priority: Major
>         Attachments: YARN-8142.1.patch
>
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch
> fault-test-am-sleeper
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition
> from local FS:
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID:
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep
> running
>
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note:
> when the AM gets a SIGTERM and gracefully shuts itself down. It is shutting
> the entire app down instead of letting it continue to run for another attempt
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437825#comment-16437825 ]

Hudson commented on YARN-8142:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13995 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13995/])
YARN-8142. Improve SIGTERM handling for YARN Service Application (eyang: rev 9031a76d447f0c5eaa392144fd17c5b9812e1b20)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestYarnNativeServices.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ClientAMService.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/ServiceTestUtils.java
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434800#comment-16434800 ]

genericqa commented on YARN-8142:
---------------------------------

| +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 22m 59s | trunk passed |
| +1 | compile | 0m 22s | trunk passed |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 26s | trunk passed |
| +1 | shadedclient | 8m 54s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 12s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 18s | the patch passed |
| +1 | javac | 0m 18s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 31s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 38s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 40s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 55m 2s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8142 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12918641/YARN-8142.1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fcf6b1f34e97 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0d898b7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20315/testReport/ |
| Max. process+thread count | 687 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20315/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434414#comment-16434414 ]

Billie Rinaldi commented on YARN-8142:
--------------------------------------

[~eyang], that argument makes sense to me. It does seem unexpected for SIGTERM to be more destructive than SIGKILL. I will put up a patch to make SIGTERM match the SIGKILL behavior.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434316#comment-16434316 ]

Eric Yang commented on YARN-8142:
---------------------------------

[~billie.rinaldi] [~shaneku...@gmail.com] SIGTERM should not be more destructive than SIGKILL. Hence, I am changing my mind about using SIGTERM to terminate the entire application. It would be good to keep it consistent with the existing YARN AM design.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434202#comment-16434202 ]

Shane Kumpf commented on YARN-8142:
-----------------------------------

I'm leaning in favor of having SIGTERM result in a new AM being started, as SIGKILL does now. Erring on the side of "keep the service running" may outweigh matching the Unix philosophy here, but I can understand that reasoning.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434136#comment-16434136 ]

Billie Rinaldi commented on YARN-8142:
--------------------------------------

Okay, I could put up a patch that would modify the service AM SIGTERM behavior to match the SIGKILL behavior, but we need to resolve what we would like the AM to do in this situation. Currently, the behavior is:
* SIGKILL only kills the AM, so a new AM will be started and the service will keep running
* SIGTERM stops the entire service
* a service client stop command RPC stops the entire service

So the question is whether the SIGTERM behavior should be the same as the SIGKILL or the service client stop command.
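The three behaviors listed in the comment above, and the two candidate changes it asks about, can be written down as a small decision table. This is only an illustrative model in Python, not Hadoop code: the names `Trigger`, `Outcome`, `CURRENT_BEHAVIOR`, and `with_sigterm_matching` are hypothetical and exist only for this sketch.

```python
from enum import Enum


class Trigger(Enum):
    """What happens to the service AM (labels are hypothetical)."""
    SIGKILL = "SIGKILL sent to the AM process"
    SIGTERM = "SIGTERM sent to the AM process"
    CLIENT_STOP = "service client stop command (RPC)"


class Outcome(Enum):
    NEW_AM_ATTEMPT = "only the AM dies; a new AM starts and the service keeps running"
    SERVICE_STOPPED = "the entire service is stopped"


# Behavior as described in the comment above, before any patch.
CURRENT_BEHAVIOR = {
    Trigger.SIGKILL: Outcome.NEW_AM_ATTEMPT,
    Trigger.SIGTERM: Outcome.SERVICE_STOPPED,
    Trigger.CLIENT_STOP: Outcome.SERVICE_STOPPED,
}


def with_sigterm_matching(reference: Trigger) -> dict:
    """Return a copy of the table in which SIGTERM behaves like `reference`."""
    proposed = dict(CURRENT_BEHAVIOR)
    proposed[Trigger.SIGTERM] = CURRENT_BEHAVIOR[reference]
    return proposed


if __name__ == "__main__":
    # Print both options under discussion: match SIGKILL, or match client stop.
    for reference in (Trigger.SIGKILL, Trigger.CLIENT_STOP):
        table = with_sigterm_matching(reference)
        print(f"SIGTERM matches {reference.name}: {table[Trigger.SIGTERM].value}")
```

Making SIGTERM match the client stop command leaves the table unchanged, so the only option that alters behavior is matching SIGKILL.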
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434013#comment-16434013 ]

Rushabh S Shah commented on YARN-8142:
--------------------------------------

My bad. Please ignore my previous comments. I missed that it affects {{yarn-native-services}}.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433959#comment-16433959 ]

Rushabh S Shah commented on YARN-8142:
--------------------------------------

I reproduced the scenario that [~yeshavora] pointed out in the description on a cluster running hadoop 2.8. It doesn't kill the whole application. In my case, it launched another AM attempt. I sent SIGKILL as well as SIGTERM. Both of them spawned a new AM attempt. So something that went into 3.* changed that behavior.
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433405#comment-16433405 ]

Eric Yang commented on YARN-8142:
---------------------------------

[~cheersyang] If the AM is hanging, then it is unlikely to terminate gracefully on SIGTERM. I think SIGKILL would be the right way to handle this, letting the RM restart it.
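The point in the comment above, that a hung process may not respond to SIGTERM while SIGKILL always ends it, can be illustrated with a minimal sketch. It is written in Python rather than against the YARN code, assumes a POSIX system, and uses a child process that ignores SIGTERM as a stand-in for a hung AM (that stand-in is an assumption of this sketch; a real hung AM may fail to exit for other reasons). The names `CHILD` and `demo` are hypothetical.

```python
import signal
import subprocess
import sys

# Child program: ignores SIGTERM, standing in for an AM that cannot shut down
# gracefully. It prints "ready" once the signal disposition is in place.
CHILD = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print("ready", flush=True)
time.sleep(60)
"""


def demo():
    """Send SIGTERM, then SIGKILL, and report what happened to the child."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    child.stdout.readline()  # block until the child has set SIG_IGN

    child.send_signal(signal.SIGTERM)
    try:
        child.wait(timeout=1)
        survived_sigterm = False
    except subprocess.TimeoutExpired:
        survived_sigterm = True  # SIGTERM was ignored; the child is still running

    child.kill()  # on POSIX this sends SIGKILL, which cannot be caught or ignored
    child.wait(timeout=5)
    child.stdout.close()
    return survived_sigterm, child.returncode


if __name__ == "__main__":
    survived, returncode = demo()
    print(f"survived SIGTERM: {survived}, return code after SIGKILL: {returncode}")
```

On POSIX, `subprocess` reports a child killed by a signal with a negative return code equal to the signal number, which is how the sketch confirms that SIGKILL ended the process.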
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433340#comment-16433340 ]

Weiwei Yang commented on YARN-8142:
---
I am thinking of another scenario: what if an app's AM hangs and the user wants to restart the AM without restarting the entire app? Would it be reasonable to allow an AM to be killed and then automatically restarted in another instance? To terminate the application (especially a long-running service), we need to call a stop command instead of killing the AM directly. Does that make sense?
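The distinction suggested in this comment, an explicit stop command versus an AM process that simply dies, can be sketched as a toy model. This is an illustration only and not the YARN-8142 patch: `ToyServiceMaster`, `request_stop`, and the action strings are hypothetical names, and Python is used purely for brevity. The assumption is that an AM which exits without unregistering leaves the ResourceManager free to start a new attempt, while an AM that unregisters ends the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyServiceMaster:
    """Toy stand-in for a service AM (hypothetical, not the real YARN class)."""

    stop_requested: bool = False  # set only by an explicit stop command
    actions: List[str] = field(default_factory=list)

    def request_stop(self) -> None:
        """Model a user-issued stop command for the whole service."""
        self.stop_requested = True

    def shutdown(self) -> None:
        """Model any exit path, including one triggered by SIGTERM."""
        if self.stop_requested:
            # Explicit stop: unregister so the application finishes.
            self.actions.append("unregister")
        else:
            # Killed or crashed: stay registered so a new attempt can start.
            self.actions.append("exit-without-unregister")
```

With this model, calling `shutdown()` on a fresh `ToyServiceMaster` records `exit-without-unregister`, while calling `request_stop()` first records `unregister`.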
[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM
[ https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433210#comment-16433210 ]

Eric Yang commented on YARN-8142:
-
In Unix terms, SIGTERM is used for terminating an application. My impression is that this is the correct behavior, rather than starting another instance. If another signal is used, then spawning another instance might be the right thing to do.
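The signal-based view in this comment could be sketched as a small policy table. Everything here is hypothetical (the `STOP_SIGNALS` set, `action_for`, `handle`, and the action strings) and none of it is taken from YARN; it only shows how a process might treat SIGTERM differently from other catchable signals.

```python
import signal
from typing import List

# Hypothetical policy: only SIGTERM means "stop the whole application".
STOP_SIGNALS = {signal.SIGTERM}

# Actions chosen by the handler, recorded for inspection.
received: List[str] = []


def action_for(signum: int) -> str:
    """Map a signal number to an illustrative action name."""
    if signum in STOP_SIGNALS:
        return "stop-application"
    return "new-am-attempt"


def handle(signum, frame) -> None:
    """Signal handler that records the action chosen for the signal."""
    received.append(action_for(signum))


def install_handlers() -> None:
    """Register the handler for the signals this sketch distinguishes."""
    signal.signal(signal.SIGTERM, handle)
    signal.signal(signal.SIGINT, handle)
```

A policy like this only covers signals a process can catch. SIGKILL cannot be handled, so in that case the process gets no chance to shut down gracefully at all.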