[ https://issues.apache.org/jira/browse/AMBARI-17859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394265#comment-15394265 ]
Hudson commented on AMBARI-17859:
---------------------------------
FAILURE: Integrated in Ambari-trunk-Commit #5391 (See
[https://builds.apache.org/job/Ambari-trunk-Commit/5391/])
AMBARI-17859. YARN service check failed during EU from HDP-2.4.0.0 to
(dlysnichenko:
[http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=33c10efe45ec3488a804c7fb46170afad325cc46])
* ambari-server/src/main/resources/stacks/HDP/2.2/services/stack_advisor.py
> YARN service check failed during EU from HDP-2.4.0.0 to Erie
> ------------------------------------------------------------
>
> Key: AMBARI-17859
> URL: https://issues.apache.org/jira/browse/AMBARI-17859
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.4.0
> Reporter: Dmitry Lysnichenko
> Assignee: Dmitry Lysnichenko
> Fix For: 2.4.0
>
> Attachments: AMBARI-17859.patch
>
>
> *Steps*
> # Deploy HDP-2.4.0.0 cluster with Ambari 2.2.1.1 (secure, non-HA cluster,
> customized service users)
> # Upgrade Ambari to 2.4.0.0
> # Perform EU to 2.5.0.0-934
> *Result*
> During EU, the YARN service check reported the following errors:
> {code}
> Traceback (most recent call last):\n File
> \"/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py\",
> line 159, in <module>\n ServiceCheck().execute()\n File
> \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\",
> line 280, in execute\n method(env)\n File
> \"/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py\",
> line 117, in service_check\n user=params.smokeuser,\n File
> \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line
> 71, in inner\n result = function(command, **kwargs)\n File
> \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line
> 93, in checked_call\n tries=tries, try_sleep=try_sleep)\n File
> \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line
> 141, in _call_wrapper\n result = _call(command, **kwargs_copy)\n File
> \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line
> 294, in _call\n raise
> Fail(err_msg)\nresource_management.core.exceptions.Fail: Execution of
> '/usr/bin/kinit -kt /etc/security/keytabs/smokeuser.headless.keytab
> [email protected]; yarn
> org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls
> -num_containers 1 -jar
> /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar
> -timeout 300000 --queue default' returned 2. ######## Hortonworks
> #############\nThis is MOTD message, added for testing in qe infra\n16/07/09
> 11:15:01 INFO impl.TimelineClientImpl: Timeline service address:
> http://host:8188/ws/v1/timeline/\n16/07/09 11:15:01 INFO
> distributedshell.Client: Initializing Client\n16/07/09 11:15:01 INFO
> distributedshell.Client: Running Client\n16/07/09 11:15:01 INFO
> client.RMProxy: Connecting to ResourceManager at
> host-5.domainlocal/10.0.113.157:8050\n16/07/09 11:15:03 INFO
> distributedshell.Client: Got Cluster metric info from ASM,
> numNodeManagers=3\n16/07/09 11:15:03 INFO distributedshell.Client: Got
> Cluster node info from ASM\n16/07/09 11:15:03 INFO distributedshell.Client:
> Got node report from ASM for, nodeId=host:25454, nodeAddresshost:8042,
> nodeRackName/default-rack, nodeNumContainers0\n16/07/09 11:15:03 INFO
> distributedshell.Client: Got node report from ASM for,
> nodeId=host-5.domainlocal:25454, nodeAddresshost-5.domainlocal:8042,
> nodeRackName/default-rack, nodeNumContainers0\n16/07/09 11:15:03 INFO
> distributedshell.Client: Got node report from ASM for,
> nodeId=host-1.domainlocal:25454, nodeAddresshost-1.domainlocal:8042,
> nodeRackName/default-rack, nodeNumContainers0\n16/07/09 11:15:03 INFO
> distributedshell.Client: Queue info, queueName=default,
> queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=0,
> queueChildQueueCount=0\n16/07/09 11:15:04 INFO distributedshell.Client: User
> ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS\n16/07/09
> 11:15:04 INFO distributedshell.Client: User ACL Info for Queue,
> queueName=default, userAcl=SUBMIT_APPLICATIONS\n16/07/09 11:15:04 INFO
> distributedshell.Client: Max mem capability of resources in this cluster
> 10240\n16/07/09 11:15:04 INFO distributedshell.Client: Max virtual cores
> capabililty of resources in this cluster 1\n16/07/09 11:15:04 INFO
> distributedshell.Client: Copy App Master jar from local filesystem and add to
> local environment\n16/07/09 11:15:04 INFO distributedshell.Client: Set the
> environment for the application master\n16/07/09 11:15:04 INFO
> distributedshell.Client: Setting up app master command\n16/07/09 11:15:04
> INFO distributedshell.Client: Completed setting up app master command
> {{JAVA_HOME}}/bin/java -Xmx10m
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
> --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0
> 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr \n16/07/09 11:15:04
> INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 279 for ambari-qa on
> 10.0.113.145:8020\n16/07/09 11:15:04 INFO distributedshell.Client: Got dt for
> hdfs://host-1.domainlocal:8020; Kind: HDFS_DELEGATION_TOKEN, Service:
> 10.0.113.145:8020, Ident: (HDFS_DELEGATION_TOKEN token 279 for
> ambari-qa)\n16/07/09 11:15:04 INFO distributedshell.Client: Submitting
> application to ASM\n16/07/09 11:15:05 INFO impl.YarnClientImpl: Submitted
> application application_1468062580457_0001\n16/07/09 11:15:06 INFO
> distributedshell.Client: Got application report from ASM for, appId=1,
> clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service: },
> appDiagnostics=AM container is launched, waiting for AM container to Register
> with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=ACCEPTED,
> distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:07 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:08 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:09 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:10 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:11 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:12 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:13 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:14 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=AM container is launched,
> waiting for AM container to Register with RM, appMasterHost=N/A,
> appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947,
> yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:15 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=,
> appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=RUNNING,
> distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:16 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=,
> appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=RUNNING,
> distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:17 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=,
> appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=RUNNING,
> distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:18 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=Token { kind:
> YARN_CLIENT_TOKEN, service: }, appDiagnostics=,
> appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=RUNNING,
> distributedFinalState=UNDEFINED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:19 INFO distributedshell.Client: Got
> application report from ASM for, appId=1, clientToAMToken=null,
> appDiagnostics=Diagnostics., total=1, completed=1, allocated=1, failed=1,
> appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1,
> appStartTime=1468062904947, yarnAppState=FINISHED,
> distributedFinalState=FAILED,
> appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/,
> appUser=ambari-qa\n16/07/09 11:15:19 INFO distributedshell.Client:
> Application did finished unsuccessfully. YarnState=FINISHED,
> DSFinalStatus=FAILED. Breaking monitoring loop\n16/07/09 11:15:19 ERROR
> distributedshell.Client: Application failed to complete successfully"
> {code}