Dmitry Lysnichenko created AMBARI-17859:
-------------------------------------------

             Summary: YARN service check failed during EU from HDP-2.4.0.0 to Erie
                 Key: AMBARI-17859
                 URL: https://issues.apache.org/jira/browse/AMBARI-17859
             Project: Ambari
          Issue Type: Bug
            Reporter: Dmitry Lysnichenko
            Assignee: Dmitry Lysnichenko
         Attachments: AMBARI-17859.patch



*Steps*
# Deploy HDP-2.4.0.0 cluster with Ambari 2.2.1.1 (secure, non-HA cluster, customized service users)
# Upgrade Ambari to 2.4.0.0
# Perform Express Upgrade (EU) to 2.5.0.0-934

*Result*
During EU, the YARN service check reported the following errors:
{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 159, in <module>
    ServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 117, in service_check
    user=params.smokeuser,
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/smokeuser.headless.keytab [email protected]; yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -timeout 300000 --queue default' returned 2. ######## Hortonworks #############
This is MOTD message, added for testing in qe infra
16/07/09 11:15:01 INFO impl.TimelineClientImpl: Timeline service address: http://host:8188/ws/v1/timeline/
16/07/09 11:15:01 INFO distributedshell.Client: Initializing Client
16/07/09 11:15:01 INFO distributedshell.Client: Running Client
16/07/09 11:15:01 INFO client.RMProxy: Connecting to ResourceManager at host-5.domainlocal/10.0.113.157:8050
16/07/09 11:15:03 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=3
16/07/09 11:15:03 INFO distributedshell.Client: Got Cluster node info from ASM
16/07/09 11:15:03 INFO distributedshell.Client: Got node report from ASM for, nodeId=host:25454, nodeAddresshost:8042, nodeRackName/default-rack, nodeNumContainers0
16/07/09 11:15:03 INFO distributedshell.Client: Got node report from ASM for, nodeId=host-5.domainlocal:25454, nodeAddresshost-5.domainlocal:8042, nodeRackName/default-rack, nodeNumContainers0
16/07/09 11:15:03 INFO distributedshell.Client: Got node report from ASM for, nodeId=host-1.domainlocal:25454, nodeAddresshost-1.domainlocal:8042, nodeRackName/default-rack, nodeNumContainers0
16/07/09 11:15:03 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0
16/07/09 11:15:04 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
16/07/09 11:15:04 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
16/07/09 11:15:04 INFO distributedshell.Client: Max mem capability of resources in this cluster 10240
16/07/09 11:15:04 INFO distributedshell.Client: Max virtual cores capabililty of resources in this cluster 1
16/07/09 11:15:04 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment
16/07/09 11:15:04 INFO distributedshell.Client: Set the environment for the application master
16/07/09 11:15:04 INFO distributedshell.Client: Setting up app master command
16/07/09 11:15:04 INFO distributedshell.Client: Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx10m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr
16/07/09 11:15:04 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 279 for ambari-qa on 10.0.113.145:8020
16/07/09 11:15:04 INFO distributedshell.Client: Got dt for hdfs://host-1.domainlocal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.0.113.145:8020, Ident: (HDFS_DELEGATION_TOKEN token 279 for ambari-qa)
16/07/09 11:15:04 INFO distributedshell.Client: Submitting application to ASM
16/07/09 11:15:05 INFO impl.YarnClientImpl: Submitted application application_1468062580457_0001
16/07/09 11:15:06 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:07 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:08 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:09 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:10 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:11 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:12 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:13 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:14 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=AM container is launched, waiting for AM container to Register with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:15 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=, appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:16 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=, appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:17 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=, appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:18 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  }, appDiagnostics=, appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:19 INFO distributedshell.Client: Got application report from ASM for, appId=1, clientToAMToken=null, appDiagnostics=Diagnostics., total=1, completed=1, allocated=1, failed=1, appMasterHost=host-1/10.0.113.145, appQueue=default, appMasterRpcPort=-1, appStartTime=1468062904947, yarnAppState=FINISHED, distributedFinalState=FAILED, appTrackingUrl=http://host-5.domainlocal:8088/proxy/application_1468062580457_0001/, appUser=ambari-qa
16/07/09 11:15:19 INFO distributedshell.Client: Application did finished unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring loop
16/07/09 11:15:19 ERROR distributedshell.Client: Application failed to complete successfully
{code}
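
The client output is long, and the part that matters is the short sequence of application states: ACCEPTED, then RUNNING, then FINISHED with a FAILED final status. The following is a minimal sketch, not part of the attached patch, of how that sequence might be pulled out of stderr that was captured as one escaped string (records separated by a literal backslash-n, possibly hard-wrapped by a mail client). The helper names {{split_records}}, {{state_transitions}} and {{application_ids}} are hypothetical and exist only for this illustration.

```python
import re

# Log4j-style timestamp at the start of each client record, e.g. "16/07/09 11:15:06".
TIMESTAMP_RE = re.compile(r"\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")
# The state pair printed in each "Got application report" record.
STATE_RE = re.compile(r"yarnAppState=(\w+), distributedFinalState=(\w+)")
# YARN application IDs such as application_1468062580457_0001.
APP_ID_RE = re.compile(r"application_\d+_\d+")


def split_records(raw):
    """Split escaped stderr text into one string per log record."""
    # Real line breaks (mail hard-wrapping) collapse to single spaces; the
    # literal two-character sequence backslash + "n" separates the records.
    flat = re.sub(r"\s+", " ", raw)
    return [rec.strip() for rec in flat.split("\\n") if rec.strip()]


def state_transitions(records):
    """Return (timestamp, yarnAppState, distributedFinalState) on each change."""
    transitions, last = [], None
    for rec in records:
        stamp = TIMESTAMP_RE.match(rec)
        state = STATE_RE.search(rec)
        if not (stamp and state):
            continue  # traceback text, MOTD, or a record without a state pair
        if state.groups() != last:
            last = state.groups()
            transitions.append((stamp.group(0),) + last)
    return transitions


def application_ids(records):
    """Return the distinct YARN application IDs mentioned, in first-seen order."""
    seen = []
    for rec in records:
        for app_id in APP_ID_RE.findall(rec):
            if app_id not in seen:
                seen.append(app_id)
    return seen
```

Note that collapsing wrapped lines puts a space into any token that was wrapped mid-word (long file paths, for example), so the sketch is only suitable for extracting states and application IDs, not for reconstructing the log verbatim. Once the application ID is known, the container logs can usually be fetched with {{yarn logs -applicationId <id>}}, assuming log aggregation is enabled on the cluster.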





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
