I am upgrading one of our clusters from HDP 2.2 to HDP 2.4.0.
The status I see at the application monitoring URL is:

YARN Application Status: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.

The application waits in that state for some time (300 seconds), then dies, and the service check fails. MapReduce jobs, however, run fine when we submit them. All nodes are live and in Active status.

When we run the job manually, it stops at the same point:

hadoop --config /usr/hdp/2.4.0.0-169/hadoop/conf jar /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput1/

WARNING: Use "yarn jar" to launch YARN applications.
16/06/17 19:04:47 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.7.0.2.4.0.0-169, revision=3c1431f45faaca982ecc8dad13a107787b834696, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=20160210-0711 ]
16/06/17 19:04:47 INFO impl.TimelineClientImpl: Timeline service address: http://usw2stdpma03.glassdoor.local:8188/ws/v1/timeline/
16/06/17 19:04:48 INFO client.RMProxy: Connecting to ResourceManager at usw2stdpma03.glassdoor.local/172.17.212.107:8050
16/06/17 19:04:48 INFO client.TezClient: Using org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager to manage Timeline ACLs
16/06/17 19:04:48 INFO impl.TimelineClientImpl: Timeline service address: http://usw2stdpma03.glassdoor.local:8188/ws/v1/timeline/
16/06/17 19:04:49 INFO examples.OrderedWordCount: Running OrderedWordCount
16/06/17 19:04:49 INFO client.TezClient: Submitting DAG application with id: application_1466115469995_0142
16/06/17 19:04:49 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: /hdp/apps/2.4.0.0-169/tez/tez.tar.gz
16/06/17 19:04:49 INFO client.TezClient: Stage directory /tmp/root/staging doesn't exist and is created
16/06/17 19:04:49 INFO client.TezClient: Tez system stage directory hdfs://dfs-nameservices/tmp/root/staging/.tez/application_1466115469995_0142 doesn't exist and is created
16/06/17 19:04:49 INFO acls.ATSHistoryACLPolicyManager: Created Timeline Domain for History ACLs, domainId=Tez_ATS_application_1466115469995_0142
16/06/17 19:04:50 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1466115469995_0142, dagName=OrderedWordCount, callerContext={ context=TezExamples, callerType=null, callerId=null }
16/06/17 19:04:50 INFO impl.YarnClientImpl: Submitted application application_1466115469995_0142
16/06/17 19:04:50 INFO client.TezClient: The url to track the Tez AM: http://usw2stdpma03.glassdoor.local:8088/proxy/application_1466115469995_0142/
16/06/17 19:04:50 INFO impl.TimelineClientImpl: Timeline service address: http://usw2stdpma03.glassdoor.local:8188/ws/v1/timeline/
16/06/17 19:04:50 INFO client.RMProxy: Connecting to ResourceManager at usw2stdpma03.glassdoor.local/172.17.212.107:8050
16/06/17 19:04:51 INFO client.DAGClientImpl: Waiting for DAG to start running

How do I fix this problem?

Thanks,
Anand
