Hi,

I posted my question a day ago. Could somebody please help me figure out what the problem is?

Thank you.

Regards,
YouPeng Yang

---------- Forwarded message ----------
From: YouPeng Yang <[email protected]>
Date: 2013/1/30
Subject: YARN NM containers were killed
To: [email protected]

I've tested hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar on my Hadoop environment (1 RM: Hadoop01, and 3 NMs: Hadoop02, Hadoop03, Hadoop04; CDH4.1.2 on RHEL 5.5):

./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar wordcount 1/input output

When I checked the logs, I was confused by the following: Hadoop created 2 containers on Hadoop02 and 1 container on Hadoop03, but no containers on Hadoop04.

The results of the container processing:

Hadoop02:

* container_1359422495723_0001_01_000001
  (its state changed as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> EXITED_WITH_SUCCESS)
  The log indicates that:
  NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000,
  ContainerLaunch: Container container_1359422495723_0001_01_000001 succeeded
  Container: Container container_1359422495723_0001_01_000001 transitioned from RUNNING to EXITED_WITH_SUCCESS
  ContainerLaunch: Cleaning up container container_1359422495723_0001_01_000001
  NMAuditLogger: USER=hadoop OPERATION=Container Finished - Succeeded TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001

* container_1359422495723_0001_01_000003
  (its state changed as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> CONTAINER_CLEANEDUP_AFTER_KILL --> DONE)
  The log indicates that:
  NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 3, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000,
  DefaultContainerExecutor: Exit code from task is : 137
  NMAuditLogger: USER=hadoop OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003

Hadoop03:

* container_1359422495723_0001_01_000002
  (its state changed as follows: NEW --> LOCALIZING --> LOCALIZED --> RUNNING --> KILLING --> CONTAINER_CLEANEDUP_AFTER_KILL --> DONE)
  NodeStatusUpdaterImpl: Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000,
  DefaultContainerExecutor: Exit code from task is : 143
  NMAuditLogger: USER=hadoop OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002

My questions:

1. Why were 2 containers created on Hadoop02 while Hadoop04 got none? Is this normal?
2. What principle determines where containers are created?
3. Why were two containers (container_*_000003 and container_*_000002) killed, while container_*_000001 succeeded? Is this normal?

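One hedged reading of the two exit codes above: 137 and 143 match the common Unix convention in which a process terminated by signal N reports exit status 128 + N, which would make them SIGKILL (9) and SIGTERM (15). The sketch below only performs that arithmetic. `describe_exit_code` is a hypothetical helper, not part of Hadoop, and whether the NodeManager actually sent these signals is an assumption that the logs here do not confirm.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration; it is not a Hadoop API.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # The platform does not define this signal number.
            name = f"signal {signum}"
        return f"{code}: terminated by {name}"
    return f"{code}: not a signal exit"


if __name__ == "__main__":
    # The two exit codes reported by DefaultContainerExecutor above.
    for exit_code in (137, 143):
        print(describe_exit_code(exit_code))
```

If that reading holds, both killed containers were stopped from outside rather than failing on their own, which would be consistent with the "Container killed by the ApplicationMaster." diagnostics.
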
The logs of Hadoop01 are as follows:

2013-01-29 09:23:48,904 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 1
2013-01-29 09:23:50,201 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user hadoop
2013-01-29 09:23:50,204 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop IP=10.167.14.221 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1359422495723_0001
2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from NEW to SUBMITTED
2013-01-29 09:23:50,221 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering appattempt_1359422495723_0001_000001
2013-01-29 09:23:50,222 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from NEW to SUBMITTED
2013-01-29 09:23:50,242 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application Submission: application_1359422495723_0001 from hadoop, currently active: 1
2013-01-29 09:23:50,250 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from SUBMITTED to SCHEDULED
2013-01-29 09:23:50,250 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from SUBMITTED to ACCEPTED
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from NEW to ALLOCATED
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available
2013-01-29 09:23:50,582 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:23:50,583 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from SCHEDULED to ALLOCATED
2013-01-29 09:23:50,587 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1359422495723_0001_000001
2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, NodeHttpAddress: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Status: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM appattempt_1359422495723_0001_000001
2013-01-29 09:23:50,606 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1359422495723_0001_01_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1359422495723_0001_01_000001, NodeId: Hadoop02:39876, NodeHttpAddress: Hadoop02:8042, Resource: memory: 1536, Priority: org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@1f, State: NEW, Token: null, Status: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1359422495723, }, attemptId: 1, }, id: 1, }, state: C_NEW, ] for AM appattempt_1359422495723_0001_000001
2013-01-29 09:23:51,030 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from ALLOCATED to LAUNCHED
2013-01-29 09:23:51,575 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING
2013-01-29 09:23:57,108 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM registration appattempt_1359422495723_0001_000001
2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop IP=10.167.14.222 OPERATION=Register App Master TARGET=ApplicationMasterServic RESULT=SUCCESS APPID=application_1359422495723_0001 APPATTEMPTID=appattempt_1359422495723_0001_000001
2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from LAUNCHED to RUNNING
2013-01-29 09:23:57,109 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from ACCEPTED to RUNNING
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from NEW to ALLOCATED
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 1 containers, memory: 1024 used and memory: 7168 available
2013-01-29 09:23:59,168 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from NEW to ALLOCATED
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 2 containers, memory: 2560 used and memory: 5632 available
2013-01-29 09:24:00,659 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from ACQUIRED to RUNNING
2013-01-29 09:24:01,196 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ALLOCATED to ACQUIRED
2013-01-29 09:24:01,657 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from ACQUIRED to RUNNING
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000002 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000002 in state: COMPLETED event:FINISHED
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000002
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
2013-01-29 09:24:05,674 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000002 on node: host: Hadoop03:39387 #containers=0 available=8192 used=0 with event: FINISHED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000003 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000003 in state: COMPLETED event:FINISHED
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000003
2013-01-29 09:24:07,524 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available, release resources=true
2013-01-29 09:24:07,525 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000003 on node: host: Hadoop02:39876 #containers=1 available=6656 used=1536 with event: FINISHED
2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from RUNNING to FINISHING
2013-01-29 09:24:11,597 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from RUNNING to FINISHING
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1359422495723_0001_01_000001 Container Transitioned from RUNNING to COMPLETED
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1359422495723_0001_01_000001 in state: COMPLETED event:FINISHED
2013-01-29 09:24:12,554 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1359422495723_0001 CONTAINERID=container_1359422495723_0001_01_000001
2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Released container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 0 containers, memory: 0 used and memory: 8192 available, release resources=true
2013-01-29 09:24:12,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Application appattempt_1359422495723_0001_000001 released container container_1359422495723_0001_01_000001 on node: host: Hadoop02:39876 #containers=0 available=8192 used=0 with event: FINISHED
2013-01-29 09:24:12,556 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1359422495723_0001_000001 State change from FINISHING to FINISHED
2013-01-29 09:24:12,557 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1359422495723_0001 State change from FINISHING to FINISHED
2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Succeeded TARGET=RMAppManager RESULT=SUCCESS APPID=application_1359422495723_0001
2013-01-29 09:24:12,558 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1359422495723_0001 requests cleared
2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1359422495723_0001_000001
2013-01-29 09:24:12,560 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1359422495723_0001,name=word count,user=hadoop,queue=default,state=FINISHED,trackingUrl=Hadoop01:8088/proxy/application_1359422495723_0001/jobhistory/job/job_1359422495723_0001,appMasterHost=Hadoop02,startTime=1359422630195,finishTime=1359422651597
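
To check container placement without reading every entry, the `Assigned container ... on host ...` messages from FiCaSchedulerNode can be tallied per host. This is a minimal sketch that assumes the log format shown above. `containers_by_host` and its regular expression are illustrative helpers, not part of Hadoop, and the sample text is copied from the three assignment entries in this log.

```python
import re
from collections import defaultdict

# Matches the FiCaSchedulerNode "Assigned container" message seen in the RM log.
ASSIGNED = re.compile(
    r"Assigned container (?P<container>container_\d+_\d+_\d+_\d+) "
    r"of capacity memory: (?P<memory>\d+) on host (?P<host>[^:\s]+):\d+"
)


def containers_by_host(log_text):
    """Group assigned containers (id, memory in MB) by NodeManager host."""
    placement = defaultdict(list)
    for match in ASSIGNED.finditer(log_text):
        placement[match["host"]].append(
            (match["container"], int(match["memory"]))
        )
    return dict(placement)


# The three assignment entries from the Hadoop01 log above.
SAMPLE = """
2013-01-29 09:23:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000001 of capacity memory: 1536 on host Hadoop02:39876, which currently has 1 containers, memory: 1536 used and memory: 6656 available
2013-01-29 09:23:58,616 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000002 of capacity memory: 1024 on host Hadoop03:39387, which currently has 1 containers, memory: 1024 used and memory: 7168 available
2013-01-29 09:24:00,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1359422495723_0001_01_000003 of capacity memory: 1024 on host Hadoop02:39876, which currently has 2 containers, memory: 2560 used and memory: 5632 available
"""

if __name__ == "__main__":
    for host, containers in containers_by_host(SAMPLE).items():
        print(host, containers)
```

On this log the tally gives two containers on Hadoop02 (the 1536 MB ApplicationMaster container and one 1024 MB container), one on Hadoop03, and no entry for Hadoop04.
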
