Hi,

This is the result of the ps command:

[root@datanode3 ~]# ps -aef|grep 27822
root 6624 6590 0 09:28 pts/0 00:00:00 grep 27822

The process does not show up with this command. Is it just that the JVM didn't unregister itself? If so, how can I solve it?

Thanks.
[email protected]

From: Rohith Sharma K S
Date: 2014-10-31 16:12
To: [email protected]
Subject: RE: YarnChild didn't be killed after running mapreduce

This is strange!! Can you get the output of ps -aef | grep <pid> for this process? What is the application status in the RM UI?

Thanks & Regards
Rohith Sharma K S

From: [email protected] [mailto:[email protected]]
Sent: 31 October 2014 13:05
To: [email protected]
Subject: YarnChild didn't be killed after running mapreduce

All,

I ran the mapreduce example successfully, but an invalid process always appears on the NodeManager nodes, as follows:

27398 DataNode
27961 Jps
13669 QuorumPeerMain
27822 -- process information unavailable
18349 ThriftServer
27557 NodeManager

I deleted the entry for this invalid process under /tmp/hsperfdata_yarn, but it comes back after running mapreduce (YARN) again. I have modified many parameters in yarn-site.xml and mapred-site.xml.

yarn-site.xml

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>2</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>
</property>

mapred-site.xml

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>2</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>2</value>
</property>

None of this worked. It has been up for a long time. There were no error logs; I only found some suspicious log entries, as follows:

2014-10-31 14:35:59,306 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1414736576842_0001_01_000008
2014-10-31 14:35:59,350 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 27818 for container-id container_1414736576842_0001_01_000008: 107.9 MB of 1 GB physical memory used; 1.5 GB of 2.1 GB virtual memory used
2014-10-31 14:36:01,068 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting connection close header...
2014-10-31 14:36:01,702 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Stopping container with container Id: container_1414736576842_0001_01_000008
2014-10-31 14:36:01,702 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=root IP=192.168.200.128 OPERATION=Stop Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1414736576842_0001 CONTAINERID=container_1414736576842_0001_01_000008
2014-10-31 14:36:01,703 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1414736576842_0001_01_000008 transitioned from RUNNING to KILLING
2014-10-31 14:36:01,703 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1414736576842_0001_01_000008
2014-10-31 14:36:01,724 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1414736576842_0001_01_000008 is : 143
2014-10-31 14:36:01,791 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1414736576842_0001_01_000008 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2014-10-31 14:36:01,791 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /hadoop/yarn/local/usercache/root/appcache/application_1414736576842_0001/container_1414736576842_0001_01_000008
2014-10-31 14:36:01,792 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=root OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1414736576842_0001 CONTAINERID=container_1414736576842_0001_01_000008
2014-10-31 14:36:01,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1414736576842_0001_01_000008 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE
2014-10-31 14:36:01,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1414736576842_0001_01_000008 from application application_1414736576842_0001
2014-10-31 14:36:01,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container container_1414736576842_0001_01_000008 for log-aggregation
2014-10-31 14:36:01,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1414736576842_0001

[email protected]
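
A note on reading the NodeManager log above: the exit code 143 reported for the container follows the usual convention of 128 plus a signal number, so it most likely corresponds to signal 15 (SIGTERM), which is what one would expect for a container stopped on request rather than a crash. A minimal Python sketch of that arithmetic follows; the helper name signal_from_exit_code is hypothetical and not part of Hadoop.

```python
import signal


def signal_from_exit_code(code):
    """Return the signal name implied by a 128+N exit code, or None.

    Hypothetical helper for illustration only; it is not a Hadoop API.
    """
    if code <= 128:
        return None  # ordinary exit status, not a signal-based one
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


if __name__ == "__main__":
    # 143 is the exit code reported by DefaultContainerExecutor in the log.
    print(signal_from_exit_code(143))
```

If this reading is right, the WARN line by itself does not indicate a failure of the job.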

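
On the "process information unavailable" entry: as far as I know, jps finds JVMs through the files under /tmp/hsperfdata_<user>/, each named after a PID, so a file left behind by a process that has already exited can show up this way. Below is a rough Python sketch, assuming Linux with /proc mounted, that lists such leftovers. The function name and the proc_root parameter are made up for illustration, and this only reports candidates; it does not explain why the file was left behind.

```python
import os


def stale_hsperfdata_pids(hsperf_dir, proc_root="/proc"):
    """Return PIDs that have an hsperfdata file but no entry under proc_root.

    Hypothetical helper: assumes files in hsperf_dir are named after the
    PID of the JVM that created them, and that a live process has a
    directory proc_root/<pid> (true on Linux when /proc is mounted).
    """
    stale = []
    for name in os.listdir(hsperf_dir):
        if not name.isdecimal():
            continue  # skip anything that is not a plain PID
        if not os.path.isdir(os.path.join(proc_root, name)):
            stale.append(int(name))
    return sorted(stale)


if __name__ == "__main__":
    # Directory taken from the thread; it only exists on the affected node.
    target = "/tmp/hsperfdata_yarn"
    if os.path.isdir(target):
        print(stale_hsperfdata_pids(target))
```

On the node in question, PID 27822 would be expected in the result if its file is indeed a leftover.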