[ https://issues.apache.org/jira/browse/HDFS-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kihwal Lee resolved HDFS-9863.
------------------------------
    Resolution: Invalid

> DataNode doesn't log any shutdown info when the process of DataNode exiting
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-9863
>                 URL: https://issues.apache.org/jira/browse/HDFS-9863
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Lin Yiqun
>         Attachments: datanode-restart_after.gc.log, datanode-restart_before.gc.log, datanode.log
>
>
> One of my datanodes exited without logging any shutdown info.
> {code}
> 2016-02-25 14:46:00,283 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1942012336-XX.XX.2.191-1406726500544:blk_1730224536_658031130, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
> 2016-02-25 15:03:55,639 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG: host = XX.XX6032/XX.XX.6.32
> STARTUP_MSG: args = []
> STARTUP_MSG: version = 2.7.1
> {code}
> I suspected that a full GC caused this, so I looked at the datanode GC log. There is a CMS GC, but it occurred after the datanode had already restarted.
> {code}
> 2016-02-25T15:03:57.930+0800: 2.756: [GC2016-02-25T15:03:57.930+0800: 2.756: [ParNew: 1677824K->24417K(1887488K), 0.0249280 secs] 1677824K->24417K(8178944K), 0.0251010 secs] [Times: user=0.24 sys=0.07, real=0.02 secs]
> 2016-02-25T15:12:46.498+0800: 531.324: [GC [1 CMS-initial-mark: 0K(6291456K)] 780481K(8178944K), 0.0554170 secs] [Times: user=0.06 sys=0.00, real=0.07 secs]
> 2016-02-25T15:12:46.567+0800: 531.393: [CMS-concurrent-mark-start]
> 2016-02-25T15:12:46.574+0800: 531.400: [CMS-concurrent-mark: 0.006/0.007 secs] [Times: user=0.07 sys=0.02, real=0.01 secs]
> 2016-02-25T15:12:46.574+0800: 531.400: [CMS-concurrent-preclean-start]
> 2016-02-25T15:12:46.589+0800: 531.415: [CMS-concurrent-preclean: 0.015/0.015 secs] [Times: user=0.16 sys=0.06, real=0.01 secs]
> {code}
> So GC does not seem to be the cause. The GC activity just before the datanode exited also looks normal.
> {code}
> 2016-02-25T14:45:39.743+0800: 5431411.796: [GC2016-02-25T14:45:39.743+0800: 5431411.796: [ParNew: 1686799K->22696K(1887488K), 0.0385700 secs] 2908579K->1244476K(8178944K) icms_dc=0 , 0.0388280 secs] [Times: user=0.23 sys=0.01, real=0.04 secs]
> {code}
> So the cause is still unclear. I have attached the complete GC logs and the datanode log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)