Lin Yiqun created HDFS-9624:
-------------------------------
Summary: DataNode start slowly due to the initial DU command
operations
Key: HDFS-9624
URL: https://issues.apache.org/jira/browse/HDFS-9624
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Lin Yiqun
Assignee: Lin Yiqun
It seems starting datanode so slowly when I am finishing migration of datanodes
and restart them.I look the dn logs:
{code}
2016-01-06 16:05:08,118 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added new
volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4
2016-01-06 16:05:08,118 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added
volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK
2016-01-06 16:05:08,176 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Registered
FSDatasetState MBean
2016-01-06 16:05:08,177 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding
block pool BP-1942012336-xx.xx.xx.xx-1406726500544
2016-01-06 16:05:08,178 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data2/current...
2016-01-06 16:05:08,179 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data3/current...
2016-01-06 16:05:08,179 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data4/current...
2016-01-06 16:05:08,179 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data5/current...
2016-01-06 16:05:08,180 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data6/current...
2016-01-06 16:05:08,180 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data7/current...
2016-01-06 16:05:08,180 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data8/current...
2016-01-06 16:05:08,180 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data9/current...
2016-01-06 16:05:08,181 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data10/current...
2016-01-06 16:05:08,181 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data11/current...
2016-01-06 16:05:08,181 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning
block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume
/home/data/data/hadoop/dfs/data/data12/current...
2016-01-06 16:09:49,646 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data7/current: 281466ms
2016-01-06 16:09:54,235 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data9/current: 286054ms
2016-01-06 16:09:57,859 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data2/current: 289680ms
2016-01-06 16:10:00,333 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data5/current: 292153ms
2016-01-06 16:10:05,696 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data8/current: 297516ms
2016-01-06 16:10:11,229 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data6/current: 303049ms
2016-01-06 16:10:28,075 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data12/current: 319894ms
2016-01-06 16:10:33,017 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data4/current: 324838ms
2016-01-06 16:10:40,177 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data10/current: 331996ms
2016-01-06 16:10:44,882 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data3/current: 336703ms
2016-01-06 16:11:14,241 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken
to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on
/home/data/data/hadoop/dfs/data/data11/current: 366060ms
2016-01-06 16:11:14,242 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Total time
to scan all replicas for block pool BP-1942012336-xx.xx.xx.xx-1406726500544:
366065ms
{code}
And I know that Scanning blocks on volume and then calculating the dfsUsed
costs the most of time. Because my datanode's migiration costs the much time,
so that dfsUsed value can't use cache-dfsused and should be doing du
operations. But actually I don't need do it again because there has no
operations in these datanodes. The info is these:
{code}
/**
* Read in the cached DU value and return it if it is less than 600 seconds
* old (DU update interval). Slight imprecision of dfsUsed is not critical and
* skipping DU can significantly shorten the startup time. If the cached value
* is not available or too old, -1 is returned.
* */
{code}
The 600 seconds is a dead code. And it looks not suitable for here.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)