[
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854448#comment-16854448
]
Sean Chow commented on HDFS-14476:
----------------------------------
[~jojochuang] I've found out why _every 6 hours' scan will have about 25000
abnormal blocks to fix_.
The directory scan and the collection of in-memory finalized blocks happen
asynchronously. The actual disk scan may take an hour to finish, and during
that period other blocks are added or removed (see the sketch below).
But the long-held lock mentioned in this issue still exists. I've written this
patch on my own and will test it.
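To make the race concrete, here is a minimal, self-contained sketch (plain Java,
not actual DataNode code; the collection names and the ordering of snapshot and
scan are my own illustration) of how a diff computed against a stale snapshot
reports blocks that are in fact consistent:
{code:java}
import java.util.HashSet;
import java.util.Set;

public class StaleScanDiffSketch {
    public static void main(String[] args) {
        // In-memory finalized blocks at the moment the scan starts.
        Set<Long> inMemory = new HashSet<>(Set.of(1L, 2L, 3L));

        // Snapshot the scanner will compare against.
        Set<Long> snapshot = new HashSet<>(inMemory);

        // While the disk scan runs (possibly for an hour), the live state changes.
        inMemory.add(4L);     // a newly finalized block
        inMemory.remove(1L);  // a deleted block

        // The slow disk scan eventually observes the *current* on-disk state.
        Set<Long> onDisk = new HashSet<>(inMemory);

        // Diff against the stale snapshot produces spurious "abnormal" blocks.
        Set<Long> missingInMemory = new HashSet<>(onDisk);
        missingInMemory.removeAll(snapshot);          // [4] looks missing in memory
        Set<Long> missingOnDisk = new HashSet<>(snapshot);
        missingOnDisk.removeAll(onDisk);              // [1] looks like a missing block file

        System.out.println("spurious 'missing in memory': " + missingInMemory);
        System.out.println("spurious 'missing block file': " + missingOnDisk);
    }
}
{code}
The longer the disk scan takes, the more such transient differences accumulate by
the time the diff is processed.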
> lock too long when fix inconsistent blocks between disk and in-memory
> ---------------------------------------------------------------------
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.6.0, 2.7.0
> Reporter: Sean Chow
> Priority: Major
>
> When the DirectoryScanner has the results of the differences between on-disk
> and in-memory blocks, it tries to run `checkAndUpdate` to fix them. However,
> `FsDatasetImpl.checkAndUpdate` is a synchronized call.
> I have about 6 million blocks on every datanode, and every 6 hours' scan finds
> about 25000 abnormal blocks to fix. That leads to the lock on the FsDatasetImpl
> object being held for a long time.
> Assume every block needs 10ms to fix (because of the latency of a SAS disk):
> that takes 250 seconds to finish, which means all reads and writes on that
> datanode are blocked for roughly 4 minutes.
>
> {code:java}
> 2019-05-06 08:06:51,704 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-1644920766-10.223.143.220-1450099987967 Total blocks: 6850197, missing metadata files:23574, missing block files:23574, missing blocks in memory:47625, mismatched blocks:0
> ...
> 2019-05-06 08:16:41,625 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Took 588402ms to process 1 commands from NN
> {code}
> It takes a long time to process commands from the NN because threads are
> blocked, and the NameNode will see a long lastContact time for this datanode.
> This probably affects all HDFS versions.
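> In other words, the behaviour described above is roughly the following pattern
> (a simplified, hypothetical sketch; the class, lock, and method names are
> illustrative, not the real FsDatasetImpl code), where one lock is held across
> the entire diff list:
> {code:java}
> import java.util.List;
> 
> class LongLockSketch {
>     private final Object datasetLock = new Object();  // stands in for the FsDatasetImpl lock
> 
>     void fixAllDifferences(List<Long> diffBlockIds) {
>         synchronized (datasetLock) {          // one lock held for the whole diff list
>             for (long blockId : diffBlockIds) {
>                 checkAndUpdateOne(blockId);   // ~10 ms of disk I/O per block
>             }
>         }                                     // ~25000 blocks * 10 ms ≈ 250 s blocked
>     }
> 
>     private void checkAndUpdateOne(long blockId) {
>         // reconcile the on-disk and in-memory state for one block (omitted)
>     }
> }
> {code}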
> *to fix:*
> Just as invalidate commands from the NameNode are processed with a batch size
> of 1000, fixing these abnormal blocks should be handled in batches too, with a
> 2-second sleep between batches to allow normal reading and writing of blocks
> (see the sketch below).
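> A minimal sketch of that batching, assuming the batch size of 1000 and the
> 2-second sleep suggested above (class and method names are illustrative, not
> the actual patch):
> {code:java}
> import java.util.List;
> 
> class BatchedFixSketch {
>     private final Object datasetLock = new Object();
> 
>     void fixInBatches(List<Long> diffBlockIds) throws InterruptedException {
>         final int batchSize = 1000;                     // same batch size used for invalidates
>         for (int start = 0; start < diffBlockIds.size(); start += batchSize) {
>             int end = Math.min(start + batchSize, diffBlockIds.size());
>             synchronized (datasetLock) {                // lock held only for one batch
>                 for (long blockId : diffBlockIds.subList(start, end)) {
>                     checkAndUpdateOne(blockId);
>                 }
>             }
>             Thread.sleep(2000);                         // let queued reads/writes proceed
>         }
>     }
> 
>     private void checkAndUpdateOne(long blockId) {
>         // reconcile the on-disk and in-memory state for one block (omitted)
>     }
> }
> {code}
> This keeps each lock hold to roughly 1000 * 10 ms ≈ 10 s and gives other
> threads a chance to run between batches.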