Update free space in the DataBlockScanner rather than using du
--------------------------------------------------------------

Key: HDFS-3297
URL: https://issues.apache.org/jira/browse/HDFS-3297
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node
Affects Versions: 0.23.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor

As the DataNode adds new blocks to a BlockPool, it keeps track of how much space that block pool consumes. This information is sent to the NameNode so that we can track statistics and so forth. Periodically, we check what is actually on the disk to make sure that the counts we are keeping are accurate.

The DataNode currently kicks off a "du -s" process through the shell every few minutes and takes the result as the new used space number.

We should compute the used space number in the DataBlockScanner, rather than using a separate du process. The main reason is to avoid causing a lot of random I/O operations on the disk. Since du has to visit every file in the BlockPool, it essentially redoes the work of the block scanner for no reason.
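
A minimal sketch of the idea in the description, for illustration only. The class ScanSpaceAccumulator and all of its method names are hypothetical and are not part of Hadoop or of the existing DataBlockScanner. It assumes the scanner visits every block file once per pass and can report each file's length as it goes, so the used space total falls out of work the scanner already does.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not a Hadoop class): keeps the used-space figure
 * for one block pool without running a separate "du -s" process.
 */
class ScanSpaceAccumulator {
    // Last reconciled value; this is what would be reported to the NameNode.
    private final AtomicLong publishedUsed = new AtomicLong(0);

    // Running total for the scan pass in progress (scanner thread only).
    private long passTotal = 0;

    /** Called when the scanner begins a full pass over the block pool. */
    void startPass() {
        passTotal = 0;
    }

    /** Called once for each block the scanner visits during the pass. */
    void recordBlock(long blockFileBytes, long metaFileBytes) {
        passTotal += blockFileBytes + metaFileBytes;
    }

    /**
     * Called when the pass completes: the scanned total replaces the
     * running count, as the du result does today.
     * Limitation of this sketch: blocks added or removed while the pass
     * is in progress are not reconciled here.
     */
    void finishPass() {
        publishedUsed.set(passTotal);
    }

    /** Called from the write or delete path between passes. */
    void addDelta(long bytes) {
        publishedUsed.addAndGet(bytes);
    }

    long getUsed() {
        return publishedUsed.get();
    }
}
```

In this sketch the published number only changes when a pass finishes, so a reader never sees a partial total. How a real implementation would handle blocks written during a pass is left open here.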