haiyang1987 opened a new pull request, #6476: URL: https://github.com/apache/hadoop/pull/6476
### Description of PR

https://issues.apache.org/jira/browse/HDFS-17346

DirectoryScanner can mark healthy blocks as corrupt and report them to the NameNode, so blocks are flagged as corrupted even though they are actually healthy. This can happen when an append and the DirectoryScanner run at the same time, and the probability is very high.

**Root cause:**

- Create a file whose block is, for example, blk_xxx_1001. Its diskFile is "file:/XXX/current/finalized/blk_xxx" and its diskMetaFile is "file:/XXX/current/finalized/blk_xxx_1001.meta".
- Run DirectoryScanner. It first creates a BlockPoolReport.ScanInfo that records blockFile as "file:/XXX/current/finalized/blk_xxx" and metaFile as "file:/XXX/current/finalized/blk_xxx_1001.meta".
- Simultaneously, another thread completes an append for blk_xxx. Now diskFile is "file:/XXX/current/finalized/blk_xxx", diskMetaFile is "file:/XXX/current/finalized/blk_xxx_1002.meta", memDataFile is "file:/XXX/current/finalized/blk_xxx", and memMetaFile is "file:/XXX/current/finalized/blk_xxx_1002.meta".
- DirectoryScanner continues to run. Because the generation stamp of the meta file in memory differs from that of the meta file recorded in the ScanInfo, the ScanInfo object is added to the list of differences.
- FsDatasetImpl#checkAndUpdate then traverses the list of differences. The diskMetaFile taken from the ScanInfo, "/XXX/current/finalized/blk_xxx_1001.meta", no longer exists, so isRegular is false:

```
final boolean isRegular = FileUtil.isRegularFile(diskMetaFile, false) && FileUtil.isRegularFile(diskFile, false);
```

- As a result, the healthy block is marked as corrupt and reported to the NameNode:

```
} else if (!isRegular) {
  corruptBlock = new Block(memBlockInfo);
  LOG.warn("Block:{} is not a regular file.", corruptBlock.getBlockId());
}
```
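The sequence described in the root cause can be reproduced in miniature without Hadoop. The following is only a sketch under stated assumptions: it uses plain `java.nio` temporary files, and `StaleScanInfoDemo`, `Snapshot`, and `simulate` are hypothetical names introduced for illustration, not Hadoop classes. Renaming the meta file stands in for the append moving the generation stamp from 1001 to 1002.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

public class StaleScanInfoDemo {

    /** Hypothetical stand-in for the paths a scan records before the append. */
    static final class Snapshot {
        final Path blockFile;
        final Path metaFile;

        Snapshot(Path blockFile, Path metaFile) {
            this.blockFile = blockFile;
            this.metaFile = metaFile;
        }
    }

    /** Same shape as the isRegular check quoted above, applied to snapshot paths. */
    static boolean isRegular(Snapshot s) {
        return Files.isRegularFile(s.metaFile, LinkOption.NOFOLLOW_LINKS)
            && Files.isRegularFile(s.blockFile, LinkOption.NOFOLLOW_LINKS);
    }

    /** Returns {isRegular before the simulated append, isRegular after it}. */
    static boolean[] simulate() {
        try {
            Path dir = Files.createTempDirectory("finalized");
            Path block = Files.createFile(dir.resolve("blk_xxx"));
            Path meta1001 = Files.createFile(dir.resolve("blk_xxx_1001.meta"));

            // The scanner records the paths it saw on disk.
            Snapshot snapshot = new Snapshot(block, meta1001);
            boolean before = isRegular(snapshot);

            // Assumption: the append is modeled as a rename to the new generation stamp.
            Path meta1002 = Files.move(meta1001, dir.resolve("blk_xxx_1002.meta"));

            // The snapshot still points at the 1001 meta file, which is gone.
            boolean after = isRegular(snapshot);

            // Clean up the temporary files.
            Files.delete(meta1002);
            Files.delete(block);
            Files.delete(dir);
            return new boolean[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = simulate();
        System.out.println("isRegular before append: " + result[0]); // true
        System.out.println("isRegular after append:  " + result[1]); // false
    }
}
```

The block data and the current meta file are both intact after the rename. Only the stale path held in the snapshot makes the check fail, which mirrors how a healthy block ends up reported as corrupt.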
