jianghuazhu commented on a change in pull request #3861:
URL: https://github.com/apache/hadoop/pull/3861#discussion_r808697481



##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
##########
@@ -2812,6 +2816,9 @@ public void checkAndUpdate(String bpid, ScanInfo scanInfo)
             + memBlockInfo.getNumBytes() + " to "
             + memBlockInfo.getBlockDataLength());
         memBlockInfo.setNumBytes(memBlockInfo.getBlockDataLength());
+      } else if (!isRegular) {
+        corruptBlock = new Block(memBlockInfo);
+        LOG.warn("Block:{} is not a regular file.", corruptBlock.getBlockId());

Review comment:
   Thanks @tomscut for the comment and review.
   This happens occasionally; I have been monitoring it for a long time and still have not found the root cause.
   I suspect it may be related to the Linux environment: when the normal data flow is working, no exception occurs. (I will continue to monitor this.)
   This change adds a more rigorous check so that such cases cannot do further damage on the cluster, which is good for any cluster.

   When the file is actually cleaned up, the specific path is printed. Here is an example from a production cluster:
   ```
   2022-02-15 11:24:12,856 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-xxxx blk_xxxx file /mnt/dfs/11/data/current/BP-xxxx.xxxx.xxxx/current/finalized/subdir0/subdir0/blk_xxxx
   ```
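   As a rough illustration of the kind of check the diff adds, here is a minimal, self-contained sketch of flagging a block path that exists on disk but is not a regular file, using `java.nio.file.Files.isRegularFile`. The class, the helper name, and the standalone structure are illustrative assumptions and not taken from the PR:

   ```java
   import java.io.File;
   import java.nio.file.Files;
   import java.nio.file.LinkOption;

   public class BlockFileCheck {

     // Hypothetical helper, not from the PR: a block path that exists but is
     // not a regular file (e.g. a stray directory or odd filesystem entry)
     // cannot hold valid block data, so the replica should be treated as
     // corrupt rather than trusted.
     static boolean isIrregularBlockFile(File blockFile) {
       return blockFile.exists()
           && !Files.isRegularFile(blockFile.toPath(), LinkOption.NOFOLLOW_LINKS);
     }

     public static void main(String[] args) {
       File blockFile = new File(args[0]);
       if (isIrregularBlockFile(blockFile)) {
         System.out.println(
             "Block file " + blockFile + " is not a regular file; marking corrupt.");
       } else {
         System.out.println("Block file " + blockFile + " looks normal.");
       }
     }
   }
   ```

   In the PR's context, a positive result of such a check is what drives the `corruptBlock = new Block(memBlockInfo)` branch shown in the diff above.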




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


