Hi all, I am running HDFS on Hadoop 2.6.0-cdh5.8.0. I discovered that the DirectoryScanner keeps triggering Java GC and slowing down the Hadoop nodes. Digging into the log file, I found this:

2018-11-29 13:34:37,995 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3896197, missing metadata files:214, missing block files:214, missing blocks in memory:103, mismatched blocks:0

From posts I have read online, this message comes from the DirectoryScanner, which runs every 6 hours. The scan causes GC pauses on all nodes and degrades performance across the cluster.

Question: when I run hdfs dfsadmin -report, it does not report any corrupted files. I also ran hdfs fsck / on the root directory, and it did not report any problems. How can I find out which block files, in-memory blocks, and metadata files are missing?

Thanks!

-- Hiu Yen Onn
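
To compare these counters from one scan to the next, the summary line can be parsed from the datanode log. The following is only a minimal sketch: the regular expression is written against the single log line quoted above, and the function name parse_scanner_summary and the dictionary keys are hypothetical names chosen for illustration. It extracts the counts only; it does not identify which blocks are missing.

```python
import re

# Regex written against the DirectoryScanner summary line quoted in this post.
# Assumption: other Hadoop versions may format this line differently.
SUMMARY_RE = re.compile(
    r"BlockPool (?P<pool>\S+) Total blocks: (?P<total>\d+), "
    r"missing metadata files:(?P<missing_meta>\d+), "
    r"missing block files:(?P<missing_block>\d+), "
    r"missing blocks in memory:(?P<missing_mem>\d+), "
    r"mismatched blocks:(?P<mismatched>\d+)"
)


def parse_scanner_summary(line):
    """Return the counters from one summary line, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    pool = fields.pop("pool")
    # Every remaining captured group is a decimal counter.
    result = {key: int(value) for key, value in fields.items()}
    result["pool"] = pool
    return result


if __name__ == "__main__":
    sample = (
        "2018-11-29 13:34:37,995 INFO "
        "org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: "
        "BlockPool BP-1850109981-192.168.1.1-1413178082983 "
        "Total blocks: 3896197, missing metadata files:214, "
        "missing block files:214, missing blocks in memory:103, "
        "mismatched blocks:0"
    )
    print(parse_scanner_summary(sample))
```

Running the function over every DirectoryScanner line in a datanode log would show whether the missing counts are stable or growing between scans.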