[
https://issues.apache.org/jira/browse/HDFS-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582517#comment-14582517
]
Colin Patrick McCabe commented on HDFS-8564:
--------------------------------------------
Another possibility would be to rate-limit these scans so that we never do more
than one per half hour (or whatever). It seems like if we hit a lot of I/O
errors on a failing drive (a common scenario), we'll be constantly doing these
scans and discovering nothing interesting. I also don't understand why the
scan happens on all drives rather than just the failing one; does anyone have
some background on that?
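The rate-limiting idea above could be sketched roughly as follows. This is a minimal, hypothetical helper, not existing HDFS code; the class and method names are illustrative:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: allow at most one scan per minIntervalMs (e.g. 30 minutes).
// Thread-safe via compare-and-set, since multiple DN operations may
// request a disk check concurrently.
public class ScanRateLimiter {
    private final long minIntervalMs;
    private final AtomicLong lastScanMs = new AtomicLong(Long.MIN_VALUE);

    public ScanRateLimiter(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns true if a scan may run now, recording the scan time if so. */
    public boolean tryAcquire(long nowMs) {
        long last = lastScanMs.get();
        if (last != Long.MIN_VALUE && nowMs - last < minIntervalMs) {
            return false; // too soon since the last scan; skip this one
        }
        return lastScanMs.compareAndSet(last, nowMs);
    }
}
```

The caller would invoke tryAcquire() before starting a scan and simply skip the scan when it returns false, so repeated I/O errors on a failing drive trigger at most one full traversal per interval.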
> BlockPoolSlice.checkDirs() will trigger excessive IO while traversing all
> sub-directories under finalizedDir
> ------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-8564
> URL: https://issues.apache.org/jira/browse/HDFS-8564
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, HDFS
> Affects Versions: 3.0.0
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Priority: Critical
>
> DataNodes continuously call checkDiskErrorAsync() for multiple operations in
> the DN in order to verify that a volume hasn't experienced any failure. When
> DN.startCheckDiskErrorThread() is invoked, we need to traverse all configured
> data volumes on a DN to see which volumes need to be removed (see
> FsVolumeList.checkDir(s)). However, that means that for each directory on a
> BlockPoolSlice we need to call DiskChecker.checkDirs(), which will
> recursively look into the rbw, tmp and finalized directories:
> {code}
> void checkDirs() throws DiskErrorException {
>   DiskChecker.checkDirs(finalizedDir);
>   DiskChecker.checkDir(tmpDir);
>   DiskChecker.checkDir(rbwDir);
> }
> {code}
> Unfortunately after HDFS-6482, the subdirectory structure is created with the
> following algorithm:
> {code}
> public static File idToBlockDir(File root, long blockId) {
>   int d1 = (int) ((blockId >> 16) & 0xff);
>   int d2 = (int) ((blockId >> 8) & 0xff);
>   String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
>       DataStorage.BLOCK_SUBDIR_PREFIX + d2;
>   return new File(root, path);
> }
> {code}
> This leaves each data volume with up to 64K directories (256 directories x
> 256 subdirectories). A side effect is that if the dentries haven't been
> cached by the OS, the DN needs to recursively scan up to 64K directories
> per configured data volume (times the number of files), impacting IO for
> other operations while DiskChecker.checkDirs(finalizedDir) is running.
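To make the mapping concrete, here is a standalone re-statement of the idToBlockDir() arithmetic above; the "subdir" prefix is assumed to match DataStorage.BLOCK_SUBDIR_PREFIX, and the class name is purely illustrative:

```java
import java.io.File;

// Standalone sketch of the two-level directory mapping: bits 16-23 of the
// block ID pick one of 256 top-level dirs, bits 8-15 pick one of 256
// subdirs beneath it, hence 256 x 256 = 64K directories per volume.
public class BlockDirDemo {
    static final String SEP = File.separator;

    public static String idToBlockPath(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xff);
        int d2 = (int) ((blockId >> 8) & 0xff);
        return "subdir" + d1 + SEP + "subdir" + d2;
    }
}
```

For example, block ID 0x123456 lands in subdir18/subdir52, since (0x123456 >> 16) & 0xff is 18 and (0x123456 >> 8) & 0xff is 52.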
> There are a few possibilities to address this problem:
> 1. Do not scan finalizedDir at all.
> 2. Limit the recursive scan to one level of subdirectories (256).
> 3. Remove a subdirectory immediately once it doesn't have any block under it.
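Option 2 above could look roughly like the following. This is a hypothetical sketch, not existing HDFS code; the permission checks stand in for whatever DiskChecker actually verifies:

```java
import java.io.File;

// Sketch of option 2: check finalizedDir and only its immediate child
// directories, instead of recursing through all 256 x 256 subdirs.
// Names are illustrative, not part of the HDFS API.
public class ShallowDirCheck {

    /** Returns true if dir and its immediate subdirectories look healthy. */
    public static boolean checkDirsShallow(File dir) {
        if (!isUsable(dir)) {
            return false;
        }
        File[] children = dir.listFiles(File::isDirectory);
        if (children == null) {
            return false; // could not list: treat as a disk error
        }
        for (File child : children) {
            if (!isUsable(child)) { // one level only: at most 256 checks
                return false;
            }
        }
        return true;
    }

    static boolean isUsable(File dir) {
        return dir.canRead() && dir.canWrite() && dir.canExecute();
    }
}
```

This caps the per-volume work at roughly 257 directory checks instead of up to 64K, at the cost of not detecting errors confined to the second level.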
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)