[
https://issues.apache.org/jira/browse/HDFS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924729#action_12924729
]
Dmytro Molkov commented on HDFS-1476:
-------------------------------------
I think the right degree of parallelism here will depend on several
characteristics: the number of cores in the system, the size of the blocks map,
and possibly others. So it might make sense to make it configurable to begin
with. Large clusters will certainly benefit from running multiple parallel
threads, since the time to scan the full BlocksMap will be measured in minutes.
> listCorruptFileBlocks should be functional while the name node is still in
> safe mode
> ------------------------------------------------------------------------------------
>
> Key: HDFS-1476
> URL: https://issues.apache.org/jira/browse/HDFS-1476
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Patrick Kling
>
> This would allow us to detect whether missing blocks can be fixed using Raid
> and, if that is the case, to exit safe mode earlier.
> One way to make listCorruptFileBlocks available before the name node has
> exited from safe mode would be to perform a scan of the blocks map on each
> call to listCorruptFileBlocks to determine if there are any blocks with no
> replicas. This scan could be parallelized by dividing the space of block IDs
> into multiple intervals that can be scanned independently.
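
A minimal sketch of the interval-based scan described above, with the thread count left configurable as suggested in the comment. It is an illustration under stated assumptions, not the HDFS implementation: a sorted map from block ID to replica count stands in for the real BlocksMap (which may not be sorted or shaped this way), and the names ParallelBlockScan, findBlocksWithNoReplicas and numThreads are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the scan; not part of the HDFS code base.
final class ParallelBlockScan {

    /**
     * Splits the space of long block IDs into numThreads contiguous intervals
     * and scans each interval in its own task. Returns the IDs whose replica
     * count is zero. The map must not be modified while the scan is running.
     */
    static List<Long> findBlocksWithNoReplicas(
            NavigableMap<Long, Integer> replicaCounts, int numThreads)
            throws InterruptedException, ExecutionException {
        if (numThreads < 1) {
            throw new IllegalArgumentException("numThreads must be >= 1");
        }
        // Interval width; computed this way to avoid overflowing a long.
        long step = numThreads == 1
                ? 0
                : Long.MAX_VALUE / numThreads - Long.MIN_VALUE / numThreads;

        List<Callable<List<Long>>> tasks = new ArrayList<>();
        long from = Long.MIN_VALUE;
        for (int i = 0; i < numThreads; i++) {
            final boolean last = (i == numThreads - 1);
            final long lo = from;
            // The last interval is closed so that Long.MAX_VALUE is covered.
            final long hi = last ? Long.MAX_VALUE : from + step;
            tasks.add(() -> {
                List<Long> missing = new ArrayList<>();
                for (Map.Entry<Long, Integer> e
                        : replicaCounts.subMap(lo, true, hi, last).entrySet()) {
                    if (e.getValue() == 0) {
                        missing.add(e.getKey());
                    }
                }
                return missing;
            });
            from = hi;
        }

        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Long> result = new ArrayList<>();
            // invokeAll returns futures in task order, so IDs stay ascending.
            for (Future<List<Long>> f : pool.invokeAll(tasks)) {
                result.addAll(f.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        NavigableMap<Long, Integer> blocks = new TreeMap<>();
        blocks.put(Long.MIN_VALUE, 0);
        blocks.put(-5L, 2);
        blocks.put(0L, 0);
        blocks.put(7L, 3);
        blocks.put(Long.MAX_VALUE, 0);
        // The thread count would come from configuration in a real deployment.
        System.out.println(findBlocksWithNoReplicas(blocks, 4));
    }
}
```

Because each task reads only its own interval, the tasks need no coordination beyond collecting their results. The sketch assumes the map is not modified during the scan; a live name node would need some form of locking or a concurrent structure.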