[
https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299387#comment-14299387
]
Colin Patrick McCabe commented on HDFS-7648:
--------------------------------------------
bq. The original design of DirectoryScanner is to reconcile the differences
between the block information maintained in memory and the actual blocks stored
on disk. So it does fix the in-memory data structure.
Fixing the in-memory data structure is different from fixing the on-disk data
structure. I do not think that the DirectoryScanner should modify the files on
the disk. Doing so creates too much potential for a bug in the scanner to
cause data loss.
bq. Yet more questions if the blocks are not fixed: should the block report
include those blocks? How to access those blocks? How and when to fix those
blocks?
The only ways we could ever get into this state are:
* if someone manually renamed some block files on ext4
* if someone introduced a bug in the datanode code that put blocks in the wrong
place
* if there is serious ext4 filesystem corruption
None of those cases seems like something we should be trying to automatically
recover from.
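
To illustrate the distinction, here is a minimal sketch of a report-only
check: it computes where a block should live from its ID and flags a mismatch,
but never touches the files. The class and method names (LayoutCheck,
expectedDir, isMisplaced) are hypothetical, and the two-level "subdirN/subdirM"
mapping with an 0xff mask is my assumption about the HDFS-6482 layout, not
code copied from the datanode.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the datanode code base.
class LayoutCheck {
    // Assumption: two directory levels, each selected by 8 bits of the block ID.
    static final int MASK = 0xff;

    // Directory under the finalized root where the block is expected to live.
    static Path expectedDir(Path finalizedRoot, long blockId) {
        int d1 = (int) ((blockId >> 16) & MASK);
        int d2 = (int) ((blockId >> 8) & MASK);
        return finalizedRoot.resolve("subdir" + d1).resolve("subdir" + d2);
    }

    // Read-only check: compares paths and reports, never renames or deletes.
    static boolean isMisplaced(Path finalizedRoot, long blockId, Path actualBlockFile) {
        Path parent = actualBlockFile.getParent();
        return parent == null || !expectedDir(finalizedRoot, blockId).equals(parent);
    }

    public static void main(String[] args) {
        Path root = Paths.get("/data/current/finalized"); // illustrative path
        long blockId = 0x12345L;
        Path good = expectedDir(root, blockId).resolve("blk_" + blockId);
        Path bad = root.resolve("subdir0").resolve("subdir0").resolve("blk_" + blockId);
        System.out.println(good + " misplaced? " + isMisplaced(root, blockId, good));
        System.out.println(bad + " misplaced? " + isMisplaced(root, blockId, bad));
    }
}
```

A check like this could log or count misplaced blocks for an administrator to
look at, which keeps the scanner from ever being the cause of data loss.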
> Verify the datanode directory layout
> ------------------------------------
>
> Key: HDFS-7648
> URL: https://issues.apache.org/jira/browse/HDFS-7648
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Rakesh R
>
> HDFS-6482 changed datanode layout to use block ID to determine the directory
> to store the block. We should have some mechanism to verify it. Either
> DirectoryScanner or block report generation could do the check.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)