[
https://issues.apache.org/jira/browse/HDFS-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gruust updated HDFS-12662:
--------------------------
Description:
Corrupt blocks currently have to be removed manually, and until they are, they
prevent the data node on which they reside from receiving a good copy of the
same block. In small clusters (i.e. node count == replication factor), this
leaves the name node without a free data node on which to restore the desired
replication level until the user manually runs an fsck command to remove the
corrupt block.
I suggest moving the corrupt block out of the way, as ext2-based filesystems
usually do, i.e. moving the block to a /lost+found directory, so that the name
node can replace it immediately.
was:
Corrupt blocks currently need to be removed manually and effectively block the
data node on which they reside to receive a good copy of the same block. In
small clusters (ie. node count == replication factor), this prevents the name
node to find a free data node to keep the desired replication level up until
the user manually runs some fsck command to remove the corrupt block.
I suggest moving the corrupt block out of the way, like it's usually done by
ext2-based filesystems, ie. move the block to /lost+found directory, such that
the name node can replace it immediately.
> lost+found strategy for bad/corrupt blocks to improve data replication
> 'SLA' for small clusters
> -------------------------------------------------------------------------------------------------
>
> Key: HDFS-12662
> URL: https://issues.apache.org/jira/browse/HDFS-12662
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: block placement
> Affects Versions: 2.8.1
> Reporter: Gruust
> Priority: Minor
>
> Corrupt blocks currently have to be removed manually, and until they are,
> they prevent the data node on which they reside from receiving a good copy
> of the same block. In small clusters (i.e. node count == replication
> factor), this leaves the name node without a free data node on which to
> restore the desired replication level until the user manually runs an fsck
> command to remove the corrupt block.
> I suggest moving the corrupt block out of the way, as ext2-based filesystems
> usually do, i.e. moving the block to a /lost+found directory, so that the
> name node can replace it immediately.
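
To make the suggestion concrete, here is a rough sketch of what "moving the
block out of the way" could look like on a data node's local disk. It is not
taken from HDFS code and is not how the data node currently behaves. The
function name quarantine_block, the lost_found argument, and the assumption
that a block's metadata sits next to it as "<block name>_*.meta" are all
illustrative only.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def quarantine_block(block_file: Path, lost_found: Path) -> list[Path]:
    """Move a corrupt block file and its sibling .meta files into lost_found.

    Hypothetical helper: returns the new paths of the files that were moved.
    """
    lost_found.mkdir(parents=True, exist_ok=True)

    # Assumption: metadata files live beside the block and are named
    # "<block name>_<something>.meta". The trailing "_" keeps "blk_1"
    # from matching files that belong to "blk_10".
    meta_files = sorted(block_file.parent.glob(block_file.name + "_*.meta"))

    moved = []
    for src in [block_file, *meta_files]:
        if not src.exists():
            continue
        dst = lost_found / src.name
        # Do not overwrite an earlier quarantined copy with the same name.
        counter = 0
        while dst.exists():
            counter += 1
            dst = lost_found / f"{src.name}.{counter}"
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved
```

Once the files are out of the storage directory, the slot for that block is
free again, which is the precondition for the name node being able to place a
fresh replica on the same data node. Reporting the move to the name node is
deliberately left out of the sketch.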