[
https://issues.apache.org/jira/browse/HADOOP-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681098#action_12681098
]
Hairong Kuang commented on HADOOP-5465:
---------------------------------------
Thanks to Koji for his tireless investigation of this issue.
When this situation occurs, the source DataNode of the block behaves
abnormally: no blocks get replicated from this node and no blocks get removed
from it. Digging into the problem, we see that the NameNode sends the
DataNode an empty replication request, i.e. a replication request with no
blocks and no targets as parameters, in every heartbeat reply, which prevents
it from sending the node any real replication or deletion request. More
suspiciously, the DataNode notifies the NameNode that it has 1 replication in
progress, although its jstack shows that it has no replication (data transfer)
thread alive.
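
To make the symptom concrete, here is a toy model of how an empty replication
request could starve deletion requests. It is only a sketch under assumptions:
it supposes a heartbeat reply carries one command and that replication is
considered before deletion. The class and method names (HeartbeatSketch,
replyWithoutGuard, replyWithGuard) are hypothetical and are not taken from the
Hadoop source, and the sketch does not try to explain why the request is empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: all names here are hypothetical, not Hadoop's actual classes.
class HeartbeatSketch {

    /** A command carried back to the DataNode in a heartbeat reply. */
    static final class Command {
        final String action;       // "TRANSFER" or "INVALIDATE"
        final List<String> blocks; // block ids the command applies to

        Command(String action, List<String> blocks) {
            this.action = action;
            this.blocks = blocks;
        }
    }

    /**
     * Assumed policy: one command per reply, replication checked first.
     * An empty transfer list is still returned as a command, so the
     * deletion branch below is never reached while that keeps happening.
     */
    static Command replyWithoutGuard(List<String> transfers, List<String> deletions) {
        if (transfers != null) {
            return new Command("TRANSFER", transfers);
        }
        if (deletions != null && !deletions.isEmpty()) {
            return new Command("INVALIDATE", deletions);
        }
        return null; // nothing to do
    }

    /** Same policy, but an empty transfer list falls through to deletions. */
    static Command replyWithGuard(List<String> transfers, List<String> deletions) {
        if (transfers != null && !transfers.isEmpty()) {
            return new Command("TRANSFER", transfers);
        }
        if (deletions != null && !deletions.isEmpty()) {
            return new Command("INVALIDATE", deletions);
        }
        return null; // nothing to do
    }

    public static void main(String[] args) {
        List<String> noTransfers = Collections.emptyList();
        List<String> deletions = Arrays.asList("blk_1"); // made-up block id

        Command stuck = replyWithoutGuard(noTransfers, deletions);
        Command ok = replyWithGuard(noTransfers, deletions);

        System.out.println(stuck.action + " with " + stuck.blocks.size() + " block(s)");
        System.out.println(ok.action + " with " + ok.blocks.size() + " block(s)");
    }
}
```

In this model the unguarded reply is a TRANSFER command with no blocks on
every heartbeat, so the pending deletion is never delivered, which matches the
behavior described above. Whether the real code path looks like this is not
established here.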
> Blocks remain under-replicated
> ------------------------------
>
> Key: HADOOP-5465
> URL: https://issues.apache.org/jira/browse/HADOOP-5465
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.3
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Priority: Blocker
> Fix For: 0.18.4
>
>
> Occasionally we see some blocks remain under-replicated in our
> production clusters. This is what we observed:
> 1. Sometimes when increasing the replication factor of a file, some blocks
> belonging to this file do not reach the new replication factor.
> 2. When taking a meta save on two different days, some blocks remain in
> the under-replication queue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.