[ 
https://issues.apache.org/jira/browse/HADOOP-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681098#action_12681098
 ] 

Hairong Kuang commented on HADOOP-5465:
---------------------------------------

Thanks to Koji for his tireless investigation of this issue.

When this situation occurs, the source DataNode of the block behaves 
abnormally: no blocks get replicated from this node and no blocks get removed 
from it. Digging into the problem, we see that the NameNode sends the 
DataNode an empty replication request, i.e. a replication request with no 
blocks and no targets as parameters, on every heartbeat reply, which prevents 
it from sending the node any actual replication or deletion request. More 
suspiciously, the DataNode notifies the NameNode that it has 1 replication in 
progress although its jstack shows no live replication (data transfer) thread.
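
To make the symptom concrete, here is a toy model only, not the actual 
NameNode code. It assumes one command per heartbeat reply and a hypothetical 
"stale replication marker" that makes the reply an empty replication request. 
All class, field, and block names below are made up for illustration; the 
real cause of the empty request is not established here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of the observed symptom; every name here is hypothetical.
class HeartbeatModel {

    /** Hypothetical per-DataNode state kept on the NameNode side. */
    static class Node {
        final Queue<String> toReplicate = new ArrayDeque<>();
        final Queue<String> toDelete = new ArrayDeque<>();
        // Assumption: some stale state makes the NameNode believe it
        // still owes this node a replication request.
        boolean staleReplicationMarker;
    }

    /** Hypothetical heartbeat reply: a command kind plus its blocks. */
    static class Command {
        final String kind;
        final List<String> blocks;

        Command(String kind, List<String> blocks) {
            this.kind = kind;
            this.blocks = blocks;
        }
    }

    /** Returns at most one command per heartbeat (an assumption). */
    static Command onHeartbeat(Node n) {
        if (n.staleReplicationMarker) {
            // Empty request: no blocks. Returning here means neither
            // queue below is ever consulted for this node.
            return new Command("REPLICATE", new ArrayList<>());
        }
        if (!n.toReplicate.isEmpty()) {
            return new Command("REPLICATE", drain(n.toReplicate));
        }
        if (!n.toDelete.isEmpty()) {
            return new Command("DELETE", drain(n.toDelete));
        }
        return null; // nothing to do
    }

    /** Moves all queued block names into a list and empties the queue. */
    private static List<String> drain(Queue<String> q) {
        List<String> out = new ArrayList<>(q);
        q.clear();
        return out;
    }
}
```

In this model, as long as the marker stays set, every heartbeat reply is an 
empty replication request and both queues keep their contents, which matches 
the behavior described above.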

> Blocks remain under-replicated
> ------------------------------
>
>                 Key: HADOOP-5465
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5465
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.3
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.18.4
>
>
> Occasionally we see some blocks remain under-replicated in our 
> production clusters. This is what we observed:
> 1. Sometimes when the replication factor of a file is increased, some blocks 
> belonging to this file do not reach the new replication factor.
> 2. When taking a metasave on two different days, some blocks remain in the 
> under-replication queue. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
