[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279230#comment-17279230
 ] 

Kihwal Lee commented on HDFS-15813:
-----------------------------------

+1. Unit test failures seem unrelated.  If you can't find an existing Jira for 
the failures, please file one for each.  I've looked at 
{{TestUnderReplicatedBlocks#testSetRepIncWithUnderReplicatedBlocks}} briefly. 
It appears to be a test issue.

The test artificially invalidated a replica on a node, but before the test made 
further progress, the NN fixed the under-replication by having another node 
send the block to the same node.  The test then went ahead and removed it from 
the NN's data structure (blocksmap) and called {{setReplication()}}. The NN 
picked two nodes, but one of them was the node that already had the block 
replica. The replica was only missing from the NN's data structure. Again, this 
happened because the NN fixed the under-replication between the test deleting 
the replica and modifying the NN's data structure. The replication failed with 
{{ReplicaAlreadyExistsException}}.  This kind of inconsistency does not happen 
in real clusters, but even if it did, it would be fixed when the replication 
times out.  The test is set to time out before the default replication timeout, 
so the NN never had a chance to do that. 
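
The ordering described above can be replayed with a toy model. This is only a 
sketch under stated assumptions, not HDFS code: {{ToyDataNode}}, 
{{nnViewOfDn}} and {{replayRace()}} are hypothetical names, a plain set stands 
in for the blocksmap, and {{IllegalStateException}} stands in for 
{{ReplicaAlreadyExistsException}}.

```java
import java.util.HashSet;
import java.util.Set;

class ReplicaRaceSketch {
    /** Hypothetical stand-in for a DataNode: only the block IDs it holds on disk. */
    static class ToyDataNode {
        final Set<Long> onDisk = new HashSet<>();

        /** Accepts a transferred replica, rejecting it if the block is already stored. */
        void receiveBlock(long blockId) {
            if (!onDisk.add(blockId)) {
                // Stand-in for ReplicaAlreadyExistsException in this toy model.
                throw new IllegalStateException("replica already exists: " + blockId);
            }
        }
    }

    /** Replays the sequence from the comment; returns true if the last transfer is rejected. */
    static boolean replayRace() {
        long blockId = 1L;
        ToyDataNode dn = new ToyDataNode();
        Set<Long> nnViewOfDn = new HashSet<>(); // toy "blocksmap" entry for this node

        // Starting point: the node holds the replica and the NN knows about it.
        dn.onDisk.add(blockId);
        nnViewOfDn.add(blockId);

        // 1. The test invalidates the replica on the node.
        dn.onDisk.remove(blockId);
        nnViewOfDn.remove(blockId);

        // 2. Before the test continues, the NN repairs the under-replication
        //    by having another node send the block to the same node.
        dn.receiveBlock(blockId);
        nnViewOfDn.add(blockId);

        // 3. The test removes the replica from the NN's data structure only.
        nnViewOfDn.remove(blockId);

        // 4. setReplication(): the NN believes the node lacks the block and
        //    picks it as a target, but the replica is still on disk.
        try {
            if (!nnViewOfDn.contains(blockId)) {
                dn.receiveBlock(blockId);
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("transfer rejected: " + replayRace());
    }
}
```

If step 2 is removed, the transfer in step 4 succeeds, which is the ordering 
the test appears to expect.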

> DataStreamer: keep sending heartbeat packets while streaming
> ------------------------------------------------------------
>
>                 Key: HDFS-15813
>                 URL: https://issues.apache.org/jira/browse/HDFS-15813
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch, HDFS-15813.004.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
