Daryn Sharp commented on HDFS-11313:

Interesting idea.  We need to think through race conditions, because with the 
current naive design (a snapshot in time) it is easy to reconcile state in the 
NN.  I'm not saying I like it, just that we need to think hard about new races 
esp. with 

It must include provisions for negative block ids, so it's not just the last 
segment that is open ended.  This is not a contrived use case: we have many 2.x 
clusters with legacy negative block ids, esp. archival clusters.

What would be the basic design?  Is it predicated on the NN sorting block ids?  
If yes, I have strong concerns I'll outline.  How are the segment ranges 
computed?  Fixed size?  How will very sparse block ranges be handled, esp. in 
the case of negative block ids?
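
To make the questions above concrete, here is a minimal sketch of one possible 
answer, not something proposed in this jira.  It assumes the NN already has a 
sorted array of the block ids it knows about (the very assumption questioned 
above) and picks split points by count rather than by fixed id width, so sparse 
ranges do not produce empty segments.  The first segment is open ended toward 
Long.MIN_VALUE and the last toward Long.MAX_VALUE, so negative ids are covered.  
The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HDFS: splits the signed 64-bit block id
// space into segments that each hold roughly the same number of known ids.
class BlockIdSegments {

    // Returns strictly increasing split points. Segment k covers ids in
    // [points[k-1], points[k]); the first segment has no lower bound and
    // the last segment has no upper bound.
    static long[] splitPoints(long[] sortedIds, int segments) {
        if (segments < 1) {
            throw new IllegalArgumentException("segments must be >= 1");
        }
        if (sortedIds.length == 0) {
            return new long[0]; // a single segment covering everything
        }
        long[] points = new long[segments - 1];
        int n = 0;
        for (int i = 1; i < segments; i++) {
            int idx = (int) ((long) i * sortedIds.length / segments);
            long candidate = sortedIds[idx];
            // Skip duplicates so the split points stay strictly increasing.
            if (n == 0 || candidate > points[n - 1]) {
                points[n++] = candidate;
            }
        }
        return Arrays.copyOf(points, n);
    }

    // Index of the segment that a given block id falls into.
    static int segmentOf(long[] points, long blockId) {
        int idx = Arrays.binarySearch(points, blockId);
        // Exact hit: the id starts the next segment. Miss: insertion point.
        return idx >= 0 ? idx + 1 : -(idx + 1);
    }

    public static void main(String[] args) {
        // Legacy negative ids mixed with sequential positive ids.
        long[] ids = {-9000L, -8500L, -20L, 1073741825L, 1073741826L, 1073741900L};
        long[] points = splitPoints(ids, 3);
        System.out.println("split points: " + Arrays.toString(points));
        for (long id : ids) {
            System.out.println(id + " -> segment " + segmentOf(points, id));
        }
    }
}
```

Choosing split points by count keeps segment sizes even regardless of how 
sparse the id space is, but it only works if sorted ids are cheap to obtain on 
the NN, which is an open question here.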

What I've long wanted to do is invert the block report processing.  The NN 
would send BRs to the DN, and the DN would reconcile inconsistencies via IBRs.  
I haven't thought through it beyond the concept, but I digress.
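
Purely to illustrate the concept, the DN-side reconciliation could look 
something like the sketch below.  It assumes the NN's view of a storage arrives 
as a plain set of block ids and ignores generation stamps, replica states and 
races entirely.  All names are hypothetical, and none of this is existing HDFS 
code.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch: the DN compares the NN's view of its blocks with the
// replicas it actually holds and derives what it would report back.
class InvertedBlockReport {

    // Replicas the DN holds that the NN does not know about; the DN would
    // report these as received.
    static SortedSet<Long> unknownToNameNode(Set<Long> nnView, Set<Long> dnReplicas) {
        SortedSet<Long> result = new TreeSet<>(dnReplicas);
        result.removeAll(nnView);
        return result;
    }

    // Blocks the NN believes are on this DN but are not; the DN would
    // report these as deleted.
    static SortedSet<Long> missingOnDataNode(Set<Long> nnView, Set<Long> dnReplicas) {
        SortedSet<Long> result = new TreeSet<>(nnView);
        result.removeAll(dnReplicas);
        return result;
    }

    public static void main(String[] args) {
        Set<Long> nnView = Set.of(-7L, 1L, 2L, 3L);
        Set<Long> dnReplicas = Set.of(-7L, 2L, 3L, 4L);
        System.out.println("report as received: " + unknownToNameNode(nnView, dnReplicas));
        System.out.println("report as deleted:  " + missingOnDataNode(nnView, dnReplicas));
    }
}
```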

> Segmented Block Reports
> -----------------------
>                 Key: HDFS-11313
>                 URL: https://issues.apache.org/jira/browse/HDFS-11313
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.2
>            Reporter: Konstantin Shvachko
> Block reports from a single DataNode can be currently split into multiple 
> RPCs each reporting a single DataNode storage (disk). The reports are still 
> large since disks are getting bigger. Splitting blockReport RPCs into 
> multiple smaller calls would improve NameNode performance and overall HDFS 
> stability.
> This was discussed in multiple jiras. Here the approach is to let NameNode 
> divide blockID space into segments and then ask DataNodes to report replicas 
> in a particular range of IDs.
