[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339695#comment-14339695 ]
Colin Patrick McCabe commented on HDFS-7836:
--------------------------------------------
bq. The hashing scheme should probably not be that simple \[as mod 5\]. Block
IDs are sequentially allocated so it is not hard to think of pathological app
behavior causing skewed block distribution across stripes over time.
Hmm. Our sequential block allocations should guarantee that mod N produces an
approximately equal number of blocks in each stripe. It is only with randomly
allocated block IDs that we could even theoretically get an imbalance (although
the probability is vanishingly small even there if the randomness is uniform).
With sequentially allocated block IDs, the stripe sizes will differ by at most
one block. I guess deletions of blocks could change that, but I see no reason
why blocks in one group mod N should be deleted more often than those in another.
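
To illustrate the argument above, here is a minimal sketch that counts blocks per stripe under `block_id % N`. It is not HDFS code: the helper names, the starting ID, the block count, and the 30% deletion rate are all assumptions chosen for the illustration, and deletions are modeled as independent of the block ID, which is exactly the assumption the comment makes.

```python
import random
from collections import Counter

NUM_STRIPES = 5             # the "mod 5" from the quoted example
START_ID = 1_073_741_825    # hypothetical first block ID
NUM_BLOCKS = 100_003        # hypothetical block count, not a multiple of 5


def stripe_counts(block_ids, num_stripes):
    """Return the number of block IDs that land in each stripe (id mod N)."""
    counts = Counter(block_id % num_stripes for block_id in block_ids)
    return [counts.get(stripe, 0) for stripe in range(num_stripes)]


def imbalance(counts):
    """Difference between the fullest and the emptiest stripe."""
    return max(counts) - min(counts)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Case 1: sequentially allocated IDs.
sequential_ids = list(range(START_ID, START_ID + NUM_BLOCKS))
seq_counts = stripe_counts(sequential_ids, NUM_STRIPES)

# Case 2: uniformly random 63-bit IDs.
random_ids = [rng.getrandbits(63) for _ in range(NUM_BLOCKS)]
rand_counts = stripe_counts(random_ids, NUM_STRIPES)

# Case 3: sequential IDs after deleting about 30% of blocks, with each
# deletion decided independently of the block ID (an assumption).
survivors = [b for b in sequential_ids if rng.random() >= 0.3]
del_counts = stripe_counts(survivors, NUM_STRIPES)

print("sequential:", seq_counts, "imbalance", imbalance(seq_counts))
print("random:    ", rand_counts, "imbalance", imbalance(rand_counts))
print("deletions: ", del_counts, "imbalance", imbalance(del_counts))
```

For a contiguous range of IDs the stripe counts can differ by at most one block, whatever the starting ID. The random and deletion cases only stay balanced statistically, and the deletion case says nothing about workloads whose deletions correlate with the block ID modulo N.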
> BlockManager Scalability Improvements
> -------------------------------------
>
> Key: HDFS-7836
> URL: https://issues.apache.org/jira/browse/HDFS-7836
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Charles Lamb
> Assignee: Charles Lamb
> Attachments: BlockManagerScalabilityImprovementsDesign.pdf
>
>
> Improvements to BlockManager scalability.