[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856219#action_12856219 ]
Rodrigo Schmidt commented on HDFS-1094:
---------------------------------------

I think Brian is right. Dhruba's proposal doesn't change the probability of data loss given that F machines fail. Where you place a block only affects the failure probability of that block, and only if nodes have different probabilities of failure.

Placing blocks on a circular arrangement (Dhruba's proposal) or randomly doesn't affect the expected number of missing blocks given F failures, because both are uniform placement strategies. And if we go for a non-uniform placement strategy, we will only make the worst case worse.

What I mean is that the current block placement strategy is already optimal, and we can't improve the expected number of missing blocks by changing block placement alone. We can only improve it by changing the redundancy scheme that we use (e.g., adding error-correcting codes or more replicas).

If you are wondering what the expected number of blocks that go missing given F failures is, here is the math.

Consider an r-way replication scheme and the following variables:

  r = number of replicas used
  F = number of machines that fail concurrently
  N = number of machines
  B = number of blocks (without counting replication)

I use the prefix notation C(N,a) to denote the number of ways you can pick a elements out of N.

With N machines, we have C(N,r) possible replica placements for a single block. Given that F machines fail, we have C(F,r) possible placements within those F failed machines. If the block placement strategy is uniform (i.e., it doesn't favor some specific nodes or racks more than others), the probability of a block having all its replicas within the failed machines is

  C(F,r) / C(N,r) =

      F * (F-1) * (F-2) * ... * (F - r + 1)
      -------------------------------------
      N * (N-1) * (N-2) * ... * (N - r + 1)

The expected number of missing blocks when F machines fail should be

  B * C(F,r) / C(N,r)

If B = 30M, N = 1000, F = 3 and r = 3, then C(F,r) / C(N,r) = 1 / 166,167,000, and the expected number of missing blocks would be about 0.18.

Someone please correct me if my math or rationale is wrong.

> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>                 Key: HDFS-1094
>                 URL: https://issues.apache.org/jira/browse/HDFS-1094
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and
> the other two replicas are on any two random nodes on a random remote rack.
> This means that if any three datanodes die together, then there is a
> non-trivial probability of losing at least one block in the cluster. This
> JIRA is to discuss if there is a better algorithm that can lower probability
> of losing a block.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
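
For anyone who wants to check the arithmetic in the comment, here is a minimal Python sketch of the expected-loss formula B * C(F,r) / C(N,r). It assumes uniform placement, as the comment does. The function and parameter names are made up for illustration and are not part of HDFS.

```python
from math import comb  # available in Python 3.8+


def loss_probability(n_machines, n_failed, replicas):
    """Probability that a single block has all of its replicas on the
    failed machines, assuming uniform placement: C(F,r) / C(N,r)."""
    # comb(F, r) is 0 when F < r, so fewer failures than replicas
    # gives probability 0.
    return comb(n_failed, replicas) / comb(n_machines, replicas)


def expected_missing_blocks(n_blocks, n_machines, n_failed, replicas):
    """Expected number of blocks lost: B * C(F,r) / C(N,r)."""
    return n_blocks * loss_probability(n_machines, n_failed, replicas)


# The example from the comment: B = 30M, N = 1000, F = 3, r = 3.
print(expected_missing_blocks(30_000_000, 1000, 3, 3))  # about 0.18
```

Note that this only evaluates the expectation under the uniform-placement assumption; it says nothing about how a non-uniform strategy would change the distribution of losses.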