[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856223#action_12856223 ]
Brian Bockelman commented on HDFS-1094:
---------------------------------------

A funny related anecdote that I've heard third-hand. I could never trace down its authenticity, but I found it amusing: a large physics experiment once tried to track down and classify as many errors in their simulation software as possible. After they removed the known sources of error, they took the remaining unreproducible errors and mapped them to the worker nodes. Then they took the list of worker nodes and mapped them to their locations in the machine room. Sure enough, all the unreproducible errors could be traced to the top two nodes in the rack.

So, if you put all the copies at the same height on the rack, the probability of losing the files at the top of the rack is definitely higher than the probability of losing the ones at the bottom.

> Intelligent block placement policy to decrease probability of block loss
> ------------------------------------------------------------------------
>
>            Key: HDFS-1094
>            URL: https://issues.apache.org/jira/browse/HDFS-1094
>        Project: Hadoop HDFS
>     Issue Type: Improvement
>     Components: name-node
>       Reporter: dhruba borthakur
>       Assignee: dhruba borthakur
>
> The current HDFS implementation specifies that the first replica is local and the other two replicas are on any two random nodes on a random remote rack. This means that if any three datanodes die together, then there is a non-trivial probability of losing at least one block in the cluster. This JIRA is to discuss if there is a better algorithm that can lower the probability of losing a block.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
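To make the "non-trivial probability" claim in the quoted description concrete, here is a back-of-the-envelope sketch (not HDFS's actual placement logic). It assumes a toy model in which every block's replica set is an independent, uniformly random choice of 3 distinct datanodes, ignoring the local-node and rack constraints; the node and block counts are made-up illustrative values.

```python
from math import comb

def prob_block_loss(nodes, blocks, replicas=3):
    """Probability that at least one block is lost when `replicas`
    specific nodes fail simultaneously, under the toy model where each
    block's replica set is a uniform random choice of `replicas`
    distinct nodes (illustrative only, not HDFS's real policy)."""
    # Chance that one given block's replicas are exactly the dead nodes.
    per_block = 1 / comb(nodes, replicas)
    # Chance that at least one of `blocks` independent blocks is hit.
    return 1 - (1 - per_block) ** blocks

# 100 datanodes, 10 million blocks: under fully random placement,
# losing any 3 nodes together almost surely loses at least one block.
p = prob_block_loss(100, 10_000_000)
```

The point of the sketch: with C(100, 3) = 161,700 possible replica sets and millions of blocks, nearly every 3-node combination holds some block's only copies, so any simultaneous 3-node failure is almost guaranteed to lose data. Constraining placement to fewer distinct replica groups trades a lower chance of any loss for more blocks lost when it does happen.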