Hey there, I've set up rack awareness on my Hadoop cluster with a replication factor of 3. I have 2 racks, each containing 50% of the nodes. I can see that the blocks are spread across the 2 racks; the problem is that one rack always ends up holding 2 of the 3 replicas and the other rack just one. If I run the Hadoop balancer script, it properly spreads the replicas across the 2 racks, leaving all nodes with exactly the same available disk space. But after jobs have been running for hours, the data becomes unbalanced again (every node in rack1 has less free disk space than every node in rack2).
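To make the numbers concrete, here is a toy model of what I think I'm observing. It is only a sketch under an assumption I have not verified against my Hadoop version: that each block gets 1 replica on the rack of the node writing it and 2 replicas on the other rack. The function `simulate` and the parameter `writer_rack2_fraction` are made up for illustration and are not part of Hadoop.

```python
import random


def simulate(num_blocks, writer_rack2_fraction, seed=0):
    """Count replicas per rack for a 2-rack cluster with replication 3.

    ASSUMED placement rule (not confirmed for any Hadoop version):
    1 replica on the writer's rack, 2 replicas on the other rack.
    writer_rack2_fraction is the share of blocks written from rack2.
    """
    rng = random.Random(seed)
    replicas = {"rack1": 0, "rack2": 0}
    for _ in range(num_blocks):
        # Pick which rack the writing node lives in.
        writer = "rack2" if rng.random() < writer_rack2_fraction else "rack1"
        remote = "rack1" if writer == "rack2" else "rack2"
        replicas[writer] += 1
        replicas[remote] += 2
    return replicas


if __name__ == "__main__":
    # All writes come from rack2: rack1 stores twice as much as rack2.
    print(simulate(10_000, 1.0))
    # Writes split evenly: the two racks end up roughly equal.
    print(simulate(10_000, 0.5))
```

If this model is anywhere near right, a 2:1 skew like mine would show up whenever most writes originate from nodes in a single rack, and it would come back after every balancer run.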
Any clue what's going on? Thanks in advance