I am reading Hadoop: The Definitive Guide, 2nd Edition, and I am struggling to figure out the exact formula Hadoop uses to calculate network distance (pages 64-65). I have my guesses, but I would like to know the exact formula.
The book gives an example with the following distances:

For example, imagine a node n1 on rack r1 in data center d1. This can be represented as /d1/r1/n1. Using this notation, here are the distances for the four scenarios:

• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)
• distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)
• distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the same data center)
• distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers)

There is an illustration in the book as well. Here is the link to it:
http://books.google.com/books?id=Nff49D7vnJcC&lpg=PA65&ots=IidrYuayXs&dq=hadoop%20network%20distance%20calculation&pg=PA65#v=onepage&q=hadoop%20network%20distance%20calculation&f=false

If nodes on different racks are at distance 4 and nodes on the same rack are at distance 2, what is the distance between any other pair of nodes on the same rack? Is it 2 as well? Can the distance ever be 1?

Thank you,
Edmon
http://it.toolbox.com/blogs/lim
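
To make my guess concrete, here is a minimal Python sketch. It assumes the distance is the number of tree edges from each node up to their closest common ancestor, added together. The function name network_distance is my own (hypothetical, not Hadoop's API), and I have not confirmed this against the Hadoop source. It only reproduces the four values quoted from the book.

```python
def network_distance(a: str, b: str) -> int:
    """Guessed rule: edges from a up to the closest common ancestor,
    plus edges from b up to that same ancestor."""
    # Split "/d1/r1/n1" into ["d1", "r1", "n1"].
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")

    # Count how many leading path components the two locations share.
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1

    # Whatever lies below the shared prefix is the climb on each side.
    return (len(pa) - common) + (len(pb) - common)


# The four scenarios from the book.
print(network_distance("/d1/r1/n1", "/d1/r1/n1"))  # same node
print(network_distance("/d1/r1/n1", "/d1/r1/n2"))  # same rack
print(network_distance("/d1/r1/n1", "/d1/r2/n3"))  # same data center
print(network_distance("/d1/r1/n1", "/d2/r3/n4"))  # different data centers
```

If this guess is right, any two distinct nodes on the same rack are at distance 2, and a distance of 1 could only occur between a node and its own rack (for example /d1/r1 and /d1/r1/n1), never between two nodes at the same depth.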
