Hi folks, I have a question regarding HDFS's load balancing when it chooses target datanodes for a block. From the code, it seems it makes the decision based on load information from previous heartbeats. Since heartbeats arrive only every 3 seconds, within that window we may end up putting more load on some datanodes than on others. I noticed that for disk space balancing, the namenode maintains scheduled-block information for each datanode, which is updated whenever a new block is assigned to that datanode. Shouldn't we do a similar thing for traffic?
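
To make the idea concrete, here is a rough sketch of what I have in mind. It is only an illustration: the class and method names (PendingLoadTracker, onHeartbeat, effectiveLoad, chooseTarget) are hypothetical and are not taken from the HDFS code base, and it assumes, as a simplification, that each heartbeat already reflects every transfer assigned before it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the HDFS source.
class PendingLoadTracker {
    // Load (e.g. number of active transfers) reported in the last heartbeat.
    private final Map<String, Integer> reportedLoad = new HashMap<>();
    // Blocks assigned since that heartbeat, not yet visible in reportedLoad.
    private final Map<String, Integer> scheduledSinceHeartbeat = new HashMap<>();

    // Assumption: a heartbeat carries fresh load that already includes
    // earlier assignments, so the pending count is reset to zero.
    void onHeartbeat(String datanode, int load) {
        reportedLoad.put(datanode, load);
        scheduledSinceHeartbeat.put(datanode, 0);
    }

    // Last reported load plus whatever was scheduled since then.
    int effectiveLoad(String datanode) {
        return reportedLoad.getOrDefault(datanode, 0)
             + scheduledSinceHeartbeat.getOrDefault(datanode, 0);
    }

    // Pick the candidate with the lowest effective load (first one wins on
    // a tie) and record the assignment immediately, so the next choice in
    // the same heartbeat window sees it. Returns null for an empty list.
    String chooseTarget(List<String> candidates) {
        String best = null;
        for (String dn : candidates) {
            if (best == null || effectiveLoad(dn) < effectiveLoad(best)) {
                best = dn;
            }
        }
        if (best != null) {
            scheduledSinceHeartbeat.merge(best, 1, Integer::sum);
        }
        return best;
    }
}
```

With two idle datanodes, consecutive calls to chooseTarget within one heartbeat window would alternate between them instead of repeatedly picking the same node.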
Thanks, Sangmin Lee
