Balancer improvement
--------------------

                 Key: HDFS-1105
                 URL: https://issues.apache.org/jira/browse/HDFS-1105
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Dmytro Molkov


We have been seeing several problems with the balancer in our cluster:

1) It can get stuck during an iteration, and only restarting it helps.
2) Iterations are highly inefficient. In a 20-minute iteration it moves 7K 
blocks a minute for the first 6 minutes and only hundreds of blocks in the 
remaining 14 minutes.
3) It can hit the namenode and the network pretty hard.

As a result, we came up with a few improvements that make the running time of 
an iteration more deterministic, improve efficiency, and make the load 
configurable:

1) Make many of the constants configurable as command-line parameters: 
iteration length, and the number of blocks to move in parallel to a given node 
and in the cluster overall.
2) Terminate transfers that are still in progress after the iteration is over.
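
To illustrate terminating in-progress transfers once the iteration is over, 
here is a minimal sketch. It is not the code from the patch: the class 
IterationRunner, the method runIteration and its parameters are hypothetical 
names, and a move is modeled as a plain Runnable. It relies only on standard 
java.util.concurrent calls.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: bound an iteration in time and cancel leftover moves. */
final class IterationRunner {

    /**
     * Runs the given moves on a fixed-size pool for at most iterationMillis.
     * Returns true if every move finished in time, false if some were cut off.
     */
    static boolean runIteration(List<Runnable> moves, int moverThreads,
                                long iterationMillis) {
        ExecutorService movers = Executors.newFixedThreadPool(moverThreads);
        for (Runnable move : moves) {
            movers.submit(move);
        }
        movers.shutdown(); // accept no new moves for this iteration

        boolean finished = false;
        try {
            finished = movers.awaitTermination(iterationMillis,
                                               TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        if (!finished) {
            // Interrupts running moves and discards the ones still queued.
            movers.shutdownNow();
        }
        return finished;
    }
}
```

Note that shutdownNow() only interrupts the worker threads, so a real transfer 
would have to react to the interrupt (or close its socket) to actually stop.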

Previously, the iteration time was only the window in which the balancer 
scheduled moves; it would then wait indefinitely for those moves to finish. 
Each scheduling task could also run for the full iteration time or even 
longer, so with many long-running tasks the actual iterations took longer 
than 20 minutes. Now each scheduling task is given the start time of the 
iteration and schedules moves only while it has not run out of time. Tasks 
that start after the iteration is over do not schedule any moves.
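
The deadline check described above could look roughly like this. This is a 
sketch under assumptions, not the patch itself: DeadlineBoundScheduler, 
outOfTime and schedule are hypothetical names, moves are represented as 
strings, and the clock is passed in so the behavior is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a scheduling task that respects the iteration deadline. */
final class DeadlineBoundScheduler {
    private final long iterationStartMillis;
    private final long iterationLengthMillis;

    DeadlineBoundScheduler(long iterationStartMillis, long iterationLengthMillis) {
        this.iterationStartMillis = iterationStartMillis;
        this.iterationLengthMillis = iterationLengthMillis;
    }

    /** True once the iteration that started at iterationStartMillis is over. */
    boolean outOfTime(long nowMillis) {
        return nowMillis - iterationStartMillis >= iterationLengthMillis;
    }

    /**
     * Schedules pending moves one by one and stops as soon as the deadline
     * has passed. A task that starts after the deadline schedules nothing.
     */
    List<String> schedule(List<String> pendingMoves, LongSupplier clock) {
        List<String> scheduled = new ArrayList<>();
        for (String move : pendingMoves) {
            if (outOfTime(clock.getAsLong())) {
                break; // iteration is over: leave the rest for the next one
            }
            scheduled.add(move);
        }
        return scheduled;
    }
}
```

In real use the clock would be something like System::currentTimeMillis, and 
every scheduling task of one iteration would share the same start time.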

The number of move threads and dispatch threads is configurable, so you can 
slow the balancer down depending on the load of the cluster.
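
Alongside the thread counts, the limits on parallel moves to a given node and 
in the cluster overall could be enforced with two levels of semaphores. The 
following is only an illustration of that idea; MoveThrottle, tryStartMove, 
finishMove and the node names are hypothetical and do not come from the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: cap concurrent moves per target node and cluster-wide. */
final class MoveThrottle {
    private final Semaphore clusterPermits;
    private final int maxMovesPerNode;
    private final Map<String, Semaphore> nodePermits = new ConcurrentHashMap<>();

    MoveThrottle(int maxClusterMoves, int maxMovesPerNode) {
        this.clusterPermits = new Semaphore(maxClusterMoves);
        this.maxMovesPerNode = maxMovesPerNode;
    }

    /** Tries to reserve a slot for a move to targetNode; never blocks. */
    boolean tryStartMove(String targetNode) {
        Semaphore node = nodePermits.computeIfAbsent(
                targetNode, n -> new Semaphore(maxMovesPerNode));
        if (!node.tryAcquire()) {
            return false; // this node already has its maximum of moves
        }
        if (!clusterPermits.tryAcquire()) {
            node.release(); // cluster is saturated: give the node slot back
            return false;
        }
        return true;
    }

    /** Releases the slot; call once for each successful tryStartMove. */
    void finishMove(String targetNode) {
        clusterPermits.release();
        nodePermits.get(targetNode).release();
    }
}
```

Lowering either limit makes the balancer gentler on the namenode and the 
network at the cost of moving fewer blocks per iteration.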

I will attach a patch. Please let me know what you think and what could be 
done better.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
