[ 
https://issues.apache.org/jira/browse/HADOOP-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515423
 ] 

Hairong Kuang edited comment on HADOOP-1652 at 7/25/07 2:30 PM:
----------------------------------------------------------------

Here are some of my initial thoughts. Please comment.

1. What's balance?
A cluster is balanced iff it contains no under-capacity or over-capacity data 
nodes.
An under-capacity data node is a node whose %used space is less than 
avg_%used_space - threshold.
An over-capacity data node is a node whose %used space is greater than 
avg_%used_space + threshold.
The threshold is user configurable. A default value could be 20% of %used space.
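
To make the definitions concrete, here is a minimal sketch of the classification. 
The class and method names (BalanceClassifier, classify, isBalanced) are 
hypothetical and not taken from the Hadoop code base, and the sketch assumes the 
threshold is expressed in absolute percentage points of %used space, which is 
only one possible reading of the default suggested above.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Hadoop code base. */
final class BalanceClassifier {
    enum State { UNDER_CAPACITY, BALANCED, OVER_CAPACITY }

    /** Average of the per-node %used values (each assumed to be in 0..100). */
    static double averagePercentUsed(Map<String, Double> percentUsedByNode) {
        return percentUsedByNode.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }

    /** Applies the under-capacity / over-capacity definitions to one node. */
    static State classify(double percentUsed, double avgPercentUsed, double threshold) {
        if (percentUsed < avgPercentUsed - threshold) {
            return State.UNDER_CAPACITY;
        }
        if (percentUsed > avgPercentUsed + threshold) {
            return State.OVER_CAPACITY;
        }
        return State.BALANCED;
    }

    /** A cluster is balanced iff no node is under-capacity or over-capacity. */
    static boolean isBalanced(Map<String, Double> percentUsedByNode, double threshold) {
        double avg = averagePercentUsed(percentUsedByNode);
        return percentUsedByNode.values().stream()
                .allMatch(u -> classify(u, avg, threshold) == State.BALANCED);
    }
}
```

With nodes at 90%, 50% and 10% used and a threshold of 20, the average is 50%, 
so the first node is over-capacity and the last one is under-capacity.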

2. When to rebalance?
Rebalancing is performed on demand. An administrator issues a command to 
trigger rebalancing. Rebalancing automatically shuts off once the cluster is 
balanced and can also be interrupted by an administrator. The following 
commands are to be supported:
Hadoop dfsadmin balance <start/stop/get>
                  -----Start/stop data block rebalancing or query its status. 

3. How to balance?
    (a) Upon receiving a data block rebalancing request, a name node creates a 
Balancing thread. 
    (b) The thread performs rebalancing iteratively. 
          # At each iteration, it scans the whole data node list and schedules 
block moving tasks. It sleeps for a heartbeat interval between iterations.
          # When scanning the data node list, if it finds an under-capacity 
data node, it schedules block moves to that data node. The source data node is 
chosen randomly from the over-capacity data nodes or, if no over-capacity data 
node exists, from the non-under-capacity data nodes. The source block is chosen 
randomly from the source data node, as long as moving it does not violate 
requirement (1).
          # If the thread finds an over-capacity data node, it schedules block 
moves from that data node to other data nodes. It chooses a target data node 
randomly from the under-capacity data nodes or, if no under-capacity data node 
exists, from the non-over-capacity data nodes. It then randomly chooses a 
source block whose move does not violate requirement (1). 
          # The scheduled tasks are put into a queue in the source data node. 
The task queue has a limited length of 4 by default and is configurable.
          # The scheduled tasks are sent to data nodes for execution in response 
to a heartbeat message. Currently dfs sends at most 2 tasks per heartbeat by 
default.
    (c) The thread stops and frees itself when the cluster becomes balanced.
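
The scan described in (b) could be sketched roughly as follows. This is only an 
illustration under stated assumptions: BalancerSketch, Node, scheduleOnce and 
tasksForHeartbeat are hypothetical names, a task is reduced to a 
"source->target" string, and the choice of the source block together with the 
check against requirement (1) is left out. The sleep between iterations and the 
stop condition in (c) are not modeled either.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Random;

/** Hypothetical sketch of one scheduling scan; not the actual implementation. */
final class BalancerSketch {
    static final int MAX_QUEUE_LENGTH = 4;        // default queue length from the proposal
    static final int MAX_TASKS_PER_HEARTBEAT = 2; // default per-heartbeat limit from the proposal

    static final class Node {
        final String name;
        final double percentUsed;
        final Queue<String> moveQueue = new ArrayDeque<>(); // tasks queued on the source node

        Node(String name, double percentUsed) {
            this.name = name;
            this.percentUsed = percentUsed;
        }
    }

    /** Scans the node list once and returns the number of tasks scheduled. */
    static int scheduleOnce(List<Node> nodes, double threshold, Random rnd) {
        double avg = nodes.stream().mapToDouble(n -> n.percentUsed).average().orElse(0.0);
        List<Node> under = new ArrayList<>();
        List<Node> over = new ArrayList<>();
        List<Node> notUnder = new ArrayList<>();
        List<Node> notOver = new ArrayList<>();
        for (Node n : nodes) {
            (n.percentUsed < avg - threshold ? under : notUnder).add(n);
            (n.percentUsed > avg + threshold ? over : notOver).add(n);
        }
        int scheduled = 0;
        // Under-capacity target: source is an over-capacity node if any, else non-under-capacity.
        for (Node target : under) {
            if (enqueue(pick(over.isEmpty() ? notUnder : over, rnd), target)) {
                scheduled++;
            }
        }
        // Over-capacity source: target is an under-capacity node if any, else non-over-capacity.
        for (Node source : over) {
            if (enqueue(source, pick(under.isEmpty() ? notOver : under, rnd))) {
                scheduled++;
            }
        }
        return scheduled;
    }

    /** Picks a random candidate, or null when there is none. */
    static Node pick(List<Node> candidates, Random rnd) {
        return candidates.isEmpty() ? null : candidates.get(rnd.nextInt(candidates.size()));
    }

    /** Queues a move on the source node unless its queue is already full. */
    static boolean enqueue(Node source, Node target) {
        if (source == null || target == null || source.moveQueue.size() >= MAX_QUEUE_LENGTH) {
            return false;
        }
        source.moveQueue.add(source.name + "->" + target.name);
        return true;
    }

    /** Removes and returns the tasks handed to a node in reply to one heartbeat. */
    static List<String> tasksForHeartbeat(Node source) {
        List<String> tasks = new ArrayList<>();
        while (tasks.size() < MAX_TASKS_PER_HEARTBEAT && !source.moveQueue.isEmpty()) {
            tasks.add(source.moveQueue.poll());
        }
        return tasks;
    }
}
```

The bounded queue and the per-heartbeat limit are what throttle the work in 
this sketch: once a source node has 4 pending tasks, further scans schedule 
nothing for it until a heartbeat drains the queue.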



> Rebalance data blocks when new data nodes added or data nodes become full
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-1652
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1652
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.13.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.15.0
>
>
> When a new data node joins an hdfs cluster, it does not hold much data. So any 
> map task assigned to the machine most likely does not read local data, thus 
> increasing the use of network bandwidth. On the other hand, when some data 
> nodes become full, new data blocks are placed only on non-full data nodes, 
> thus reducing their read parallelism. 
> This jira aims to find an approach to redistribute data blocks when imbalance 
> occurs in the cluster. A solution should meet the following requirements:
> 1. It maintains data availability guarantees in the sense that rebalancing 
> does not reduce the number of replicas that a block has or the number of 
> racks on which the block resides.
> 2. An administrator should be able to invoke and interrupt rebalancing from a 
> command line.
> 3. Rebalancing should be throttled so that rebalancing does not cause a 
> namenode to be too busy to serve any incoming request or saturate the network.
