[jira] Commented: (HADOOP-4116) Balancer should provide better resource management

Hairong Kuang (JIRA) Fri, 12 Sep 2008 17:10:07 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630713#action_12630713
 ]


Hairong Kuang commented on HADOOP-4116:
---------------------------------------

Proposed changes to the Balancer:
1. Remove the use of Semaphore at DataNodes. Instead, a DataNode uses a counter 
to manage the number of concurrent block moves. If a block move request 
arrives while the maximum number of block moves is in progress, the DataNode 
rejects the request immediately.
2. Let the receiver initiate the block move; the sender rejects the request 
when the maximum number has already been reached. As a result, when either the 
sender or the receiver lacks the resources to handle a block move, the block 
content is not transferred across the network.
3. The balancer does not set a timeout on a socket. Instead, it sets the 
KeepAlive option on the socket. A block move therefore does not time out no 
matter how slowly it proceeds, and the next phase of scheduling does not start 
while a block move is pending.
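
To illustrate item 1, here is a minimal sketch of a non-blocking counter that 
could stand in for the semaphore. It is not taken from the patch: the class 
name BlockMoveThrottle and its methods are hypothetical, and the limit is 
assumed to be a fixed value passed in at construction time.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Hadoop): caps the number of concurrent
 * block moves and rejects extra requests instead of parking a thread.
 */
final class BlockMoveThrottle {
    private final int maxConcurrentMoves;
    private final AtomicInteger activeMoves = new AtomicInteger(0);

    BlockMoveThrottle(int maxConcurrentMoves) {
        if (maxConcurrentMoves <= 0) {
            throw new IllegalArgumentException("maxConcurrentMoves must be positive");
        }
        this.maxConcurrentMoves = maxConcurrentMoves;
    }

    /**
     * Tries to reserve a slot for one block move.
     * Returns false immediately when the limit is reached; never blocks.
     */
    boolean tryAcquire() {
        while (true) {
            int current = activeMoves.get();
            if (current >= maxConcurrentMoves) {
                return false; // caller should reject the request right away
            }
            if (activeMoves.compareAndSet(current, current + 1)) {
                return true;
            }
            // Another thread changed the counter; re-read and retry.
        }
    }

    /** Frees the slot taken by a successful tryAcquire(). */
    void release() {
        activeMoves.decrementAndGet();
    }

    /** Number of block moves currently holding a slot. */
    int inProgress() {
        return activeMoves.get();
    }
}
```

A request handler would call tryAcquire() on arrival, reply with an error if 
it returns false, and otherwise perform the move inside a try block whose 
finally clause calls release(). Because a rejected request returns at once, it 
does not hold a DataNode thread while waiting for a slot.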

> Balancer should provide better resource management
> --------------------------------------------------
>
>                 Key: HADOOP-4116
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4116
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Raghu Angadi
>            Assignee: Hairong Kuang
>
> The number of threads is currently limited on datanodes. Once these threads 
> are occupied, DataNode does not accept any more requests (DOS). Recently we 
> saw a case where most of the 256 threads were waiting in 
> {{DataXceiver.replaceBlock()}} trying to acquire {{balancingSem}}. Since 
> rebalancing is (heavily) throttled, I would think this would be the common 
> case. 
> These operations waiting for active rebalancing threads to finish need not 
> take up a thread. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
