[ 
https://issues.apache.org/jira/browse/HDFS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated HDFS-6441:
-------------------------------

    Attachment: HDFS-6441.patch

Attaching a newer patch based on the comments received.

The following changes were made:

1. Separate sets are maintained for nodesToBeExcluded and nodesToBeIncluded.

2. The nodes used for balancing are computed as follows:

If nodesToBeIncluded is specified:
  nodesForBalancing = ({ nodes returned by namenode } INTERSECT
                       nodesToBeIncluded) - nodesToBeExcluded

If nodesToBeIncluded is not specified:
  nodesForBalancing = { nodes returned by namenode } - nodesToBeExcluded

3. getPeerHostName is used to check membership in nodesToBeIncluded or 
nodesToBeExcluded. If getPeerHostName returns null (as it does in test cases), 
getHostName is used. I believe this is secure, since we are always 
including/excluding from the set of data nodes returned by the namenode.

4. DEFAULT is now spelled correctly. This required changes in a few existing 
test cases.
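
For reviewers, the selection rule in items 2 and 3 can be sketched roughly as
below. This is an illustration only and is not the code in HDFS-6441.patch:
the class BalancerNodeFilter and its method names are hypothetical, nodes are
represented as plain host-name strings, and "not specified" is assumed to mean
a null or empty include set.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the patch.
final class BalancerNodeFilter {
    private BalancerNodeFilter() {}

    // Item 3: use the peer host name for membership checks, falling back
    // to the host name when the peer host name is null.
    static String membershipKey(String peerHostName, String hostName) {
        return peerHostName != null ? peerHostName : hostName;
    }

    // Item 2: start from the nodes reported by the namenode, intersect with
    // the include set if one was given, then subtract the exclude set.
    // Assumption: a null or empty include set means "not specified".
    static Set<String> nodesForBalancing(Set<String> reported,
                                         Set<String> included,
                                         Set<String> excluded) {
        Set<String> result = new LinkedHashSet<>(reported);
        if (included != null && !included.isEmpty()) {
            result.retainAll(included);   // INTERSECT nodesToBeIncluded
        }
        if (excluded != null) {
            result.removeAll(excluded);   // - nodesToBeExcluded
        }
        return result;
    }
}
```

Because the result always starts from the namenode's list, a host that appears
only in the include set can never be added to the nodes used for balancing,
and a host listed in both sets ends up excluded.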



> Add ability to exclude/include few datanodes while balancing
> ------------------------------------------------------------
>
>                 Key: HDFS-6441
>                 URL: https://issues.apache.org/jira/browse/HDFS-6441
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer
>    Affects Versions: 2.4.0
>            Reporter: Benoy Antony
>            Assignee: Benoy Antony
>         Attachments: HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, 
> HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, 
> HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch
>
>
> In some use cases, it is desirable to ignore a few data nodes while 
> balancing. The administrator should be able to specify a list of data nodes 
> in a file similar to the hosts file, and the balancer should ignore these data 
> nodes while balancing so that no blocks are added/removed on these nodes.
> Similarly, it would be beneficial to specify that only a particular list of 
> datanodes should be considered for balancing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
