[
https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397230#comment-13397230
]
Junping Du commented on HDFS-3498:
----------------------------------
Hey Nicholas,
I just updated the patch in HADOOP-8472, which is the ReplicaPlacementPolicy
implementation for the VM case. Sorry for the confusion caused by posting this
without the VM implementation part.
Those are great questions; my replies are below:
* How are you going to use LocalityGroup in the VM case?
The current replica removal policy tries to keep replicas robust (after a
removal) at the rack level, so it splits the replica nodes into two categories
according to their racks. I think the algorithm is general enough for other
cases, so I separated the rack-specific part into getLocalityGroupForSplit(),
which can easily be overridden for cases where a locality group other than the
rack should act as the failure group.
In the VM case, I think we still need the rack level to act as the robust
group, so the VM implementation overrides getRack() but not
getLocalityGroupForSplit(); getLocalityGroupForSplit() just makes the policy
extensible for future requirements. If you think it is unnecessary, I am OK
with deleting it and only overriding getRack().
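To illustrate the split described above, here is a minimal sketch, assuming a simplified node-to-rack mapping. The class and method names (ReplicaSplitSketch, split) are illustrative only, not the actual HDFS API; only getLocalityGroupForSplit() mirrors the hook named in this comment:

```java
import java.util.*;

// Hypothetical sketch: nodes whose locality group (rack, by default) also
// holds another replica go into one set, the rest into another. Removing a
// node from the first set does not reduce the number of failure groups
// covered. getLocalityGroupForSplit() is the overridable hook; a subclass
// could return e.g. a nodegroup instead of a rack.
class ReplicaSplitSketch {
    private final Map<String, String> nodeRack; // node -> rack, for this sketch

    ReplicaSplitSketch(Map<String, String> nodeRack) {
        this.nodeRack = nodeRack;
    }

    // Overridable hook: by default the failure group is the rack.
    protected String getLocalityGroupForSplit(String node) {
        return nodeRack.get(node);
    }

    // Split replica nodes: "priSet" = nodes sharing a locality group with
    // another replica; "remains" = nodes alone in their group.
    Map<String, List<String>> split(List<String> replicaNodes) {
        Map<String, Integer> groupCount = new HashMap<>();
        for (String n : replicaNodes) {
            groupCount.merge(getLocalityGroupForSplit(n), 1, Integer::sum);
        }
        List<String> priSet = new ArrayList<>();
        List<String> remains = new ArrayList<>();
        for (String n : replicaNodes) {
            if (groupCount.get(getLocalityGroupForSplit(n)) > 1) {
                priSet.add(n);
            } else {
                remains.add(n);
            }
        }
        Map<String, List<String>> result = new HashMap<>();
        result.put("priSet", priSet);
        result.put("remains", remains);
        return result;
    }
}
```

Overriding only getLocalityGroupForSplit() changes the failure-group definition without touching the split logic itself.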
* Are you going to override pickupReplicaSet(..) in the vm implementation?
If yes, how?
Yes. It will divide the first set (nodes with another replica living in the
same rack) into two subcategories: one set contains the nodes with another
replica in the same nodegroup, and the other set contains the rest. This adds
a little overhead to the whole algorithm, but it stays linear as before.
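A minimal sketch of that override, assuming a simplified node-to-nodegroup mapping. The names here (VmPickupSketch, nodeGroupOf) are illustrative, not the real HDFS signatures; only the preference for same-nodegroup nodes mirrors what this comment describes:

```java
import java.util.*;

// Hypothetical sketch of the VM-case override: within the set of nodes that
// share a rack with another replica, prefer removing a node that also shares
// a *nodegroup* with another replica, since that removal hurts placement
// diversity least.
class VmPickupSketch {
    private final Map<String, String> nodeGroupOf; // node -> nodegroup

    VmPickupSketch(Map<String, String> nodeGroupOf) {
        this.nodeGroupOf = nodeGroupOf;
    }

    // Divide the first set (same-rack nodes) into two subcategories and
    // prefer the same-nodegroup subset; two single passes keep it linear.
    List<String> pickupReplicaSet(List<String> sameRackNodes) {
        Map<String, Integer> groupCount = new HashMap<>();
        for (String n : sameRackNodes) {
            groupCount.merge(nodeGroupOf.get(n), 1, Integer::sum);
        }
        List<String> sameNodeGroup = new ArrayList<>();
        List<String> rest = new ArrayList<>();
        for (String n : sameRackNodes) {
            if (groupCount.get(nodeGroupOf.get(n)) > 1) {
                sameNodeGroup.add(n);
            } else {
                rest.add(n);
            }
        }
        // Prefer nodes whose nodegroup still holds another replica.
        return sameNodeGroup.isEmpty() ? rest : sameNodeGroup;
    }
}
```

Both passes visit each node once, so the override keeps the overall cost linear in the number of replica nodes.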
* Do you know what "priSet" stands for? It does not seem a good name to me.
Sorry, I just reused the old terminology from the previous code. It stands for
the first category of nodes: those with other replicas living in the same
rack, so removing one of these nodes will not reduce the number of racks
holding a replica. Maybe we can go with something like rackReplicatedNodes?
Any suggestions?
Thanks,
Junping
> Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
> ------------------------------------------------------------------------
>
> Key: HDFS-3498
> URL: https://issues.apache.org/jira/browse/HDFS-3498
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 1.0.0, 2.0.0-alpha
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: HDFS-3498.patch,
> Hadoop-8471-BlockPlacementDefault-extensible.patch
>
>
> ReplicaPlacementPolicy is already a pluggable component in Hadoop. A user
> specified ReplicaPlacementPolicy can be specified in the hdfs-site.xml
> configuration under the key "dfs.block.replicator.classname". However, to
> make it possible to reuse code in ReplicaPlacementPolicyDefault a few of its
> methods were changed from private to protected. ReplicaPlacementPolicy and
> BlockPlacementPolicyDefault are currently annotated with
> @InterfaceAudience.Private.