[
https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397230#comment-13397230
]
Junping Du commented on HDFS-3498:
----------------------------------
Hey Nicholas,
I just updated the patch in HADOOP-8472, which is the ReplicaPlacementPolicy
implementation for the VM case. Sorry for the confusion caused by posting this
without the VM implementation part.
Those are great questions; my replies are below:
* How are you going to use LocalityGroup in the VM case?
The current replica removal policy tries to keep replicas robust (after a
removal) at the rack level, so it splits the replica nodes into two categories
according to their racks. I think the algorithm is general enough for other
cases, so I separated the rack-specific part into getLocalityGroupForSplit(),
which can easily be overridden for cases where a locality group other than the
rack should act as the failure group.
In the VM case, I think we still need the rack level to act as the robust
group, so the VM implementation overrides getRack() but not
getLocalityGroupForSplit(); getLocalityGroupForSplit() just makes the policy
extensible for future requirements. If you think it is unnecessary, I am OK
with deleting it and only overriding getRack().
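To illustrate the split described above, here is a minimal sketch, assuming a simplified node-to-rack mapping. The class and method names (ReplicaSplitSketch, split) are illustrative only, not the actual HDFS API; only getLocalityGroupForSplit() mirrors the hook named in this comment:

```java
import java.util.*;

// Hypothetical sketch: nodes whose locality group (rack, by default) also
// holds another replica go into one set, the rest into another. Removing a
// node from the first set does not reduce the number of failure groups
// covered. getLocalityGroupForSplit() is the overridable hook; a subclass
// could return e.g. a nodegroup instead of a rack.
class ReplicaSplitSketch {
    private final Map<String, String> nodeRack; // node -> rack, for this sketch

    ReplicaSplitSketch(Map<String, String> nodeRack) {
        this.nodeRack = nodeRack;
    }

    // Overridable hook: by default the failure group is the rack.
    protected String getLocalityGroupForSplit(String node) {
        return nodeRack.get(node);
    }

    // Split replica nodes: "priSet" = nodes sharing a locality group with
    // another replica; "remains" = nodes alone in their group.
    Map<String, List<String>> split(List<String> replicaNodes) {
        Map<String, Integer> groupCount = new HashMap<>();
        for (String n : replicaNodes) {
            groupCount.merge(getLocalityGroupForSplit(n), 1, Integer::sum);
        }
        List<String> priSet = new ArrayList<>();
        List<String> remains = new ArrayList<>();
        for (String n : replicaNodes) {
            if (groupCount.get(getLocalityGroupForSplit(n)) > 1) {
                priSet.add(n);
            } else {
                remains.add(n);
            }
        }
        Map<String, List<String>> result = new HashMap<>();
        result.put("priSet", priSet);
        result.put("remains", remains);
        return result;
    }
}
```

Overriding only getLocalityGroupForSplit() changes the failure-group definition without touching the split logic itself.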
* Are you going to override pickupReplicaSet(..) in the vm implementation?
If yes, how?
Yes. It will divide the first set (nodes with another replica living in the
same rack) into two subcategories: one set contains the nodes with another
replica in the same nodegroup, and the other set contains the rest. This adds
a little overhead to the whole algorithm, but it stays linear as before.
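A minimal sketch of that override, assuming a simplified node-to-nodegroup mapping. The names here (VmPickupSketch, nodeGroupOf) are illustrative, not the real HDFS signatures; only the preference for same-nodegroup nodes mirrors what this comment describes:

```java
import java.util.*;

// Hypothetical sketch of the VM-case override: within the set of nodes that
// share a rack with another replica, prefer removing a node that also shares
// a *nodegroup* with another replica, since that removal hurts placement
// diversity least.
class VmPickupSketch {
    private final Map<String, String> nodeGroupOf; // node -> nodegroup

    VmPickupSketch(Map<String, String> nodeGroupOf) {
        this.nodeGroupOf = nodeGroupOf;
    }

    // Divide the first set (same-rack nodes) into two subcategories and
    // prefer the same-nodegroup subset; two single passes keep it linear.
    List<String> pickupReplicaSet(List<String> sameRackNodes) {
        Map<String, Integer> groupCount = new HashMap<>();
        for (String n : sameRackNodes) {
            groupCount.merge(nodeGroupOf.get(n), 1, Integer::sum);
        }
        List<String> sameNodeGroup = new ArrayList<>();
        List<String> rest = new ArrayList<>();
        for (String n : sameRackNodes) {
            if (groupCount.get(nodeGroupOf.get(n)) > 1) {
                sameNodeGroup.add(n);
            } else {
                rest.add(n);
            }
        }
        // Prefer nodes whose nodegroup still holds another replica.
        return sameNodeGroup.isEmpty() ? rest : sameNodeGroup;
    }
}
```

Both passes visit each node once, so the override keeps the overall cost linear in the number of replica nodes.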
* Do you know what "priSet" stands for? It does not seem a good name to me.
Sorry, I just reused the old terminology from the previous code. It stands for
the first category of nodes: those with other replicas living in the same
rack, so removing one of these nodes will not reduce the number of racks
holding a replica. Maybe we can go with something like rackReplicatedNodes?
Any suggestions?
Thanks,
Junping
> Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
> ------------------------------------------------------------------------
>
> Key: HDFS-3498
> URL: https://issues.apache.org/jira/browse/HDFS-3498
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 1.0.0, 2.0.0-alpha
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: HDFS-3498.patch,
> Hadoop-8471-BlockPlacementDefault-extensible.patch
>
>
> ReplicaPlacementPolicy is already a pluggable component in Hadoop. A user
> specified ReplicaPlacementPolicy can be specified in the hdfs-site.xml
> configuration under the key "dfs.block.replicator.classname". However, to
> make it possible to reuse code in ReplicaPlacementPolicyDefault a few of its
> methods were changed from private to protected. ReplicaPlacementPolicy and
> BlockPlacementPolicyDefault are currently annotated with
> @InterfaceAudience.Private.