[
https://issues.apache.org/jira/browse/HDFS-15087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009739#comment-17009739
]
Jinglun edited comment on HDFS-15087 at 1/8/20 2:48 AM:
--------------------------------------------------------
Hi [~linyiqun], thanks for your suggestion! I have prepared an initial patch
based on the Xiaomi branch.
The patch contains few code comments, so I have written some notes below for
the important classes.
* FRHelper, FederationRenameV2, FederationRenameV1 (deprecated): serialization
and deserialization.
* FSFedRenameV2Op: handles the saveSubTree and graftSubTree RPCs, similar to
FSDirRenameOp.
* BasicInfo: contains the basic information of an HFR job.
* BlockLinker: does all the hard linking, including collecting the blocks and
hard linking them.
* BlocksToDup: a batch of blocks.
* FsDatasetImpl: the method addBlocksToNewPool is added to hard link replicas.
* IDConsumer: used by the NameNode. After the IDs are pre-allocated, we use
IDConsumer to consume all the allocated IDs.
* LockHelper: helps release and restore all the write locks.
* ParallelConsumer: the superclass of ChecksumDirectoriesChecker.
* ChecksumDirectoriesChecker: after the HardLink phase is done, it verifies the
source path and destination path.
* JobScheduler, Job, JobContext: the Scheduler model in the design doc.
* FederationRenameProcedure: this is the HFR.
* DistCpProcedure: another version of HFR that uses distcp instead of
saveSubTree + graftSubTree + HardLink.
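
To make the IDConsumer note a bit more concrete, here is a minimal sketch of consuming pre-allocated IDs, assuming they form a single contiguous range. The class IdRangeConsumer and its methods are hypothetical names used only for illustration; they are not taken from the patch, and the real IDConsumer may work differently.

```java
import java.util.NoSuchElementException;

/** Hypothetical sketch only; not the IDConsumer class from the patch. */
final class IdRangeConsumer {
    private final long endExclusive; // first id that is NOT part of the range
    private long next;               // next id to hand out

    /** Assumes the ids were pre-allocated as [startInclusive, startInclusive + count). */
    IdRangeConsumer(long startInclusive, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        this.next = startInclusive;
        this.endExclusive = Math.addExact(startInclusive, count);
    }

    /** Hands out the next pre-allocated id, failing once the range is used up. */
    synchronized long nextId() {
        if (next >= endExclusive) {
            throw new NoSuchElementException("pre-allocated ids exhausted");
        }
        return next++;
    }

    /** Number of pre-allocated ids that have not been consumed yet. */
    synchronized long remaining() {
        return endExclusive - next;
    }
}
```

Failing loudly when the range runs out, instead of silently allocating more, is a design choice of this sketch: it would surface a mismatch between the number of IDs reserved up front and the number of objects actually grafted.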
> RBF: Balance/Rename across federation namespaces
> ------------------------------------------------
>
> Key: HDFS-15087
> URL: https://issues.apache.org/jira/browse/HDFS-15087
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Priority: Major
> Attachments: HDFS-15087.initial.patch, HFR_Rename Across Federation
> Namespaces.pdf
>
>
> The Xiaomi storage team has developed a new feature called HFR (HDFS
> Federation Rename) that enables us to balance/rename across federation
> namespaces. The idea is to first move the metadata to the dst NameNode and
> then link all the replicas. It has been running in our largest production
> cluster for 2 months, where we use it to balance the namespaces. HFR has
> turned out to be fast and flexible. The details can be found in the design doc.
> Looking forward to a lively discussion.
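
The "link all the replicas" step in the description relies on file system hard links, so no block data is copied. Below is a minimal sketch of that idea using the standard java.nio.file API, assuming for simplicity that the replicas are plain files sitting directly in one directory. The class ReplicaHardLinkSketch, the method linkAll and the flat directory layout are assumptions made for illustration; they do not reflect the actual DataNode storage layout or the BlockLinker code in the patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not the BlockLinker class from the patch. */
final class ReplicaHardLinkSketch {

    /**
     * Hard links every regular file directly under srcDir into dstDir
     * and returns the paths of the links that were created.
     * Both directories must be on the same file system for hard links to work.
     */
    static List<Path> linkAll(Path srcDir, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        List<Path> links = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir)) {
            for (Path src : stream) {
                if (!Files.isRegularFile(src)) {
                    continue; // this sketch skips subdirectories and special files
                }
                Path link = dstDir.resolve(src.getFileName());
                // createLink(link, existing): the new entry shares the data of src
                Files.createLink(link, src);
                links.add(link);
            }
        }
        return links;
    }
}
```

Because a hard link is only a new directory entry for the same data, the cost of this step depends on the number of files rather than on their size.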