[ 
https://issues.apache.org/jira/browse/HDFS-15087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009739#comment-17009739
 ] 

Jinglun edited comment on HDFS-15087 at 1/8/20 2:48 AM:
--------------------------------------------------------

Hi [~linyiqun], thanks for your suggestion! I prepared an initial patch based 
on the Xiaomi branch.

The patch contains few comments, so I have written some notes below for the 
important classes.
 * FRHelper, FederationRenameV2, FederationRenameV1 (deprecated): Serialization 
and deserialization.
 * FSFedRenameV2Op: Handles the saveSubTree and graftSubTree RPCs, similar to 
FSDirRenameOp.
 * BasicInfo: Contains the basic information of an HFR job.
 * BlockLinker: Does all the hard linking, including collecting the blocks and 
hard-linking them.
 * BlocksToDup: A batch of blocks.
 * FsDatasetImpl: The method addBlocksToNewPool is added to hard link replicas.
 * IDConsumer: Used by the NameNode. After the IDs are pre-allocated, we use 
IDConsumer to consume all the allocated IDs.
 * LockHelper: Helps release and restore all the write locks.
 * ParallelConsumer: The superclass of ChecksumDirectoriesChecker.
 * ChecksumDirectoriesChecker: After the HardLink phase is done, it verifies 
the source path and destination path.
 * JobScheduler, Job, JobContext: The Scheduler model in the design doc.
 * FederationRenameProcedure: This is the HFR.
 * DistCpProcedure: Another version of HFR. It uses distcp instead of 
saveSubTree + graftSubTree + HardLink.
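
To illustrate the collect-then-link idea behind BlockLinker and 
addBlocksToNewPool, here is a minimal sketch. It is not code from the patch: 
the class HardLinkSketch, its methods collect and linkInBatches, and the 
directory and file names are all hypothetical, and it only assumes that 
replica files are hard linked in batches into a directory of the new pool. 
Hard links require the source and destination to be on the same file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical sketch only; this is NOT the BlockLinker class from the patch. */
public class HardLinkSketch {

    /** Collect phase: list the regular files directly under srcDir, sorted by path. */
    public static List<Path> collect(Path srcDir) throws IOException {
        try (Stream<Path> entries = Files.list(srcDir)) {
            List<Path> files = new ArrayList<>();
            entries.filter(Files::isRegularFile).sorted().forEach(files::add);
            return files;
        }
    }

    /** Link phase: hard link every collected file into dstDir, one batch at a time. */
    public static int linkInBatches(List<Path> files, Path dstDir, int batchSize)
            throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Files.createDirectories(dstDir);
        int linked = 0;
        for (int start = 0; start < files.size(); start += batchSize) {
            int end = Math.min(start + batchSize, files.size());
            for (Path existing : files.subList(start, end)) {
                Path link = dstDir.resolve(existing.getFileName());
                // Files.createLink(link, existing) creates a new directory entry
                // for the same underlying file; no data is copied.
                Files.createLink(link, existing);
                linked++;
            }
        }
        return linked;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout for the demo: two fake replica files in a temp directory.
        Path src = Files.createTempDirectory("src-pool");
        Path dst = Files.createTempDirectory("dst-pool").resolve("finalized");
        Files.write(src.resolve("blk_1"), new byte[] {1, 2, 3});
        Files.write(src.resolve("blk_2"), new byte[] {4, 5});
        int linked = linkInBatches(collect(src), dst, 1);
        System.out.println("linked " + linked + " files into " + dst);
    }
}
```

Because a hard link only adds a directory entry, the cost of this phase 
depends on the number of replicas rather than on their size, which is the 
reason linking is cheaper than copying the data.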


was (Author: lijinglun):
Hi [~linyiqun], thanks for your suggestion! I prepared an initial patch based 
on the Xiaomi branch.

The patch contains few comments, so I have written some notes below for the 
important classes.
 * FRHelper, FederationRenameV2, FederationRenameV1 (deprecated): Serialization 
and deserialization.
 * FSFedRenameV2Op: Handles the saveSubTree and graftSubTree RPCs, similar to 
FSDirRenameOp.
 * BasicInfo: Contains the basic information of an HFR job.
 * BlockLinker: Does all the hard linking, including collecting the blocks and 
hard-linking them.
 * BlocksToDup: A batch of blocks.
 * FsDatasetImpl: The method addBlocksToNewPool is added to hard link replicas.
 * IDConsumer: Used by the NameNode. After the IDs are pre-allocated, we use 
IDConsumer to consume all the allocated IDs.
 * LockHelper: Helps release and restore all the write locks.
 * ParallelConsumer: The superclass of ChecksumDirectoriesChecker.
 * ChecksumDirectoriesChecker: After the HardLink phase is done, it verifies 
the source path and destination path.
 * JobScheduler, Job, JobContext: The Scheduler model in the design doc.
 * FederationRenameProcedure: This is the HFR.
 * DistCpProcedure: Uses distcp instead of saveSubTree + graftSubTree + HardLink.

> RBF: Balance/Rename across federation namespaces
> ------------------------------------------------
>
>                 Key: HDFS-15087
>                 URL: https://issues.apache.org/jira/browse/HDFS-15087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jinglun
>            Priority: Major
>         Attachments: HDFS-15087.initial.patch, HFR_Rename Across Federation 
> Namespaces.pdf
>
>
> The Xiaomi storage team has developed a new feature called HFR (HDFS 
> Federation Rename) that enables us to balance/rename across federation 
> namespaces. The idea is to first move the metadata to the destination 
> NameNode and then link all the replicas. It has been working in our largest 
> production cluster for 2 months. We use it to balance the namespaces. It 
> turns out that HFR is fast and flexible. The details can be found in the 
> design doc. 
> Looking forward to a lively discussion.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
