[
https://issues.apache.org/jira/browse/HDFS-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570346#comment-17570346
]
Plamen Jeliazkov commented on HDFS-15195:
-----------------------------------------
Hi [~Amithsha],
I recently had the same idea and believe there is a path forward here if,
instead of taking the approach of HDFS-7702, we build an "enhanced" DistCp that
renames (and transforms) block data in place, coordinated over a communication
path between the two namespaces. DistCp is already the only accepted way to
migrate data between two Namespaces. In-place DataNode block transformations
between two Namespaces (Nameservices?) would simply serve as a much faster form
of DistCp, one that can only be used effectively by Federated NameNodes.
A proposal for "enhanced DistCp":
(1) Sending NameNode keeps the file open with an empty append call.
(2) Receiving NameNode constructs the metadata, reserving the namespace, and
calls addBlock as many times as needed to obtain new block IDs, genstamps, etc.
(3) The new DistCp tells DataNodes to transform the blocks into blocks of the
new Blockpool, via rename, etc.
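To make steps (2) and (3) concrete, here is a minimal Python sketch. It is a toy model, not Hadoop code: the function names (plan_block_mapping, transform_blocks), the flat current/BP-xxx/blk_<id> layout, and the way new block IDs are handed out are all assumptions made for illustration. Step (1), holding the file open on the sending NameNode, is not modeled.

```python
import os
import tempfile


def plan_block_mapping(src_block_ids, first_new_id, new_genstamp):
    """Step (2): pair every source block with a newly allocated (id, genstamp).

    Hypothetical stand-in for the receiving NameNode calling addBlock once
    per source block.
    """
    return {
        src_id: (first_new_id + offset, new_genstamp)
        for offset, src_id in enumerate(src_block_ids)
    }


def transform_blocks(volume_dir, src_pool, dst_pool, mapping):
    """Step (3): rename block files from the source pool dir into the target pool dir.

    Assumes a simplified layout <volume>/current/<pool>/blk_<id>; a real
    DataNode directory tree is deeper and also carries metadata files.
    """
    src_dir = os.path.join(volume_dir, "current", src_pool)
    dst_dir = os.path.join(volume_dir, "current", dst_pool)
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for src_id, (dst_id, _genstamp) in mapping.items():
        dst_path = os.path.join(dst_dir, f"blk_{dst_id}")
        # Same-volume rename: no block data is copied.
        os.rename(os.path.join(src_dir, f"blk_{src_id}"), dst_path)
        moved.append(dst_path)
    return moved
```

In this model the mapping produced in step (2) is the only state the two namespaces need to exchange before the DataNode-side renames in step (3) can run.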
In theory, assuming same-volume renames, it should be very quick, and possible
even if DataNodes are 100% utilized.
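The same-volume assumption can be checked before moving a block file. The sketch below is plain Python, not DataNode code; on_same_volume and move_block_file are hypothetical helpers. It compares device IDs and only falls back to a copy when source and destination sit on different volumes.

```python
import os
import shutil
import tempfile


def on_same_volume(path_a, path_b):
    """True when both paths live on the same device (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_block_file(src_path, dst_dir):
    """Move a block file into dst_dir; report whether it was a rename or a copy."""
    os.makedirs(dst_dir, exist_ok=True)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    if on_same_volume(src_path, dst_dir):
        # Metadata-only operation: needs no extra space for the block data.
        os.rename(src_path, dst_path)
        return "rename"
    # Cross-volume: data must be copied, so free space and time are required.
    shutil.move(src_path, dst_path)
    return "copy"
```

Only the "rename" branch has the property described above: it avoids copying block data, so it does not depend on free space for a second copy.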
Thoughts?
> In place namenode federation
> ----------------------------
>
> Key: HDFS-15195
> URL: https://issues.apache.org/jira/browse/HDFS-15195
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Amithsha
> Priority: Major
>
> In the current scenario, federating existing data is not possible. This
> impacts the implementation of HDFS federation on a production cluster with
> more than a PB of data, because we need to copy the data from the old set of
> namenodes to the new set of namenodes. From the data node directory structure
> it's clear that moving the blocks of the particular data from the
> namenode_set_1 dir (dfs/data/current/BP-xxx) to the namenode_set_2 dir
> (dfs/data/current/BP-yyy) will solve the issue. Why can’t we make this a new
> feature, where it will ask for the dir to get federated and stop the write
> process until the move completes?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)