[
https://issues.apache.org/jira/browse/HDFS-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584147#comment-17584147
]
fanshilun edited comment on HDFS-2139 at 8/25/22 2:16 AM:
----------------------------------------------------------
[~ferhui] [~xuzq_zander] This jira has helped a lot of people. Personally, I
think we should keep its original Assignee; should we create subtasks and
assign those instead?
That is just a personal opinion, not the key point.
Our focus is still the fastcopy feature itself. I hope this feature can help
more partners who use HDFS. Thanks again, [~ferhui] and [~xuzq_zander], for
your contribution to this feature.
was (Author: slfan1989):
[~ferhui] [~xuzq_zander] Personally, this jira has helped a lot of people, I
think we should keep the original Assignee of this jira, should we create
subtasks and assign them?
The above is just a personal opinion, not the key point.
We still focus on the fastcopy feature itself, I hope this feature can help
more partners who use hdfs, thanks again [~ferhui] [~xuzq_zander] for his
contribution to this feature.
> Fast copy for HDFS.
> -------------------
>
> Key: HDFS-2139
> URL: https://issues.apache.org/jira/browse/HDFS-2139
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Pritam Damania
> Assignee: ZanderXu
> Priority: Major
> Attachments: HDFS-2139-For-2.7.1.patch, HDFS-2139.patch,
> HDFS-2139.patch, image-2022-08-11-11-48-17-994.png
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> There is a need to perform fast file copy on HDFS. The fast copy mechanism
> for a file works as follows:
> 1) Query metadata for all blocks of the source file.
> 2) For each block 'b' of the file, find out its datanode locations.
> 3) For each block of the file, add an empty block to the namesystem for
> the destination file.
> 4) For each location of the block, instruct the datanode to make a local
> copy of that block.
> 5) Once each datanode has copied its respective blocks, it reports
> the new blocks to the namenode.
> 6) Wait for all blocks to be copied and exit.
> This would speed up the copying process considerably by eliminating
> top-of-rack data transfers.
> Note: A further improvement would be to instruct the datanode to create a
> hardlink of the block file if we are copying a block on the same datanode.
> [~xuzq_zander] provided a design doc:
> https://docs.google.com/document/d/1OHdUpQmKD3TZ3xdmQsXNmlXJetn2QFPinMH31Q4BqkI/edit?usp=sharing
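
The six steps in the description can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the code in the attached patches: `ToyNameNode`, `ToyDataNode`, `fast_copy`, and `demo` are hypothetical names, each "datanode" is just a local directory, and the copy is synchronous, so step 6 reduces to a final check. The hardlink-with-fallback in `copy_local` mirrors the note about linking block files on the same datanode.

```python
import os
import shutil
import tempfile


class ToyDataNode:
    """Hypothetical stand-in for a datanode: one local directory of block files."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def path(self, block_id):
        return os.path.join(self.root, f"blk_{block_id}")

    def write(self, block_id, data):
        with open(self.path(block_id), "wb") as f:
            f.write(data)

    def read(self, block_id):
        with open(self.path(block_id), "rb") as f:
            return f.read()

    def copy_local(self, src_id, dst_id):
        """Step 4: duplicate a block without any data leaving this node."""
        try:
            os.link(self.path(src_id), self.path(dst_id))  # hardlink if supported
        except OSError:
            shutil.copyfile(self.path(src_id), self.path(dst_id))  # plain local copy


class ToyNameNode:
    """Hypothetical stand-in for the namenode's block metadata."""

    def __init__(self):
        self.files = {}      # file name -> ordered list of block ids
        self.locations = {}  # block id -> set of datanodes holding a replica
        self.next_id = 0

    def allocate(self):
        """Step 3: register a new, still empty block."""
        self.next_id += 1
        self.locations[self.next_id] = set()
        return self.next_id

    def block_received(self, block_id, node):
        """Step 5: a datanode reports that it now holds this block."""
        self.locations[block_id].add(node)


def fast_copy(nn, src, dst):
    nn.files[dst] = []
    expected = []
    for src_block in nn.files[src]:                # step 1: source block list
        nodes = set(nn.locations[src_block])       # step 2: replica locations
        dst_block = nn.allocate()                  # step 3: empty destination block
        nn.files[dst].append(dst_block)
        for node in nodes:
            node.copy_local(src_block, dst_block)  # step 4: local copy per replica
            nn.block_received(dst_block, node)     # step 5: report to the namenode
        expected.append((dst_block, nodes))
    # Step 6: everything above ran synchronously, so just confirm all reports arrived.
    return all(nn.locations[b] == nodes for b, nodes in expected)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        nn = ToyNameNode()
        dn1 = ToyDataNode(os.path.join(tmp, "dn1"))
        dn2 = ToyDataNode(os.path.join(tmp, "dn2"))
        nn.files["/src"] = []
        for chunk in (b"hello ", b"world"):        # two blocks, two replicas each
            block = nn.allocate()
            for dn in (dn1, dn2):
                dn.write(block, chunk)
                nn.block_received(block, dn)
            nn.files["/src"].append(block)
        ok = fast_copy(nn, "/src", "/dst")
        data = b"".join(dn1.read(b) for b in nn.files["/dst"])
        return ok, data


if __name__ == "__main__":
    print(demo())
```

In this sketch no block bytes ever move between the two directories, which is the point of the proposal: each replica is duplicated where it already lives, and only metadata goes through the namenode. A hardlinked block shares storage with its source, so the sketch implicitly assumes block files are not modified in place afterwards.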
--
This message was sent by Atlassian Jira
(v8.20.10#820010)