[ https://issues.apache.org/jira/browse/HDFS-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586263#comment-13586263 ]
Aaron T. Myers commented on HDFS-4523:
--------------------------------------
bq. That's not correct. The final concat'ed file won't be in the previously
created snapshots. Only the transient files are removed.
Gotcha. Thanks for the clarification.
Still, I think the point remains: the supposedly read-only snapshot that was
created while those source files were in the FS will be modified to remove
the source files that were concat'ed. That doesn't seem correct to me. Why
shouldn't the snapshot continue to have those source files?
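
To make the question concrete, here is a toy in-memory sketch of the two
possible behaviours. It is not HDFS code: the class ToyNamespace, its methods,
the paths, and the block names are all hypothetical, and the purgeFromSnapshots
flag exists only to contrast "the snapshot keeps the source files" with "the
source files are removed from the snapshot as well".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, not HDFS code; every name here is hypothetical.
class ToyNamespace {
    // Current view of the namespace: path -> ordered block IDs.
    private final Map<String, List<String>> current = new LinkedHashMap<>();
    // Snapshot name -> view of the namespace captured at snapshot time.
    private final Map<String, Map<String, List<String>>> snapshots = new LinkedHashMap<>();

    void create(String path, String... blocks) {
        current.put(path, new ArrayList<>(Arrays.asList(blocks)));
    }

    // Capture a copy of the current view; block lists are copied so that
    // later changes to the current view do not leak into the snapshot.
    void snapshot(String name) {
        Map<String, List<String>> frozen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : current.entrySet()) {
            frozen.put(e.getKey(),
                    Collections.unmodifiableList(new ArrayList<>(e.getValue())));
        }
        snapshots.put(name, frozen);
    }

    // Append the blocks of each source to the target and drop the sources
    // from the current view. With purgeFromSnapshots set, the sources are
    // also dropped from every snapshot, which is the behaviour questioned above.
    void concat(String target, List<String> sources, boolean purgeFromSnapshots) {
        if (!current.containsKey(target)
                || sources.contains(target)
                || !current.keySet().containsAll(sources)) {
            throw new IllegalArgumentException("target and distinct sources must exist");
        }
        for (String src : sources) {
            current.get(target).addAll(current.remove(src));
            if (purgeFromSnapshots) {
                for (Map<String, List<String>> snap : snapshots.values()) {
                    snap.remove(src);
                }
            }
        }
    }

    // Both lookups return null when the path is absent from that view.
    List<String> blocks(String path) {
        return current.get(path);
    }

    List<String> snapshotBlocks(String snapshot, String path) {
        return snapshots.get(snapshot).get(path);
    }

    public static void main(String[] args) {
        ToyNamespace ns = new ToyNamespace();
        ns.create("/dst/part-0", "blk_0");
        ns.create("/dst/part-1", "blk_1");
        ns.snapshot("s1"); // snapshot taken while the source files exist
        ns.concat("/dst/part-0", Arrays.asList("/dst/part-1"), false);
        System.out.println(ns.blocks("/dst/part-0"));               // [blk_0, blk_1]
        System.out.println(ns.snapshotBlocks("s1", "/dst/part-1")); // [blk_1]
    }
}
```

With purgeFromSnapshots set to false, snapshot s1 still lists /dst/part-1 after
the concat; with it set to true, s1 no longer matches the namespace it was
taken from.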
> Fix concat for snapshots
> ------------------------
>
> Key: HDFS-4523
> URL: https://issues.apache.org/jira/browse/HDFS-4523
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h4523_20130222.patch, h4523_20130223.patch,
> h4523_20130225.patch
>
>
> The use case of concat is for copying large files across clusters using the
> following steps.
> - Step 1: The blocks of a file in the source cluster are copied in parallel
> to transient files in the destination cluster.
> - Step 2: Then the transient files in the destination cluster are
> concatenated in order to obtain the original file.
> If a snapshot is taken in the destination cluster before Step 2, some
> transient files may be captured in the snapshot. These transient files
> should be removed in Step 2.
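
The two steps in the description can be sketched with local files standing in
for the destination cluster. This is only an illustration under stated
assumptions: the class ParallelCopySketch, its method names, the part-N.tmp
naming, and the thread-pool size are hypothetical, and the concatenation here
copies bytes, whereas concat in HDFS is a namenode operation on file metadata.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Local-filesystem stand-in for the workflow; all names are hypothetical.
class ParallelCopySketch {

    // Step 1: write each block-sized slice of the source to its own
    // transient file, in parallel. The returned list is in block order.
    static List<Path> copyBlocks(byte[] source, int blockSize, Path dstDir) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (int off = 0, i = 0; off < source.length; off += blockSize, i++) {
                final byte[] block =
                        Arrays.copyOfRange(source, off, Math.min(off + blockSize, source.length));
                final Path part = dstDir.resolve("part-" + i + ".tmp");
                Callable<Path> task = () -> Files.write(part, block);
                pending.add(pool.submit(task));
            }
            List<Path> parts = new ArrayList<>();
            for (Future<Path> f : pending) {
                parts.add(f.get()); // waits for the copy and rethrows failures
            }
            return parts;
        } finally {
            pool.shutdown();
        }
    }

    // Step 2: concatenate the transient files in order into the target,
    // then remove the transient files.
    static void concat(Path target, List<Path> parts) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
        for (Path part : parts) {
            Files.delete(part);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dstDir = Files.createTempDirectory("dst-cluster");
        byte[] source = "0123456789abcdefghij".getBytes(StandardCharsets.UTF_8);
        List<Path> parts = copyBlocks(source, 8, dstDir); // three transient files
        Path target = dstDir.resolve("file");
        concat(target, parts);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

A snapshot taken between copyBlocks and concat would see the part-N.tmp files,
which is the situation this issue is about.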