[ https://issues.apache.org/jira/browse/HDFS-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632101#comment-13632101 ]
Aaron T. Myers commented on HDFS-4529:
--------------------------------------
That sounds good to me as well. Thanks, Nicholas.
> Decide the semantic of concat with snapshots
> --------------------------------------------
>
> Key: HDFS-4529
> URL: https://issues.apache.org/jira/browse/HDFS-4529
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
>
> The use case of concat is for copying large files across clusters using the
> following steps.
> - Step 1: The blocks of a file in the source cluster are copied in parallel
> to transient files in the destination cluster.
> - Step 2: Then the transient files in the destination cluster are
> concatenated in order to obtain the original file.
> If a snapshot is taken in the destination cluster before Step 2, some
> transient files may be captured in the snapshot. Then what should happen?
> The following are some alternatives:
> * (1) fail concat and keep the transient files in the snapshots;
> * (2) allow concat and keep the transient files in the snapshots;
> * (3) allow concat but remove the transient files from all snapshots.
> None of the solutions above is perfect. Here are their drawbacks:
> For (1) and (2), the transient files will remain in the system until the
> snapshots are deleted. This is inefficient, since the files are known to be
> transient. (1) may force users to create files under some non-snapshottable
> tmp directory in the first place. However, that complicates user
> applications, and existing applications may need to be updated for the new
> policy. Also, a non-snapshottable directory may not exist, since an admin
> may set the system root directory to be snapshottable.
> For (3), the problem is that it seems to break the read-only snapshot
> contract: some files that appear in a snapshot may disappear later on.
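
For illustration only, the three alternatives in the description can be
contrasted with a small in-memory model. This is a sketch, not HDFS code:
ToyNamespace, ConcatPolicy, stage_and_snapshot and the part-N / bN names are
all hypothetical, and a "snapshot" here is simply a copy of the file-to-blocks
map taken before Step 2.

```python
from enum import Enum


class ConcatPolicy(Enum):
    FAIL = 1                   # alternative (1)
    KEEP_IN_SNAPSHOTS = 2      # alternative (2)
    REMOVE_FROM_SNAPSHOTS = 3  # alternative (3)


class ToyNamespace:
    """Hypothetical stand-in for one directory; not the NameNode's logic."""

    def __init__(self):
        self.files = {}      # file name -> list of block ids
        self.snapshots = {}  # snapshot name -> {file name -> list of block ids}

    def create(self, name, blocks):
        self.files[name] = list(blocks)

    def snapshot(self, snap_name):
        # Copy the current state so later changes do not leak into it.
        self.snapshots[snap_name] = {n: list(b) for n, b in self.files.items()}

    def concat(self, target, sources, policy):
        captured = [s for s in sources
                    if any(s in snap for snap in self.snapshots.values())]
        if captured and policy is ConcatPolicy.FAIL:
            # (1): refuse, leaving both the live files and snapshots untouched.
            raise RuntimeError(f"sources captured in a snapshot: {captured}")
        # Step 2: append the source blocks to the target, in order.
        for s in sources:
            self.files[target].extend(self.files.pop(s))
        if policy is ConcatPolicy.REMOVE_FROM_SNAPSHOTS:
            # (3): the snapshot changes after it was taken.
            for snap in self.snapshots.values():
                for s in sources:
                    snap.pop(s, None)


def stage_and_snapshot():
    """Step 1 followed by a snapshot taken before Step 2."""
    ns = ToyNamespace()
    for i in range(3):
        ns.create(f"part-{i}", [f"b{i}"])
    ns.snapshot("s1")
    return ns
```

Under KEEP_IN_SNAPSHOTS the live directory ends up with a single file while
snapshot s1 still lists part-1 and part-2, which is the space cost described
for (1) and (2). Under REMOVE_FROM_SNAPSHOTS those entries vanish from s1,
which is the read-only contract concern.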