[ https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215926#comment-16215926 ]
Yongjun Zhang commented on HDFS-12544:
--------------------------------------
Hi [~manojg],
Thanks for the updated patch. While I'm taking a look, I have some questions:
1.
{quote}
these renamed files whose target is not under the scoped directory should be
shown as "D" deleted entries in the report
{quote}
Did you also check this from the target dir's point of view? That is, if we
scope the snapshotDiff to the target dir of a rename operation, is the file
reported as a "+" entry in the snapshotDiff report?
2. What happens if the scoped dir was newly created between the two snapshots
being diffed?
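To make both questions concrete, here is a minimal toy sketch in Python. It is
not HDFS code and not the patch's logic. It assumes a snapshot can be modeled
as a dict of inode id -> absolute path, and the names under() and
scoped_diff() are hypothetical. The "-" label stands for the deleted ("D")
entries in the quote above. Parent-directory "M" entries are ignored.

```python
def under(scope, path):
    """True if path is the scoped directory itself or a descendant of it."""
    if path is None:
        return False
    return path == scope or path.startswith(scope.rstrip("/") + "/")


def scoped_diff(scope, from_snap, to_snap):
    """Toy diff of two snapshots (dict: inode id -> path), limited to scope."""
    entries = []
    for inode in sorted(set(from_snap) | set(to_snap)):
        src, dst = from_snap.get(inode), to_snap.get(inode)
        src_in, dst_in = under(scope, src), under(scope, dst)
        if src_in and dst_in:
            if src != dst:
                # Rename with both ends inside the scope.
                entries.append(("R", src, dst))
        elif src_in:
            # Deleted, or renamed to a target outside the scope.
            entries.append(("-", src, None))
        elif dst_in:
            # Created, or renamed in from a source outside the scope.
            entries.append(("+", dst, None))
    return entries


# /root/a/f is renamed to /root/b/f; /root/c is created after the first snapshot.
from_snap = {1: "/root", 2: "/root/a", 3: "/root/b", 4: "/root/a/f"}
to_snap = {1: "/root", 2: "/root/a", 3: "/root/b", 4: "/root/b/f",
           5: "/root/c", 6: "/root/c/g"}

print(scoped_diff("/root/a", from_snap, to_snap))  # [('-', '/root/a/f', None)]
print(scoped_diff("/root/b", from_snap, to_snap))  # [('+', '/root/b/f', None)]
print(scoped_diff("/root/c", from_snap, to_snap))
```

In this model the rename shows up as "-" when scoped to the source dir and as
"+" when scoped to the target dir (question 1). For a scoped dir that exists
only in the later snapshot, the model happens to report everything under it as
"+"; whether the patch behaves the same way is exactly question 2.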
Thanks.
> SnapshotDiff - support diff generation on any snapshot root descendant
> directory
> --------------------------------------------------------------------------------
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.0.0-beta1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch,
> HDFS-12544.03.patch, HDFS-12544.04.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshot_root_path> <from_snapshot_name> <to_snapshot_name>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any
> two given snapshots under a snapshot root directory. Today the command only
> accepts a path that is a snapshot root. There are many deployments where the
> snapshot root is configured at a higher-level directory but the diff report
> is needed only for a specific directory under the snapshot root. In these
> cases, the diff report can be filtered for changes pertaining to the
> directory we are interested in. But when the snapshot root directory is very
> large, generating the snapshot diff report can take minutes even if we are
> interested only in the changes in a small directory. So, it would be much
> more efficient if the diff report calculation could be limited to the
> sub-directory of interest instead of the whole snapshot root.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)