[
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192441#comment-16192441
]
Yongjun Zhang commented on HDFS-12544:
--------------------------------------
Hi [~manojg],
Thanks for working on this issue. I did a review of rev2, here are my comments:
1. It seems to make sense to include a new field snapshotDiffScopeDir in the
SnapshotDiffInfo class, and initialize it as the constructor. So the info in
this class is self-contained.
2. suggest to move the checking
{code}
if (!this.snapshotDiffAllowSnapRootDescendant) {
{code}
from SnapshotManager%getSnapshottableAncestorDir to its caller, and call
{{SnapshotManager%getSnapshottableAncestorDir}} or
{{SnapshotManager%getSnapshottableRoot}} based on the value of this config.
This way {{SnapshotManager%getSnapshottableAncestorDir}} can still return the
snapshottable ancestor even if the config is set to false. We might need this
behavior else where.
3. suggest to remove the method
{{SnapshotManager%setSnapshotDiffAllowSnapRootDescendant}}, and use the config
property to pass on the value to the cluster. This means, we need to split the
tests in TestSnapshotDiffReport into two parts, one with the config with true
value, the other with false.
Coding wise, we can put common code in a abstract base class, and introduce two
new test classes, one for testing allowing descendant, and the other for
testing not allowing descendant.
4. Nit. In SnapshotManager.java, change "directories" to "directory" in the
following text.
* operation can be run for any descendant directories under a snapshot root
> SnapshotDiff - support diff generation on any snapshot root descendant
> directory
> --------------------------------------------------------------------------------
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.0.0-beta1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshot_root_path> <from_snapshot_name>
> <to_snapshot_name>
> {noformat}
> Using snapshot diff command, we can generate a diff report between any two
> given snapshots under a snapshot root directory. The command today only
> accepts the path that is a snapshot root. There are many deployments where
> the snapshot root is configured at the higher level directory but the diff
> report needed is only for a specific directory under the snapshot root. In
> these cases, the diff report can be filtered for changes pertaining to the
> directory we are interested in. But when the snapshot root directory is very
> huge, the snapshot diff report generation can take minutes even if we are
> interested to know the changes only in a small directory. So, it would be
> highly performant if the diff report calculation can be limited to only the
> interesting sub-directory of the snapshot root instead of the whole snapshot
> root.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]