[
https://issues.apache.org/jira/browse/HDFS-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227375#comment-15227375
]
Yongjun Zhang commented on HDFS-10263:
--------------------------------------
The following code
{code}
/**
 * Recursively compute the difference between snapshots under a given
 * directory/file.
 * @param snapshotRoot The directory where snapshots were taken.
 * @param node The directory/file under which the diff is computed.
 * @param parentPath Relative path (corresponding to the snapshot root) of
 *                   the node's parent.
 * @param diffReport data structure used to store the diff.
 */
private void computeDiffRecursively(final INodeDirectory snapshotRoot,
    INode node, List<byte[]> parentPath, SnapshotDiffInfo diffReport) {
  final Snapshot earlierSnapshot = diffReport.isFromEarlier() ?
      diffReport.getFrom() : diffReport.getTo();
  final Snapshot laterSnapshot = diffReport.isFromEarlier() ?
      diffReport.getTo() : diffReport.getFrom();
  byte[][] relativePath = parentPath.toArray(new byte[parentPath.size()][]);
  if (node.isDirectory()) {
    final ChildrenDiff diff = new ChildrenDiff();
    INodeDirectory dir = node.asDirectory();
    DirectoryWithSnapshotFeature sf = dir.getDirectoryWithSnapshotFeature();
    if (sf != null) {
      boolean change = sf.computeDiffBetweenSnapshots(earlierSnapshot,
          laterSnapshot, diff, dir);
      if (change) {
        diffReport.addDirDiff(dir, relativePath, diff);
      }
    }
    ReadOnlyList<INode> children = dir.getChildrenList(earlierSnapshot
        .getId());
    for (INode child : children) {
      final byte[] name = child.getLocalNameBytes();
      boolean toProcess = diff.searchIndex(ListType.DELETED, name) < 0;
      if (!toProcess && child instanceof INodeReference.WithName) {
        byte[][] renameTargetPath = findRenameTargetPath(
            snapshotRoot, (WithName) child,
            laterSnapshot == null ? Snapshot.CURRENT_STATE_ID :
              laterSnapshot.getId());
        if (renameTargetPath != null) {
          toProcess = true;
          diffReport.setRenameTarget(child.getId(), renameTargetPath);
        }
      }
      if (toProcess) {
        parentPath.add(name);
        computeDiffRecursively(snapshotRoot, child, parentPath, diffReport);
        parentPath.remove(parentPath.size() - 1);
      }
    }
  } else if (node.isFile() && node.asFile().isWithSnapshot()) {
    INodeFile file = node.asFile();
    boolean change = file.getFileWithSnapshotFeature()
        .changedBetweenSnapshots(file, earlierSnapshot, laterSnapshot);
    if (change) {
      diffReport.addFileDiff(file, relativePath);
    }
  }
}
{code}
computes earlierSnapshot and laterSnapshot, and then calls
{code}
boolean change = sf.computeDiffBetweenSnapshots(earlierSnapshot,
laterSnapshot, diff, dir);
{code}
for both the forward and the backward diff calculation. The bug may be in this
or related code.
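
To make the point above concrete, here is a minimal, self-contained sketch of the normalization, not the actual HDFS code. The class and method names (SnapshotOrderSketch, DiffRequest, reportedType) are hypothetical, snapshots are reduced to integer IDs on the assumption that a smaller ID means an earlier snapshot, and the rule for flipping "+" and "-" in a backward report is only inferred from the s1-s2 and s2-s1 reports quoted below.

```java
public class SnapshotOrderSketch {
    /** Hypothetical stand-in for the from/to pair carried by SnapshotDiffInfo. */
    static final class DiffRequest {
        final int fromId;
        final int toId;

        DiffRequest(int fromId, int toId) {
            this.fromId = fromId;
            this.toId = toId;
        }

        // Assumption: a smaller ID means an earlier snapshot.
        boolean isFromEarlier() {
            return fromId <= toId;
        }

        // The diff itself is always computed from the earlier snapshot
        // to the later one, whichever direction was requested.
        int earlierId() {
            return isFromEarlier() ? fromId : toId;
        }

        int laterId() {
            return isFromEarlier() ? toId : fromId;
        }
    }

    /**
     * Maps an entry type computed in earlier-to-later order to the type that
     * a report in the requested direction would show (assumed rule).
     */
    static char reportedType(char computedType, boolean fromEarlier) {
        if (fromEarlier) {
            return computedType;
        }
        switch (computedType) {
            case '+': return '-';  // created going forward = deleted going backward
            case '-': return '+';  // deleted going forward = created going backward
            default:  return computedType;  // 'M' (and others) left unchanged here
        }
    }

    public static void main(String[] args) {
        DiffRequest backward = new DiffRequest(2, 1);  // e.g. an "s2-s1" request
        System.out.println(backward.earlierId() + " -> " + backward.laterId());
        System.out.println(reportedType('+', backward.isFromEarlier()));
    }
}
```

Under these assumptions, a forward and a backward request share the same underlying computation, so any path that is derived only from the earlier snapshot would appear unchanged in the backward report.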
> Reversed snapshot diff report contains incorrect entries
> --------------------------------------------------------
>
> Key: HDFS-10263
> URL: https://issues.apache.org/jira/browse/HDFS-10263
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Yongjun Zhang
>
> Steps to reproduce:
> 1. Take a snapshot s1 at:
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/foo/f1
> {code}
> 2. Make the following change:
> {code}
> private int changeData7(Path dir) throws Exception {
>   final Path foo = new Path(dir, "foo");
>   final Path foo2 = new Path(dir, "foo2");
>   final Path foo_f1 = new Path(foo, "f1");
>   final Path foo2_f2 = new Path(foo2, "f2");
>   final Path foo2_f1 = new Path(foo2, "f1");
>   final Path foo_d1 = new Path(foo, "d1");
>   final Path foo_d1_f3 = new Path(foo_d1, "f3");
>   int numDeletedAndModified = 0;
>   dfs.rename(foo, foo2);
>   dfs.delete(foo2_f1, true);
>
>   DFSTestUtil.createFile(dfs, foo_f1, BLOCK_SIZE, DATA_NUM, 0L);
>   DFSTestUtil.appendFile(dfs, foo_f1, (int) BLOCK_SIZE);
>   dfs.rename(foo_f1, foo2_f2);
>   numDeletedAndModified += 1; // "M ./foo"
>   DFSTestUtil.createFile(dfs, foo_d1_f3, BLOCK_SIZE, DATA_NUM, 0L);
>   return numDeletedAndModified;
> }
> {code}
> that results in
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo/d1
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/foo/d1/f3
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo2
> -rw-r--r-- 1 yzhang supergroup 2048 2016-04-05 14:48 /target/foo2/f2
> {code}
> 3. Take snapshot s2 here.
> 4. Do the following to revert the change done in step 2
> {code}
> private int revertChangeData7(Path dir) throws Exception {
>   final Path foo = new Path(dir, "foo");
>   final Path foo2 = new Path(dir, "foo2");
>   final Path foo_f1 = new Path(foo, "f1");
>   final Path foo2_f2 = new Path(foo2, "f2");
>   final Path foo2_f1 = new Path(foo2, "f1");
>   final Path foo_d1 = new Path(foo, "d1");
>   final Path foo_d1_f3 = new Path(foo_d1, "f3");
>   int numDeletedAndModified = 0;
>
>   dfs.delete(foo_d1, true);
>   dfs.rename(foo2_f2, foo_f1);
>
>   dfs.delete(foo, true);
>
>   DFSTestUtil.createFile(dfs, foo2_f1, BLOCK_SIZE, DATA_NUM, 0L);
>   DFSTestUtil.appendFile(dfs, foo2_f1, (int) BLOCK_SIZE);
>   dfs.rename(foo2, foo);
>
>   return numDeletedAndModified;
> }
> {code}
> which gives the following result:
> {code}
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/bar
> -rw-r--r-- 1 yzhang supergroup 1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x - yzhang supergroup 0 2016-04-05 14:48 /target/foo
> -rw-r--r-- 1 yzhang supergroup 2048 2016-04-05 14:48 /target/foo/f1
> {code}
> 5. Take snapshot s3 here.
> Below are the snapshot diff reports:
> {code}
> s1-s2: Difference between snapshot s1 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo
> + ./foo/f2
> - ./foo/f1
> s2-s1: Difference between snapshot s2 and snapshot s1 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo
> - ./foo/f2
> + ./foo/f1
> s2-s3: Difference between snapshot s2 and snapshot s3 under directory /target:
> M .
> - ./foo
> R ./foo2 -> ./foo
> M ./foo2
> + ./foo2/f1
> - ./foo2/f2
> s3-s2: Difference between snapshot s3 and snapshot s2 under directory /target:
> M .
> + ./foo
> R ./foo -> ./foo2
> M ./foo2
> - ./foo2/f1
> + ./foo2/f2
> {code}
> The s2-s1 diff is supposed to be the same as the s2-s3 diff, because the
> change from s2 to s3 is an exact reversion of the change from s1 to s2. We can
> see that s1 and s3 have the same file structure.
> However, the result shown above does not match. I expect that the following part
> {code}
> M ./foo
> - ./foo/f2
> + ./foo/f1
> {code}
> in the s2-s1 diff should be
> {code}
> M ./foo2
> + ./foo2/f1
> - ./foo2/f2
> {code}
> (same as in s2-s3)
> instead.
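
The expectation in the description can be checked mechanically by comparing the two reports as unordered sets of entries. The sketch below is only an illustration: the class DiffReportCompare and the helper onlyIn are hypothetical names, and the report entries are copied from the s2-s1 and s2-s3 output quoted above with whitespace normalized to single spaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DiffReportCompare {
    // Entries of the s2-s1 report as quoted in the issue description.
    static final List<String> S2_S1 = Arrays.asList(
        "M .",
        "- ./foo",
        "R ./foo2 -> ./foo",
        "M ./foo",
        "- ./foo/f2",
        "+ ./foo/f1");

    // Entries of the s2-s3 report as quoted in the issue description.
    static final List<String> S2_S3 = Arrays.asList(
        "M .",
        "- ./foo",
        "R ./foo2 -> ./foo",
        "M ./foo2",
        "+ ./foo2/f1",
        "- ./foo2/f2");

    /** Returns the entries present in report a but missing from report b, ignoring order. */
    static Set<String> onlyIn(List<String> a, List<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // If the two reports were equivalent, both sets would be empty.
        System.out.println("only in s2-s1: " + onlyIn(S2_S1, S2_S3));
        System.out.println("only in s2-s3: " + onlyIn(S2_S3, S2_S1));
    }
}
```

With the reports above, the entries that differ are exactly the three under the renamed directory: the s2-s1 report names them under ./foo, while the s2-s3 report names them under ./foo2.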