sodonnel commented on a change in pull request #3234:
URL: https://github.com/apache/hadoop/pull/3234#discussion_r676680004
##########
File path:
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
##########
@@ -515,6 +521,26 @@ private DiffInfo getRenameItem(DiffInfo diff, DiffInfo[] renameDiffArray) {
return null;
}
+ /**
+ * checks if a parent dir is marked deleted as a part of dir rename happening
+ * to a path which is excluded by the filter.
+ * @return true if it's marked deleted
+ */
+ private boolean isParentOrSelfMarkedDeleted(DiffInfo diff,
+ DiffInfo[] deletedDirDiffArray) {
+ for (DiffInfo item : deletedDirDiffArray) {
+ if (item.getSource().equals(diff.getSource())) {
+ if (diff.getType() == SnapshotDiffReport.DiffType.MODIFY) {
Review comment:
Could you copy the comment from getRenameItem here, as it helps explain
why we are only filtering modified entries:
> // The same path string may appear in:
> // 1. both renamed and modified snapshot diff entries.
> // 2. both renamed and created snapshot diff entries.
> // Case 1 is about the same file/directory, whereas case 2
> // is about two different files/directories.
> // We are finding case 1 here, thus we check against DiffType.MODIFY.
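
To illustrate the distinction, here is a minimal standalone sketch, not the DistCpSync code itself. `Entry`, `DiffType`, and `isSelfMarkedDeleted` are hypothetical stand-ins for `DiffInfo`, `SnapshotDiffReport.DiffType`, and the method under review. It covers only the same-path branch visible in the hunk above, and it assumes the method returns true at that point:

```java
import java.util.Arrays;
import java.util.List;

class DeletedPathCheck {
  /** Hypothetical stand-in for SnapshotDiffReport.DiffType. */
  enum DiffType { CREATE, MODIFY, DELETE, RENAME }

  /** Hypothetical, simplified stand-in for DiffInfo: a source path and a diff type. */
  static final class Entry {
    final String source;
    final DiffType type;

    Entry(String source, DiffType type) {
      this.source = source;
      this.type = type;
    }
  }

  /**
   * Returns true only when diff has the same source path as a deleted-dir
   * entry AND diff is a MODIFY entry (assumed behaviour, see lead-in).
   */
  static boolean isSelfMarkedDeleted(Entry diff, List<Entry> deletedDirs) {
    for (Entry item : deletedDirs) {
      if (item.source.equals(diff.source)) {
        // The same path string may belong to:
        // 1. a deleted entry and a modified entry (the same directory), or
        // 2. a deleted entry and a created entry (a different, new object).
        // Only case 1 should be filtered, hence the MODIFY check.
        if (diff.type == DiffType.MODIFY) {
          return true;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Entry> deleted = Arrays.asList(new Entry("/dir1", DiffType.DELETE));
    // Same path, MODIFY: treated as the deleted directory itself.
    System.out.println(isSelfMarkedDeleted(new Entry("/dir1", DiffType.MODIFY), deleted));
    // Same path, CREATE: a new object that reuses the path, so not filtered.
    System.out.println(isSelfMarkedDeleted(new Entry("/dir1", DiffType.CREATE), deleted));
  }
}
```

Without the MODIFY check, a directory created at the same path after the original was moved away would be dropped as well, which is why the explanatory comment is worth having next to the condition.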
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]