[
https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mukul Kumar Singh resolved HDFS-16145.
--------------------------------------
Fix Version/s: 3.3.2
Resolution: Fixed
> CopyListing fails with FNF exception with snapshot diff
> -------------------------------------------------------
>
> Key: HDFS-16145
> URL: https://issues.apache.org/jira/browse/HDFS-16145
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.2
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> DistCp with snapshot diff and filters marks a rename as a delete
> operation on the target if the rename destination is a directory that is
> excluded by the filter. However, files/subdirs that were created/modified
> after the old snapshot but before the rename will still be present as
> modified/created entries in the final copy list. Since the parent directory
> is marked for deletion, these create/modify entries should be
> ignored while building the final copy list.
> In such cases, when the final copy list is built, distcp tries to look up
> each created/modified file in the newer snapshot, which fails because
> the parent dir has already been moved to a new location in the later snapshot.
>
> {code:java}
> sudo -u kms hadoop key create testkey
> hadoop fs -mkdir -p /data/gcgdlknnasg/
> hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
> hadoop fs -mkdir -p /dest/gcgdlknnasg
> hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1
> hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/
> hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
> drwxrwxrwt - hdfs supergroup 0 2021-07-16 14:05
> /data/gcgdlknnasg/.Trash
> drwxr-xr-x - hdfs supergroup 0 2021-07-16 13:07
> /data/gcgdlknnasg/dir1
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]#
> hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
> hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
> ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the
> replication schedule. You get into below error and failure of the BDR job.
> 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff -
> java.io.FileNotFoundException: File does not exist:
> /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
> ……..
> {code}
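
The filtering idea described above (skip create/modify entries whose parent directory is already marked for deletion) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual DistCp change: the class name CopyListFilterSketch, both method names, and the use of plain relative path strings are hypothetical and do not come from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the filtering idea; not the real DistCp code. */
final class CopyListFilterSketch {

  /** Returns true if path equals dir or lies somewhere underneath it. */
  static boolean isUnder(String path, String dir) {
    // Append a separator so that "dir1" does not match "dir10/file".
    String prefix = dir.endsWith("/") ? dir : dir + "/";
    return path.equals(dir) || path.startsWith(prefix);
  }

  /**
   * Keeps only the created/modified entries that are NOT located under a
   * directory already marked for deletion on the target.
   */
  static List<String> dropEntriesUnderDeletedDirs(
      List<String> createdOrModified, List<String> deletedDirs) {
    List<String> kept = new ArrayList<>();
    for (String path : createdOrModified) {
      boolean covered = false;
      for (String dir : deletedDirs) {
        if (isUnder(path, dir)) {
          covered = true;
          break;
        }
      }
      if (!covered) {
        kept.add(path);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Assumed example data: dir1 was renamed into a filtered-out location,
    // so it is treated as deleted and dir1/hosts must not be looked up.
    List<String> diff = List.of("dir1/hosts", "dir2/file", "dir10/file");
    List<String> deleted = List.of("dir1");
    System.out.println(dropEntriesUnderDeletedDirs(diff, deleted));
  }
}
```

With the assumed sample data, dir1/hosts is dropped while dir2/file and dir10/file are kept, so no lookup is attempted for a path that no longer exists in the newer snapshot.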
--
This message was sent by Atlassian Jira
(v8.3.4#803005)