[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
     [ https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell updated HDFS-16145:
-------------------------------------
    Fix Version/s: 3.4.0

> CopyListing fails with FNF exception with snapshot diff
> -------------------------------------------------------
>
>                 Key: HDFS-16145
>                 URL: https://issues.apache.org/jira/browse/HDFS-16145
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.2
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When DistCp runs with snapshot diff and with filters, it marks a rename as a
> delete operation on the target if the rename destination is a directory that
> is excluded by the filter. However, files and subdirectories that were created
> or modified after the old snapshot but before the rename are still present as
> created/modified entries in the final copy list. Since the parent directory
> is marked for deletion, these create/modify entries should be ignored while
> building the final copy list.
> In such cases, when the final copy list is built, DistCp looks up each
> created/modified file in the newer snapshot, and the lookup fails because the
> parent directory has already been moved to a new location in the later snapshot.
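As a rough illustration of the check the description asks for, here is a minimal Java sketch. It is not the DistCp code and not the patch for this issue. It assumes the create/modify entries and the directories marked for deletion are available as plain paths relative to the snapshot root; the class name CopyListSketch and the method dropEntriesUnderDeletedDirs are hypothetical. It also matches purely on path prefix, so it does not handle a directory that is re-created at the same path after the rename, which the reproduction steps in this issue include.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative sketch only. Hypothetical names; not part of Hadoop DistCp.
 */
public class CopyListSketch {

    /** Returns true if path is dir itself or lies beneath dir ('/'-separated relative paths). */
    public static boolean isUnder(String path, String dir) {
        return path.equals(dir) || path.startsWith(dir + "/");
    }

    /**
     * Keeps only those create/modify entries that are not located under a
     * directory that has been marked for deletion on the target.
     */
    public static List<String> dropEntriesUnderDeletedDirs(
            List<String> createdOrModified, List<String> deletedDirs) {
        List<String> kept = new ArrayList<>();
        for (String entry : createdOrModified) {
            boolean underDeleted = false;
            for (String dir : deletedDirs) {
                if (isUnder(entry, dir)) {
                    underDeleted = true;
                    break;
                }
            }
            if (!underDeleted) {
                kept.add(entry);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // dir1 was renamed into a filtered location, so it is treated as deleted.
        List<String> deletedDirs = Arrays.asList("dir1");
        // dir1/hosts was created before the rename and would fail the snapshot lookup.
        List<String> entries = Arrays.asList("dir1/hosts", "dir10/file", "dir2/data");
        System.out.println(dropEntriesUnderDeletedDirs(entries, deletedDirs));
        // prints [dir10/file, dir2/data]
    }
}
```

The prefix test appends "/" before comparing, so a sibling such as dir10 is not mistaken for a child of dir1.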
>
> {code:java}
> sudo -u kms hadoop key create testkey
> hadoop fs -mkdir -p /data/gcgdlknnasg/
> hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
> hadoop fs -mkdir -p /dest/gcgdlknnasg
> hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1
> hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/
> hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
> drwxrwxrwt   - hdfs supergroup          0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
> drwxr-xr-x   - hdfs supergroup          0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]#
> hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
> hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
> ===> Now run BDR with “Abort on Snapshot Diff Failures” CHECKED in the
> replication schedule. The BDR job fails with the error below.
> 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff -
> java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
>         at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
> ……..
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
     [ https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee updated HDFS-16145:
---------------------------------------
    Description:
When DistCp runs with snapshot diff and with filters, it marks a rename as a delete operation on the target if the rename destination is a directory that is excluded by the filter. However, files and subdirectories that were created or modified after the old snapshot but before the rename are still present as created/modified entries in the final copy list. Since the parent directory is marked for deletion, these create/modify entries should be ignored while building the final copy list.

In such cases, when the final copy list is built, DistCp looks up each created/modified file in the newer snapshot, and the lookup fails because the parent directory has already been moved to a new location in the later snapshot.

{code:java}
sudo -u kms hadoop key create testkey
hadoop fs -mkdir -p /data/gcgdlknnasg/
hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
hadoop fs -mkdir -p /dest/gcgdlknnasg
hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
hdfs dfs -mkdir /data/gcgdlknnasg/dir1
hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/
hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/
[root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
drwxrwxrwt   - hdfs supergroup          0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
drwxr-xr-x   - hdfs supergroup          0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
[root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
[root@nightly62x-1 logs]#
hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
===> Now run BDR with “Abort on Snapshot Diff Failures” CHECKED in the replication schedule. The BDR job fails with the error below.
21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff -
java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
……..
{code}

  was:
When DistCp runs with snapshot diff and with filters, it marks a rename as a delete operation on the target if the rename destination is a directory that is excluded by the filter. However, files and subdirectories that were created or modified after the old snapshot but before the rename are still present as created/modified entries in the final copy list. Since the parent directory is marked for deletion, these create/modify entries should be ignored while building the final copy list.

In such cases, when the final copy list is built, DistCp looks up each created/modified file in the l\newer snapshot, and the lookup fails because the parent directory has already been moved to a new location in the later snapshot.

{code:java}
sudo -u kms hadoop key create testkey
hadoop fs -mkdir -p /data/gcgdlknnasg/
hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
hadoop fs -mkdir -p /dest/gcgdlknnasg
hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
hdfs dfs -mkdir /data/gcgdlknnasg/dir1
hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/
hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/
[root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
drwxrwxrwt   - hdfs supergroup          0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
drwxr-xr-x   - hdfs supergroup          0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
[root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
[root@nightly62x-1 logs]#
hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
===> Now run BDR with “Abort on Snapshot Diff Failures” CHECKED in the replication schedule. The BDR job fails with the error below.
21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff -
java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
        at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
……..
{code}


> CopyListing fails with FNF exception with snapshot diff
> -------------------------------------------------------
>
>                 Key: HDFS-16145
>                 URL: https://issues.apache.org/jira/browse/HDFS-16145
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Distcp with snapshotdiff and

[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
     [ https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-16145:
----------------------------------
    Labels: pull-request-available  (was: )

> CopyListing fails with FNF exception with snapshot diff
> -------------------------------------------------------
>
>                 Key: HDFS-16145
>                 URL: https://issues.apache.org/jira/browse/HDFS-16145
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When DistCp runs with snapshot diff and with filters, it marks a rename as a
> delete operation on the target if the rename destination is a directory that
> is excluded by the filter. However, files and subdirectories that were created
> or modified after the old snapshot but before the rename are still present as
> created/modified entries in the final copy list. Since the parent directory
> is marked for deletion, these create/modify entries should be ignored while
> building the final copy list.
> In such cases, when the final copy list is built, DistCp looks up each
> created/modified file in the newer snapshot, and the lookup fails because the
> parent directory has already been moved to a new location in the later snapshot.