[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff

2021-07-28 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-16145:
-
Fix Version/s: 3.4.0

> CopyListing fails with FNF exception with snapshot diff
> ---
>
> Key: HDFS-16145
> URL: https://issues.apache.org/jira/browse/HDFS-16145
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> DistCp with snapshot diff and filters marks a rename as a delete operation
> on the target if the rename target is a directory that is excluded by the
> filter. However, files and subdirectories created or modified after the old
> snapshot but before the rename are still present as modified/created
> entries in the final copy list. Since the parent directory is marked for
> deletion, these create/modify entries should be ignored while building the
> final copy list.
> In such cases, when the final copy list is built, DistCp tries to look up
> each created/modified file in the newer snapshot, which fails because the
> parent directory has already been moved to a new location in the later
> snapshot.
>  
> {code:java}
> sudo -u kms hadoop key create testkey
> hadoop fs -mkdir -p /data/gcgdlknnasg/
> hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
> hadoop fs -mkdir -p /dest/gcgdlknnasg
> hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1
> hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ 
> hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ 
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
> drwxrwxrwt   - hdfs supergroup  0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
> drwxr-xr-x   - hdfs supergroup  0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]#
> hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
> hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
> ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the
> replication schedule. The BDR job fails with the error below.
> 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
> ……..
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff

2021-07-26 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-16145:
---
Description: 
DistCp with snapshot diff and filters marks a rename as a delete operation on
the target if the rename target is a directory that is excluded by the filter.
However, files and subdirectories created or modified after the old snapshot
but before the rename are still present as modified/created entries in the
final copy list. Since the parent directory is marked for deletion, these
create/modify entries should be ignored while building the final copy list.

In such cases, when the final copy list is built, DistCp tries to look up each
created/modified file in the newer snapshot, which fails because the parent
directory has already been moved to a new location in the later snapshot.
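
The pruning idea described above can be sketched roughly as follows. This is
only an illustration in Python, not the DistCp code (which is Java) and not the
actual patch: `DiffEntry`, `is_below`, and `prune_copy_list` are hypothetical
names, and the operation labels "CREATE", "MODIFY", and "DELETE" are assumed.
It also deliberately ignores subtleties such as a directory being recreated at
the same path after the rename (as in the reproduction steps), which a real
fix has to handle.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DiffEntry:
    """One snapshot-diff entry (hypothetical model, not a Hadoop class)."""
    op: str    # assumed labels: "CREATE", "MODIFY", "DELETE"
    path: str  # path relative to the snapshot root


def is_below(path: str, ancestor: str) -> bool:
    """Return True if `path` lies strictly below the directory `ancestor`."""
    return PurePosixPath(ancestor) in PurePosixPath(path).parents


def prune_copy_list(entries):
    """Drop create/modify entries whose parent directory is marked for deletion.

    Directories renamed to a filtered-out target are assumed to have already
    been converted into "DELETE" entries by an earlier step.
    """
    deleted_dirs = [e.path for e in entries if e.op == "DELETE"]
    kept = []
    for e in entries:
        if e.op in ("CREATE", "MODIFY") and any(
            is_below(e.path, d) for d in deleted_dirs
        ):
            # Looking this path up in the newer snapshot would fail,
            # because its parent no longer exists at this location.
            continue
        kept.append(e)
    return kept
```

With entries for a deleted `dir1`, a created `dir1/hosts`, and a modified
`dir2/a`, the sketch keeps the `dir1` delete and the `dir2/a` modification and
drops `dir1/hosts`, so no lookup is attempted for it in the newer snapshot.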

 
{code:java}
sudo -u kms hadoop key create testkey
hadoop fs -mkdir -p /data/gcgdlknnasg/
hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
hadoop fs -mkdir -p /dest/gcgdlknnasg
hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
hdfs dfs -mkdir /data/gcgdlknnasg/dir1
hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ 
hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ 

[root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
drwxrwxrwt   - hdfs supergroup  0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
drwxr-xr-x   - hdfs supergroup  0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
[root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
[root@nightly62x-1 logs]#

hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
hdfs dfs -mkdir /data/gcgdlknnasg/dir1/

===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the
replication schedule. The BDR job fails with the error below.

21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
……..
{code}

  was:
DistCp with snapshot diff and filters marks a rename as a delete operation on
the target if the rename target is a directory that is excluded by the filter.
However, files and subdirectories created or modified after the old snapshot
but before the rename are still present as modified/created entries in the
final copy list. Since the parent directory is marked for deletion, these
create/modify entries should be ignored while building the final copy list.

In such cases, when the final copy list is built, DistCp tries to look up each
created/modified file in the l\newer snapshot, which fails because the parent
directory has already been moved to a new location in the later snapshot.

 
{code:java}
sudo -u kms hadoop key create testkey
hadoop fs -mkdir -p /data/gcgdlknnasg/
hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
hadoop fs -mkdir -p /dest/gcgdlknnasg
hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
hdfs dfs -mkdir /data/gcgdlknnasg/dir1
hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ 
hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ 

[root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
drwxrwxrwt   - hdfs supergroup  0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
drwxr-xr-x   - hdfs supergroup  0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
[root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
[root@nightly62x-1 logs]#

hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
hdfs dfs -mkdir /data/gcgdlknnasg/dir1/

===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the
replication schedule. The BDR job fails with the error below.

21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
……..
{code}


> CopyListing fails with FNF exception with snapshot diff
> ---
>
> Key: HDFS-16145
> URL: https://issues.apache.org/jira/browse/HDFS-16145
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Distcp with snapshotdiff and 

[jira] [Updated] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff

2021-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16145:
--
Labels: pull-request-available  (was: )

> CopyListing fails with FNF exception with snapshot diff
> ---
>
> Key: HDFS-16145
> URL: https://issues.apache.org/jira/browse/HDFS-16145
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DistCp with snapshot diff and filters marks a rename as a delete operation
> on the target if the rename target is a directory that is excluded by the
> filter. However, files and subdirectories created or modified after the old
> snapshot but before the rename are still present as modified/created
> entries in the final copy list. Since the parent directory is marked for
> deletion, these create/modify entries should be ignored while building the
> final copy list.
> In such cases, when the final copy list is built, DistCp tries to look up
> each created/modified file in the newer snapshot, which fails because the
> parent directory has already been moved to a new location in the later
> snapshot.
>  
> {code:java}
> sudo -u kms hadoop key create testkey
> hadoop fs -mkdir -p /data/gcgdlknnasg/
> hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/
> hadoop fs -mkdir -p /dest/gcgdlknnasg
> hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1
> hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ 
> hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ 
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/
> drwxrwxrwt   - hdfs supergroup  0 2021-07-16 14:05 /data/gcgdlknnasg/.Trash
> drwxr-xr-x   - hdfs supergroup  0 2021-07-16 13:07 /data/gcgdlknnasg/dir1
> [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/
> [root@nightly62x-1 logs]#
> hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/
> hdfs dfs -rm -r /data/gcgdlknnasg/dir1/
> hdfs dfs -mkdir /data/gcgdlknnasg/dir1/
> ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the
> replication schedule. The BDR job fails with the error below.
> 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - java.io.FileNotFoundException: File does not exist: /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487)
> ……..
> {code}


