[ https://issues.apache.org/jira/browse/HADOOP-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703507#comment-17703507 ]

ASF GitHub Bot commented on HADOOP-18662:
-----------------------------------------

ayushtkn commented on code in PR #5477:
URL: https://github.com/apache/hadoop/pull/5477#discussion_r1144296150


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java:
##########
@@ -2413,8 +2413,13 @@ private void handleFileStat(LocatedFileStatus stat) throws IOException {
         if (stat.isFile()) { // file
           curFile = stat;
         } else if (recursive) { // directory
-          itors.push(curItor);
-          curItor = listLocatedStatus(stat.getPath());
+          try {
+            RemoteIterator<LocatedFileStatus> newDirItor = listLocatedStatus(stat.getPath());

Review Comment:
   That wasn't an issue. If you call ``listLocatedStatus(stat.getPath())`` and 
the path is a file, it still returns an iterator with just one entry, which is 
that file. The loop then iterates over that iterator, finds that the first 
entry is a file, and adds it to the list. So that case doesn't break, but I 
have added a mocked test for it as well.
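For illustration, here is a minimal sketch of what such a mocked case could look like: a Mockito spy of RawLocalFileSystem lets the real FileSystem#listFiles run while the two listLocatedStatus calls it makes underneath are stubbed, so the deleted-directory race can be injected deterministically. The class and test names, the paths, and the use of RemoteIterators.remoteIteratorFromSingleton are illustrative assumptions, not the actual test added in PR #5477.

{code:java}
import static org.apache.hadoop.util.functional.RemoteIterators.remoteIteratorFromSingleton;
import static org.junit.Assert.assertFalse;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.doThrow;
import static org.mockito.Mockito.spy;

import java.io.FileNotFoundException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;
import org.apache.hadoop.fs.RemoteIterator;
import org.junit.Test;

// Hypothetical sketch, not the test from PR #5477.
public class TestListFilesRecursiveRace {

  @Test
  public void testDirectoryDeletedDuringRecursiveListing() throws Exception {
    // Spy a concrete FileSystem so the real FileSystem#listFiles runs while
    // the listLocatedStatus calls it makes are stubbed below.
    RawLocalFileSystem raw = new RawLocalFileSystem();
    raw.initialize(URI.create("file:///"), new Configuration());
    FileSystem fs = spy(raw);

    Path parent = new Path("/base");
    Path vanished = new Path("/base/subdir");

    // The parent listing reports a single child that claims to be a directory.
    LocatedFileStatus childDir = new LocatedFileStatus(
        new FileStatus(0, true, 1, 0, 0, vanished), null);
    doReturn(remoteIteratorFromSingleton(childDir))
        .when(fs).listLocatedStatus(parent);

    // By the time the recursive descent reaches it, the child is gone.
    doThrow(new FileNotFoundException(vanished + " does not exist"))
        .when(fs).listLocatedStatus(vanished);

    // With the patch the vanished directory is skipped; without it, this
    // hasNext() call rethrows the injected FileNotFoundException.
    RemoteIterator<LocatedFileStatus> files = fs.listFiles(parent, true);
    assertFalse(files.hasNext());
  }
}
{code}

Without the patch the final hasNext() surfaces the injected FileNotFoundException; with it, the vanished directory is skipped and the listing completes with the remaining entries.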





> ListFiles with recursive fails with FNF
> ---------------------------------------
>
>                 Key: HADOOP-18662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18662
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> The problem triggers in HDFS, but the change is in Hadoop-Common, since 
> listFiles is defined in Hadoop-Common.
> Scenario: listFiles with recursive=true:
>  * Lists a directory, say /dir, which has some children /dir/s1...s10
>  * Because recursive is true, it descends into, say, /dir/s5, but /dir/s5 has 
> been deleted by that time
>  * The entire operation fails with FileNotFoundException (FNF)
> The Hive Cleaner uses listFiles with recursive=true, so this failure impacts 
> it as well (a sketch of the defensive handling follows the stack trace below).
> {noformat}
> 2023-03-06 07:45:48,331 ERROR org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to complete cleaning of id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId: null,initiatorId: null java.io.FileNotFoundException: File hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550 does not exist.
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
>     at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144)
>     at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332)
>     at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309)
>     at org.apache.hadoop.util.functional.RemoteIterators$WrappingRemoteIterator.sourceHasNext(RemoteIterators.java:432)
>     at org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.fetch(RemoteIterators.java:581)
>     at org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.hasNext(RemoteIterators.java:602)
>     at org.apache.hadoop.hive.ql.io.AcidUtils.getHdfsDirSnapshots(AcidUtils.java:1435)
>     at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.removeFiles(Cleaner.java:287)
>     at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(Cleaner.java:214)
>     at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.lambda$run$0(Cleaner.java:114)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil$ThrowingRunnable.lambda$unchecked$0(CompactorUtil.java:54)
>     at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750){noformat}
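The hunk quoted in the review above is truncated in this mirror; the defensive pattern it introduces in the handleFileStat method of the listFiles iterator can be sketched roughly as below. This is a sketch written against the surrounding anonymous-iterator state in FileSystem.java (curFile, curItor, itors, recursive, and FileSystem's SLF4J LOGGER), not necessarily the exact committed change; the catch clause and log message are illustrative.

{code:java}
      // Sketch of the defensive handling inside FileSystem#listFiles's
      // iterator; not necessarily the exact code committed for HADOOP-18662.
      private void handleFileStat(LocatedFileStatus stat) throws IOException {
        if (stat.isFile()) { // file
          curFile = stat;
        } else if (recursive) { // directory
          try {
            // Resolve the child listing first, so the current iterator is
            // only pushed once the child is known to still exist.
            RemoteIterator<LocatedFileStatus> newDirItor =
                listLocatedStatus(stat.getPath());
            itors.push(curItor);
            curItor = newDirItor;
          } catch (FileNotFoundException ignored) {
            // The directory was deleted between the parent listing and the
            // recursive descent; skip it rather than fail the whole listing.
            LOGGER.debug("Directory {} deleted while listing recursively; skipping",
                stat.getPath());
          }
        }
      }
{code}

Resolving the child listing before pushing curItor keeps the iterator stack consistent when the child has already been deleted: the FileNotFoundException is swallowed and the walk simply continues with the remaining entries.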



