[
https://issues.apache.org/jira/browse/HADOOP-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703507#comment-17703507
]
ASF GitHub Bot commented on HADOOP-18662:
-----------------------------------------
ayushtkn commented on code in PR #5477:
URL: https://github.com/apache/hadoop/pull/5477#discussion_r1144296150
##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java:
##########
@@ -2413,8 +2413,13 @@ private void handleFileStat(LocatedFileStatus stat)
throws IOException {
if (stat.isFile()) { // file
curFile = stat;
} else if (recursive) { // directory
- itors.push(curItor);
- curItor = listLocatedStatus(stat.getPath());
+ try {
+ RemoteIterator<LocatedFileStatus> newDirItor =
listLocatedStatus(stat.getPath());
Review Comment:
That wasn't an issue. If you do ``listLocatedStatus(stat.getPath());`` and
the path is a file, it will still return an iterator with just one entry which
is that file, and it will again loop on that iterator find the first entry as
file and add it to the list, So, that won't break but I have added a mocked
test as well for this case.
> ListFiles with recursive fails with FNF
> ---------------------------------------
>
> Key: HADOOP-18662
> URL: https://issues.apache.org/jira/browse/HADOOP-18662
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
>
> Problem triggers in HDFS, but the change is in Hadoop-Common, Since the
> listFiles is defined in Hadoop-Common.
> Scenario:
> ListFiles With recursive:
> * Fetches a dir say /dir, which has some /dir/s1...s10
> * Recursive is set to true: It goes and tries on say /dir/s5 and /dir/s5 got
> deleted by that time
> * The entire operation fails with FNF
> Hive Cleaner uses listFiles with recursive true and this impacts that
> {noformat}
> 2023-03-06 07:45:48,331 ERROR
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner:
> [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to
> complete cleaning of
> id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId:
> null,initiatorId: null java.io.FileNotFoundException: File
> hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550
> does not exist.
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
> at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144)
> at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309)
> at
> org.apache.hadoop.util.functional.RemoteIterators$WrappingRemoteIterator.sourceHasNext(RemoteIterators.java:432)
> at
> org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.fetch(RemoteIterators.java:581)
> at
> org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.hasNext(RemoteIterators.java:602)
> at
> org.apache.hadoop.hive.ql.io.AcidUtils.getHdfsDirSnapshots(AcidUtils.java:1435)
> at
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner.removeFiles(Cleaner.java:287)
> at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(Cleaner.java:214)
> at
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner.lambda$run$0(Cleaner.java:114)
> at
> org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil$ThrowingRunnable.lambda$unchecked$0(CompactorUtil.java:54)
> at
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:750){noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]