XComp edited a comment on pull request #18692:
URL: https://github.com/apache/flink/pull/18692#issuecomment-1040317952


   > I'm wondering if that isn't yet again a minio issue. From everything I 
have read presto assumes that everything is an empty directory by default 
(hence why mkdirs doesn't do anything). So unless you check that a directory 
does not exist things should be fine. 
https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/directory_markers.html
   
   After looking through the Presto code once, we can conclude that it's not a 
Minio issue and the change is required. See the explanation below
   
   > Did you step through PrestoS3FileSystem to see which call fails? How does 
the test (FileSystemBehaviorTestSuite.testMkdirsCreatesParentDirectories) 
actually fail? Is the assertion failing, or is the FS throwing a 
FileNotFoundException.
   > I'm asking because FileSystem#exists only checks whether getFileStatus 
returns null, but I don't see a codepath in the presto filesystem where that 
can be the case.
   
   I guess, I see where the confusion is coming: I missed the fact that the 
`FileNotFoundException` is handled in the `hadoop-common` FileSystem 
implementation (see 
[FileSystem:1401](https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1401)).
 Hence, `exists()` doesn't cause problems but the `getFileStatus` 
implementation which is used in the pre-PR implementation of 
`FileSystemJobResultStore` (see my [comment 
above](https://github.com/apache/flink/pull/18692#discussion_r806857666) for 
the code path).
   
   Hence, change in question (removing the directory check) is still necessary 
for `PrestoS3FileSystem`. Otherwise, we would run into `FileNotFoundException`s 
when the base directory is empty.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to