bgaborg commented on a change in pull request #1407: HADOOP-16490. Improve 
S3Guard handling of FNFEs in copy
URL: https://github.com/apache/hadoop/pull/1407#discussion_r322222712
 
 

 ##########
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##########
 @@ -2587,6 +2594,30 @@ S3AFileStatus innerGetFileStatus(final Path f,
     entryPoint(INVOCATION_GET_FILE_STATUS);
     checkNotClosed();
     final Path path = qualify(f);
+    return resolveFileStatus(path, needEmptyDirectoryFlag, false);
+  }
+
+
+  /**
+   * Get the status of a file or directory, first through S3Guard and then
+   * through S3.
+   * The S3 probes can leave 404 responses in the S3 load balancers; if
+   * a check is only needed for a directory, declaring this saves time and
+   * avoids creating one for the object.
+   * When only probing for directories, if an entry for a file is found in
+   * S3Guard it is returned, but checks for updated values are skipped.
+   * @param path fully qualified path
+   * @param needEmptyDirectoryFlag if true, implementation will calculate
+   *        a TRUE or FALSE value for {@link S3AFileStatus#isEmptyDirectory()}
+   * @param onlyProbeForDirectory only perform the directory probes.
+   * @return a S3AFileStatus object
+   * @throws FileNotFoundException when the path does not exist
+   * @throws IOException on other problems.
+   */
+  private S3AFileStatus resolveFileStatus(final Path path,
 
 Review comment:
   I don't like this: why do we need another wrapper for this call? 
`getFileStatus` calls `innerGetFileStatus`, which calls `resolveFileStatus`, 
and I don't see why the last call is needed here. IMHO there's no need for 
another method in the same class. It will just be another command+click for 
most of us in the IDE, and I don't see any particular gain from it. A better 
way would be to factor the method out into its own class, or at least to 
create a JIRA for this.
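
To illustrate the "own class" option, here is a minimal sketch of the idea. 
Everything in it is hypothetical: `FileStatusResolver`, `SketchFileSystem`, 
the `Kind` enum and the two in-memory maps are stand-ins for illustration, not 
existing hadoop-aws code. It only mirrors the behaviour described in the 
javadoc above (check S3Guard first, then S3, optionally skipping the file 
probe).

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper, not part of hadoop-aws: owns the status-resolution
 * logic so that the filesystem class only delegates to it.
 */
class FileStatusResolver {
  enum Kind { FILE, DIRECTORY }

  /** Stand-in for S3Guard (assumption: a simple path-to-kind map). */
  private final Map<String, Kind> metadataStore = new HashMap<>();
  /** Stand-in for S3 itself (assumption: a simple path-to-kind map). */
  private final Map<String, Kind> objectStore = new HashMap<>();

  void addCached(String path, Kind kind) {
    metadataStore.put(path, kind);
  }

  void addStored(String path, Kind kind) {
    objectStore.put(path, kind);
  }

  /**
   * Resolve a path, first through the metadata store and then through
   * the object store.
   * @param path path to look up
   * @param onlyProbeForDirectory if true, skip the file probe against
   *        the object store; cached entries are still returned as-is
   * @return the kind of entry found
   * @throws FileNotFoundException when nothing is found
   */
  Kind resolve(String path, boolean onlyProbeForDirectory)
      throws FileNotFoundException {
    Kind cached = metadataStore.get(path);
    if (cached != null) {
      // A cached entry is returned even when it is a file.
      return cached;
    }
    Kind stored = objectStore.get(path);
    if (stored == null
        || (onlyProbeForDirectory && stored == Kind.FILE)) {
      throw new FileNotFoundException("No such file or directory: " + path);
    }
    return stored;
  }
}

/** Hypothetical filesystem facade: a single hop into the resolver. */
class SketchFileSystem {
  private final FileStatusResolver resolver;

  SketchFileSystem(FileStatusResolver resolver) {
    this.resolver = resolver;
  }

  FileStatusResolver.Kind getFileStatus(String path)
      throws FileNotFoundException {
    return resolver.resolve(path, false);
  }
}
```

With something like this, the chain inside the filesystem class stays at one 
delegation, and the directory-only probe can be tested on the resolver alone.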

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

