[
https://issues.apache.org/jira/browse/HADOOP-17134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159234#comment-17159234
]
Steve Loughran commented on HADOOP-17134:
-----------------------------------------
I'm not sure we need to worry about this. Listing a file is the specific
operation we are optimising away from, because in the production code we've
seen, listLocatedStatus() is only ever called against directories.
We could fix it by replicating the relevant code from innerGetFileStatus, which
looked in S3Guard for the file, and doing that first. That would add a DDB call
to every directory listing on the path we are now optimising for.
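To put rough numbers on that trade-off, here is a toy cost model. It is only a sketch, not the S3AFileSystem code: the class ProbeFirstTradeoff, the cost() helper and the sample paths are all made up for illustration, and it assumes each S3Guard lookup or children query costs exactly one DDB call.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy cost model; hypothetical, not the real S3AFileSystem or S3Guard code. */
class ProbeFirstTradeoff {
    /** Assumed store contents: path -> true for a file, false for a directory. */
    static final Map<String, Boolean> ENTRIES = new TreeMap<>();
    static {
        ENTRIES.put("/data", false);
        ENTRIES.put("/data/part-0000", true);
    }

    /** Returns {DDB calls, S3 LIST calls} for one listing of the given path. */
    static int[] cost(String path, boolean probeFirst) {
        int ddb = 0;
        int list = 0;
        if (probeFirst) {
            ddb++; // the proposed extra lookup of the path's own record
            if (Boolean.TRUE.equals(ENTRIES.get(path))) {
                return new int[] {ddb, list}; // known file: no listing needed
            }
        }
        ddb++; // children query against the store
        boolean hasChildren = false;
        for (String key : ENTRIES.keySet()) {
            if (key.startsWith(path + "/")) {
                hasChildren = true;
            }
        }
        if (!hasChildren) {
            list++; // store returned no children, so S3 is listed
            ddb++;  // final fallback: look up the path's own record
        }
        return new int[] {ddb, list};
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/data", "/data/part-0000"}) {
            for (boolean probe : new boolean[] {false, true}) {
                int[] c = cost(path, probe);
                System.out.println(path + " probeFirst=" + probe
                    + " ddb=" + c[0] + " list=" + c[1]);
            }
        }
    }
}
```

Under those assumptions, probing first saves the LIST when the path is a file, but a directory listing goes from one DDB call to two.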
> S3AFileSystem.listLocatedStatus(file) does a LIST even with S3Guard
> -------------------------------------------------------------------
>
> Key: HADOOP-17134
> URL: https://issues.apache.org/jira/browse/HADOOP-17134
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Priority: Minor
>
> This is minor and we may want to WONTFIX; noticed during work on directory
> markers.
> If you call listLocatedStatus(file), a LIST call is always made to S3,
> even when S3Guard is present and has the record to say "this is a file".
> Does this matter enough to fix?
> # The HADOOP-16465 work moved the listing ahead of the fallback to getFileStatus.
> # That listing calls s3guard.listChildren(path) to list the children,
> # which only returns the children of a path, not a record of the path itself.
> # So we get an empty list back, triggering the LIST.
> # It's only after that LIST fails that we fall back to getFileStatus and hence
> look for the actual file record.
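
The sequence above can be sketched as a small model that records each call in order. This is an illustration only, not the S3AFileSystem implementation: the class ListLocatedStatusFlow, its fields and listObjectsInS3() are hypothetical names, the S3Guard table is faked with an in-memory map, and the model holds no S3 objects, so the LIST always comes back empty (which matches the file case being discussed).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the lookup order described above; not the real S3A code. */
class ListLocatedStatusFlow {
    /** Hypothetical stand-in for S3Guard: path -> true for a file, false for a directory. */
    final Map<String, Boolean> metadataStore = new TreeMap<>();
    /** Operations in the order they were issued, so the extra LIST is visible. */
    final List<String> calls = new ArrayList<>();

    /** Returns only entries beneath the path, never a record of the path itself. */
    List<String> listChildren(String path) {
        calls.add("s3guard.listChildren");
        List<String> children = new ArrayList<>();
        for (String key : metadataStore.keySet()) {
            if (key.startsWith(path + "/")) {
                children.add(key);
            }
        }
        return children;
    }

    /** Stand-in for the S3 LIST; this model has no objects, so it finds nothing. */
    List<String> listObjectsInS3(String path) {
        calls.add("S3 LIST");
        return new ArrayList<>();
    }

    /** Looks up the record of the path itself; null if the store has none. */
    String getFileStatus(String path) {
        calls.add("getFileStatus");
        Boolean isFile = metadataStore.get(path);
        if (isFile == null) {
            return null;
        }
        return isFile ? "file" : "directory";
    }

    List<String> listLocatedStatus(String path) {
        List<String> result = listChildren(path);   // step 2: children only
        if (!result.isEmpty()) {
            return result;
        }
        result = listObjectsInS3(path);             // step 4: empty list triggers the LIST
        if (!result.isEmpty()) {
            return result;
        }
        if (getFileStatus(path) != null) {          // step 5: fallback finds the record
            result.add(path);
        }
        return result;
    }

    public static void main(String[] args) {
        ListLocatedStatusFlow fs = new ListLocatedStatusFlow();
        fs.metadataStore.put("/data/part-0000", true);
        fs.listLocatedStatus("/data/part-0000");
        System.out.println(fs.calls);
    }
}
```

For a path the store already knows to be a file, the recorded calls are the children query, then the S3 LIST, then getFileStatus, which is the ordering the issue describes.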
--
This message was sent by Atlassian Jira
(v8.3.4#803005)