[ https://issues.apache.org/jira/browse/HADOOP-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343827#comment-17343827 ]
Wei-Chiu Chuang commented on HADOOP-17400:
------------------------------------------
[~mukund-thakur] There's only one open Jira under this umbrella. Shall we
resolve this Jira and move the open one (HADOOP-17335) to the Uber S3A Hadoop
3.4.0 Jira?
> Optimize S3A for maximum performance in directory listings
> ----------------------------------------------------------
>
> Key: HADOOP-17400
> URL: https://issues.apache.org/jira/browse/HADOOP-17400
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Mukund Thakur
> Priority: Major
>
> Make listing in applications as fast as we can get it, especially for query
> planning.
> * Optimize all operations used to list directories (for query planning etc.)
> for their primary use, which is being passed directories (not files), and
> make that case faster even at the expense of more remote IO when handed
> files or empty directories.
> * Remove needless calls to S3 wherever possible (e.g. {{getFileStatus("/")}},
> making bucket existence probes optional).
> * Support/enable asynchronous IO where possible.
>
> Review higher-level APIs (glob status) and their uses in the FsShell, and
> optimize their use by minimising invocations or FS API calls, with the bonus
> goal of reducing/minimising the risk of 404 caching.
> Work with downstream projects to move to the FS APIs which work best in this
> world, primarily the recursive listing operations and those which return
> RemoteIterator<FileStatus>, and so make any asynchronous page fetching
> operations useful.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)