[
https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392423#comment-15392423
]
Abdullah Yousufi commented on HIVE-14165:
-----------------------------------------
Thanks for the clarification, Steve. Looking forward to that O(files/1000)
recursive list.
> Enable faster S3 Split Computation by listing files in blocks
> -------------------------------------------------------------
>
> Key: HIVE-14165
> URL: https://issues.apache.org/jira/browse/HIVE-14165
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Abdullah Yousufi
> Assignee: Abdullah Yousufi
>
> During split computation, when a large number of files must be listed from
> S3, instead of executing 1 API call per file, one can optimize by listing up
> to 1000 files in each API call. This would reduce the number of requests from
> one per file to about one per 1000 files, and with it the time required for
> listing files.
> Qubole has this optimization in place as detailed here:
> https://www.qubole.com/blog/product/optimizing-hadoop-for-s3-part-1/?nabe=5695374637924352:0
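
To make the request-count difference concrete, here is a minimal sketch in Python. It is not Hive's or Qubole's code: `FakeBucket`, `head`, `list_page`, and `list_all` are hypothetical names for an in-memory stand-in that only counts calls. The one fact it relies on is that a single S3 list request returns at most 1000 keys.

```python
import math

PAGE_SIZE = 1000  # an S3 list request returns at most 1000 keys


class FakeBucket:
    """Hypothetical in-memory stand-in for an S3 bucket that counts API calls."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.calls = 0

    def head(self, key):
        """One call per file: the pattern the issue wants to avoid."""
        self.calls += 1
        return key in self.keys

    def list_page(self, prefix, start_after=""):
        """One call returns up to PAGE_SIZE keys under prefix, in key order."""
        self.calls += 1
        matching = [k for k in self.keys
                    if k.startswith(prefix) and k > start_after]
        page = matching[:PAGE_SIZE]
        truncated = len(matching) > PAGE_SIZE
        return page, truncated


def list_all(bucket, prefix):
    """Collect every key under prefix by paging through list calls."""
    keys, start_after, truncated = [], "", True
    while truncated:
        page, truncated = bucket.list_page(prefix, start_after)
        keys.extend(page)
        if page:
            start_after = page[-1]  # resume after the last key seen
    return keys


def calls_needed(num_files):
    """List calls needed for num_files keys (at least one call is always made)."""
    return max(1, math.ceil(num_files / PAGE_SIZE))


if __name__ == "__main__":
    names = [f"warehouse/t/part-{i:05d}" for i in range(2500)]

    per_file = FakeBucket(names)
    for name in names:
        per_file.head(name)

    batched = FakeBucket(names)
    found = list_all(batched, "warehouse/t/")

    print(per_file.calls, batched.calls, len(found))
```

With 2500 files under one prefix, the per-file loop issues 2500 calls while the paged listing issues 3, which is the O(files/1000) behaviour mentioned in the comment above.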
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)