[ https://issues.apache.org/jira/browse/ARROW-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304441#comment-17304441 ]
Ian Cook commented on ARROW-11558:
----------------------------------
Amazon S3 Object Lambda (announced today at
https://aws.amazon.com/blogs/aws/introducing-amazon-s3-object-lambda-use-your-code-to-process-data-as-it-is-being-retrieved-from-s3/)
seems like a better way to achieve the goals described here.
> [C++] Push down projection and selection to S3 Select
> -----------------------------------------------------
>
> Key: ARROW-11558
> URL: https://issues.apache.org/jira/browse/ARROW-11558
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Ian Cook
> Priority: Major
> Labels: filesystem
>
> Amazon S3 Select [1], an S3 feature generally available since April 2018 [2],
> can improve S3 read performance by allowing S3 clients to use a limited
> subset of SQL to specify projection and selection [3] on data in some formats
> [4]. It would be interesting to try using this in Arrow and to measure its
> effects on S3 read performance under various conditions.
> [1] [https://aws.amazon.com/blogs/aws/s3-glacier-select/]
> [2] [https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-s3-select-is-now-generally-available/]
> [3] [https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-select.html]
> [4] [https://docs.aws.amazon.com/cli/latest/reference/s3api/select-object-content.html]
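
For illustration only, here is a minimal sketch of what pushing a projection and a selection down to S3 Select could look like from a client. It is not Arrow code and not a proposed design. The helper name build_select_request, the bucket, the key, and the column names are all hypothetical; the returned dictionary uses the parameter names of the S3 SelectObjectContent request (Bucket, Key, ExpressionType, Expression, InputSerialization, OutputSerialization), and Parquet input with CSV output is assumed.

```python
def build_select_request(bucket, key, columns=None, predicate=None):
    """Build keyword arguments for an S3 SelectObjectContent request.

    Hypothetical helper: `columns` is the projection, `predicate` is the
    selection, written in the S3 Select SQL subset with the alias `s`.
    """
    # Projection: name only the requested columns, or fall back to all of them.
    projection = ", ".join(f"s.{c}" for c in columns) if columns else "*"
    expression = f"SELECT {projection} FROM S3Object s"
    # Selection: append the filter so S3 evaluates it before returning rows.
    if predicate:
        expression += f" WHERE {predicate}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Assumption: the object is a Parquet file and CSV output is acceptable.
        "InputSerialization": {"Parquet": {}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical bucket, key, and columns, for illustration only.
request = build_select_request(
    "example-bucket", "data/trips.parquet", ["fare", "distance"], "s.distance > 10"
)
print(request["Expression"])
```

With boto3 installed and credentials configured, the dictionary could be passed as boto3.client("s3").select_object_content(**request); that call is left out here so the sketch runs without network access.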
--
This message was sent by Atlassian Jira
(v8.3.4#803005)