[ https://issues.apache.org/jira/browse/HADOOP-19348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmar Suhail resolved HADOOP-19348.
-----------------------------------
    Resolution: Fixed

> S3A: Add initial support for analytics-accelerator-s3
> -----------------------------------------------------
>
>                 Key: HADOOP-19348
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19348
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.2
>            Reporter: Ahmar Suhail
>            Assignee: Ahmar Suhail
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0, 3.4.2
>
>
> S3 recently released the [Analytics Accelerator Library for Amazon
> S3|https://github.com/awslabs/analytics-accelerator-s3] as an Alpha release.
> The library provides an input stream, with an initial goal of improving
> performance for Apache Spark workloads on Parquet datasets.
> For example, it implements optimisations such as footer prefetching, and so
> avoids the multiple GETs that S3AInputStream currently makes for the footer
> bytes and PageIndex structures.
> The library also uses the Parquet metadata to track the columns currently
> being read by a query, and then prefetches those bytes when Parquet files
> with the same schema are opened.
> This ticket tracks the work required for the basic initial integration. There
> is still more work to be done, such as vectored IO support, which we will
> identify and follow up on.
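
To illustrate why footer prefetching reduces the number of GETs, here is a minimal sketch in Python. It is not the library's or S3A's implementation. It relies only on the Parquet file layout, in which a file ends with the footer, a 4-byte little-endian footer length, and the magic bytes `PAR1`. `CountingStore`, `make_file`, `read_footer_naive`, `read_footer_prefetch` and the 64 KiB default tail size are all hypothetical names and values chosen for this example; the object size is assumed to be known already, and the fake footer is an arbitrary byte string rather than real Thrift metadata.

```python
import struct

MAGIC = b"PAR1"  # Parquet files start and end with these four bytes


class CountingStore:
    """Hypothetical in-memory stand-in for an object store; counts ranged GETs."""

    def __init__(self, data: bytes):
        self.data = data
        self.gets = 0

    def size(self) -> int:
        # Assumed to be known up front (for example from a listing); not counted.
        return len(self.data)

    def get_range(self, start: int, end: int) -> bytes:
        """Return bytes [start, end) and count one request."""
        self.gets += 1
        return self.data[start:end]


def make_file(body: bytes, footer: bytes) -> bytes:
    """Build a Parquet-shaped blob: magic, body, footer, footer length, magic."""
    return MAGIC + body + footer + struct.pack("<I", len(footer)) + MAGIC


def _footer_length(trailer: bytes) -> int:
    """Parse the final 8 bytes: 4-byte little-endian length, then the magic."""
    if trailer[4:] != MAGIC:
        raise ValueError("not a Parquet-shaped file")
    (length,) = struct.unpack("<I", trailer[:4])
    return length


def read_footer_naive(store: CountingStore) -> bytes:
    """Two GETs: one for the 8-byte trailer, one for the footer itself."""
    size = store.size()
    footer_len = _footer_length(store.get_range(size - 8, size))
    return store.get_range(size - 8 - footer_len, size - 8)


def read_footer_prefetch(store: CountingStore, prefetch_size: int = 64 * 1024) -> bytes:
    """One GET for the file tail; a second only if the footer does not fit."""
    size = store.size()
    tail_start = max(0, size - prefetch_size)
    tail = store.get_range(tail_start, size)
    needed = _footer_length(tail[-8:]) + 8
    if needed > len(tail):
        # Footer is larger than the prefetched tail: fetch the missing prefix.
        tail = store.get_range(size - needed, tail_start) + tail
    return tail[-needed:-8]
```

With a footer that fits in the prefetched tail, `read_footer_prefetch` issues one request where `read_footer_naive` issues two; when the footer is larger than the tail, the sketch falls back to a second request.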