[
https://issues.apache.org/jira/browse/HADOOP-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860412#comment-17860412
]
Steve Loughran commented on HADOOP-19211:
-----------------------------------------
It'd be great to see this, and to get any performance numbers you have. We saw
improvements of up to 30-40% in TPC-DS queries with Spark.
* Parquet 1.14.1 ships with support for this; just turn it on. Note: it only
supports on-heap byte buffers, not direct ones, because while implementing it we
discovered HADOOP-19101 in shipping releases; turning it off for direct buffers
avoids having to worry about whether the runtime has the fix or not.
* Does this store support multiple ranges in a single GET request? AWS S3
doesn't (though EMC's store does). If AWS S3 did, we would use it for
more efficient queries.
* Have a look at HADOOP-18855 for ongoing work; failure recovery is something I
want to see. I think we may need to extend the API slightly.
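
To make the multi-range question concrete, here is a minimal sketch of how
several (offset, length) ranges could be folded into one HTTP Range header
value, using the standard "bytes=first-last,first-last" syntax. The class and
method names (MultiRangeHeader, ByteRange, rangeHeaderValue) are hypothetical
and not part of Hadoop or any OSS SDK, and whether the store accepts more than
one range per GET is exactly the open question above.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not a Hadoop or OSS SDK class.
public class MultiRangeHeader {

    /** One requested range: start offset and length, both in bytes. */
    static final class ByteRange {
        final long offset;
        final int length;

        ByteRange(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Build a Range header value covering all the given ranges.
     * HTTP byte ranges are inclusive, so the last byte is offset + length - 1.
     */
    static String rangeHeaderValue(List<ByteRange> ranges) {
        if (ranges.isEmpty()) {
            throw new IllegalArgumentException("at least one range is required");
        }
        StringJoiner joiner = new StringJoiner(",", "bytes=", "");
        for (ByteRange r : ranges) {
            if (r.offset < 0 || r.length <= 0) {
                throw new IllegalArgumentException("invalid range");
            }
            long last = r.offset + r.length - 1;
            joiner.add(r.offset + "-" + last);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<ByteRange> ranges =
            List.of(new ByteRange(0, 100), new ByteRange(4096, 512));
        // prints bytes=0-99,4096-4607
        System.out.println(rangeHeaderValue(ranges));
    }
}
```

A store that does not support this would need one GET per range (or per merged
group of nearby ranges) instead.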
> AliyunOSS: Support vectored read API
> ------------------------------------
>
> Key: HADOOP-19211
> URL: https://issues.apache.org/jira/browse/HADOOP-19211
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/oss
> Affects Versions: 3.2.4, 3.3.6
> Reporter: wujinhu
> Assignee: wujinhu
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)