[
https://issues.apache.org/jira/browse/HADOOP-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866686#comment-17866686
]
ASF GitHub Bot commented on HADOOP-19211:
-----------------------------------------
steveloughran commented on PR #6904:
URL: https://github.com/apache/hadoop/pull/6904#issuecomment-2232972993
> , it would be good to create a VectoredInputStream which takes the actual
> DataInputStream as input and then all the Object stores like abfs, s3 and
> aliyun extending this VectoredInputStream. Not really sure if this is feasible
> and will work.
mixed feelings.
* abfs is the most advanced in terms of prefetch and block cache, openFile()
support
* classic s3a does vector IO, IOStatistics context, but is reaching EOL.
* don't know about the others
s3a prefetch stream is not ready for real use; #5832 does a lot of this. I'd
like to get that in, just to show some progress.
if we were to do a new stream, I'd want
* block structure underneath
* openFile() length, read policies, split start end to frame cache
* footer prefetch cache for orc/parquet files
* unbuffer() frees block cache
* prefetching disabled on columnar formats opened with openFile read policy
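
As a rough illustration of the "block structure underneath" and "unbuffer()
frees block cache" items, here is a minimal, self-contained Java sketch.
Everything in it is hypothetical: BlockCachedReader, its method names, and the
in-memory byte array standing in for the remote object are not Hadoop APIs, and
the sketch has no prefetching, eviction, or thread safety.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not Hadoop code: reads go through fixed-size blocks
 * that are fetched on demand and cached until unbuffer() is called.
 */
class BlockCachedReader {
  private final byte[] store;      // stands in for the remote object
  private final int blockSize;
  private final Map<Integer, byte[]> cache = new HashMap<>();
  private int fetches;             // number of simulated remote reads

  BlockCachedReader(byte[] store, int blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    this.store = store;
    this.blockSize = blockSize;
  }

  /** Read one byte at an absolute position; returns -1 at or past the end. */
  int read(long position) {
    if (position < 0) {
      throw new IllegalArgumentException("negative position");
    }
    if (position >= store.length) {
      return -1;
    }
    int blockIndex = (int) (position / blockSize);
    byte[] block = cache.computeIfAbsent(blockIndex, this::fetchBlock);
    return block[(int) (position % blockSize)] & 0xFF;
  }

  /** Simulated remote GET of a single block; the last block may be short. */
  private byte[] fetchBlock(int blockIndex) {
    fetches++;
    int start = blockIndex * blockSize;
    int end = Math.min(start + blockSize, store.length);
    return Arrays.copyOfRange(store, start, end);
  }

  /** Drop every cached block; later reads fetch again. */
  void unbuffer() {
    cache.clear();
  }

  int cachedBlocks() {
    return cache.size();
  }

  int fetchCount() {
    return fetches;
  }
}
```

In this sketch, reads that fall in the same block cost one fetch, and a read
after unbuffer() pays for a new fetch. That is the trade-off unbuffer() is
meant to expose: memory is released while the stream stays open.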
what we should do is factor out commonality and put into common.
on that note, if anyone could take up #6773 and #1747 to create contract
tests for ByteBufferPositionedReadable, we could share them with all impls of
vector io
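
To make the idea of a shared contract test concrete, here is a small Java
sketch of one set of checks run against any implementation. It is only an
assumption of what such a test might look like: PositionedBufferReader is a
local stand-in rather than the Hadoop interface, ArrayReader and
PositionedReadContract are hypothetical names, and the three expectations
(exact range from readFully, -1 from read at EOF, EOFException from readFully
past EOF) are illustrative rather than taken from the PRs mentioned above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;

/** Local stand-in for the interface under test; not the Hadoop type. */
interface PositionedBufferReader {
  int read(long position, ByteBuffer buf) throws IOException;
  void readFully(long position, ByteBuffer buf) throws IOException;
}

/** Trivial in-memory implementation, used only to exercise the checks. */
class ArrayReader implements PositionedBufferReader {
  private final byte[] data;

  ArrayReader(byte[] data) {
    this.data = data;
  }

  @Override
  public int read(long position, ByteBuffer buf) {
    if (position >= data.length) {
      return -1;
    }
    int n = (int) Math.min(buf.remaining(), data.length - position);
    buf.put(data, (int) position, n);
    return n;
  }

  @Override
  public void readFully(long position, ByteBuffer buf) throws IOException {
    if (position + buf.remaining() > data.length) {
      throw new EOFException("range extends past end of data");
    }
    buf.put(data, (int) position, buf.remaining());
  }
}

/** Checks that every implementation would be expected to pass. */
final class PositionedReadContract {
  static void verify(PositionedBufferReader reader, byte[] expected)
      throws IOException {
    check(expected.length > 0, "need at least one byte of test data");

    // 1. readFully returns exactly the requested range.
    int mid = expected.length / 2;
    ByteBuffer buf = ByteBuffer.allocate(expected.length - mid);
    reader.readFully(mid, buf);
    buf.flip();
    for (int i = mid; i < expected.length; i++) {
      check(buf.get() == expected[i], "readFully mismatch at offset " + i);
    }

    // 2. read at EOF returns -1 and leaves the buffer untouched.
    ByteBuffer atEof = ByteBuffer.allocate(4);
    check(reader.read(expected.length, atEof) == -1, "read at EOF must be -1");
    check(atEof.position() == 0, "buffer must not move at EOF");

    // 3. readFully past EOF raises EOFException.
    boolean raised = false;
    try {
      reader.readFully(expected.length - 1, ByteBuffer.allocate(2));
    } catch (EOFException e) {
      raised = true;
    }
    check(raised, "readFully past EOF must raise EOFException");
  }

  private static void check(boolean ok, String message) {
    if (!ok) {
      throw new AssertionError(message);
    }
  }
}
```

Each store would then pass its own stream to
PositionedReadContract.verify(reader, expectedBytes), so the expectations live
in one place and every implementation is held to the same behaviour.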
> AliyunOSS: Support vectored read API
> ------------------------------------
>
> Key: HADOOP-19211
> URL: https://issues.apache.org/jira/browse/HADOOP-19211
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/oss
> Affects Versions: 3.2.4, 3.3.6
> Reporter: wujinhu
> Assignee: wujinhu
> Priority: Major
> Labels: pull-request-available
>