[jira] [Commented] (HADOOP-19211) AliyunOSS: Support vectored read API

ASF GitHub Bot (Jira) Wed, 17 Jul 2024 03:29:06 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866686#comment-17866686
 ]


ASF GitHub Bot commented on HADOOP-19211:
-----------------------------------------

steveloughran commented on PR #6904:
URL: https://github.com/apache/hadoop/pull/6904#issuecomment-2232972993

   > , it would be good to create a VectoredInputStream which takes the actual 
DataInputStream as input and then have all the object stores like abfs, s3 and 
aliyun extend this VectoredInputStream. Not really sure if this is feasible 
and will work.
   
   mixed feelings. 
   
   * abfs is the most advanced in terms of prefetch and block cache, openFile() 
support
   * classic s3a does vector IO and IOStatistics context, but is reaching EOL.
   * don't know about the others
   
   s3a prefetch stream is not ready for real use; #5832 does a lot of this. I'd 
like that in just to show some progress.
   
   if we were to do a new stream, I'd want
   
   * block structure underneath
   * openFile() length, read policies, and split start/end used to frame the cache
   * footer prefetch cache for orc/parquet files
   * unbuffer() frees block cache
   * prefetching disabled on columnar formats opened with openFile read policy
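
   As a rough illustration of two items in the list above (block structure 
   underneath, and unbuffer() freeing the block cache), here is a minimal 
   sketch in plain Java. It is not code from any Hadoop module: 
   `BlockCachedReader`, `RangeSource`, `fetch()` and the counters are all 
   hypothetical names, and `RangeSource` merely stands in for a ranged GET 
   against an object store.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; not part of Hadoop or any object store connector. */
public class BlockCachedReader {

  /** Hypothetical stand-in for a ranged GET against an object store. */
  public interface RangeSource {
    byte[] fetch(long offset, int length);
    long length();
  }

  private final RangeSource source;
  private final int blockSize;
  // Block index -> block contents; this is the "block structure underneath".
  private final Map<Long, byte[]> blocks = new HashMap<>();
  private int fetchCount = 0;

  public BlockCachedReader(RangeSource source, int blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    this.source = source;
    this.blockSize = blockSize;
  }

  /** Positioned read; returns the number of bytes copied, or -1 at end of data. */
  public int read(long position, byte[] dest, int destOffset, int length) {
    if (position < 0) {
      throw new IllegalArgumentException("negative position");
    }
    if (position >= source.length()) {
      return -1;
    }
    int toRead = (int) Math.min(length, source.length() - position);
    int done = 0;
    while (done < toRead) {
      long pos = position + done;
      byte[] block = loadBlock(pos / blockSize);
      int inBlock = (int) (pos % blockSize);
      int n = Math.min(toRead - done, block.length - inBlock);
      System.arraycopy(block, inBlock, dest, destOffset + done, n);
      done += n;
    }
    return done;
  }

  /** Returns the cached block, fetching it from the source on a cache miss. */
  private byte[] loadBlock(long blockIndex) {
    byte[] cached = blocks.get(blockIndex);
    if (cached != null) {
      return cached;
    }
    long start = blockIndex * blockSize;
    int len = (int) Math.min(blockSize, source.length() - start);
    byte[] data = source.fetch(start, len);
    fetchCount++;
    blocks.put(blockIndex, data);
    return data;
  }

  /** Frees every cached block; later reads fetch from the source again. */
  public void unbuffer() {
    blocks.clear();
  }

  public int cachedBlocks() {
    return blocks.size();
  }

  public int fetchCount() {
    return fetchCount;
  }
}
```

   In this sketch a read spanning two blocks costs two fetches, repeating it 
   costs none, and calling unbuffer() drops the cached blocks so that an idle 
   stream holds no memory. Prefetching, read policies and footer caching are 
   deliberately left out.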
   
   
   what we should do is factor out the commonality and put it into common. 
   
   on that note, if anyone could take up #6773 and #1747 to create contract 
tests for ByteBufferPositionedReadable we could share that with all impls of 
vector io
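
   To make the idea of a shared contract test concrete, here is a minimal, 
   self-contained sketch. It is not the content of either PR. 
   `PositionedByteBufferReader` is a local stand-in that only mirrors the 
   shape of ByteBufferPositionedReadable, `InMemoryReader` and `verify()` are 
   hypothetical, and the expectations encoded here (readFully fills the 
   buffer, read at end of data returns -1, readFully past the end throws 
   EOFException) are assumptions for illustration rather than the filesystem 
   specification.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;

/** Hypothetical sketch of a reusable contract check; not Hadoop test code. */
public class PositionedReadContract {

  /** Local stand-in mirroring the shape of ByteBufferPositionedReadable (assumption). */
  public interface PositionedByteBufferReader {
    int read(long position, ByteBuffer buf) throws IOException;
    void readFully(long position, ByteBuffer buf) throws IOException;
  }

  /** Trivial in-memory implementation, used only to exercise the checks. */
  public static final class InMemoryReader implements PositionedByteBufferReader {
    private final byte[] data;

    public InMemoryReader(byte[] data) {
      this.data = data.clone();
    }

    @Override
    public int read(long position, ByteBuffer buf) {
      if (position < 0) {
        throw new IllegalArgumentException("negative position");
      }
      if (position >= data.length) {
        return -1;
      }
      int n = (int) Math.min(buf.remaining(), data.length - position);
      buf.put(data, (int) position, n);
      return n;
    }

    @Override
    public void readFully(long position, ByteBuffer buf) throws IOException {
      if (position < 0) {
        throw new IllegalArgumentException("negative position");
      }
      if (position + buf.remaining() > data.length) {
        throw new EOFException("read past end of data");
      }
      buf.put(data, (int) position, buf.remaining());
    }
  }

  /** Runs the checks against any implementation; throws AssertionError on a violation. */
  public static void verify(PositionedByteBufferReader reader, byte[] expected)
      throws IOException {
    if (expected.length < 4) {
      throw new IllegalArgumentException("need at least 4 bytes of test data");
    }

    // readFully in the middle of the data must fill the whole buffer.
    ByteBuffer mid = ByteBuffer.allocate(expected.length - 2);
    reader.readFully(1, mid);
    check(!mid.hasRemaining(), "readFully must fill the buffer");
    mid.flip();
    for (int i = 0; mid.hasRemaining(); i++) {
      check(mid.get() == expected[i + 1], "byte mismatch at offset " + (i + 1));
    }

    // read at end of data must return -1 and leave the buffer untouched.
    ByteBuffer atEof = ByteBuffer.allocate(4);
    check(reader.read(expected.length, atEof) == -1, "read at EOF must return -1");
    check(atEof.position() == 0, "read at EOF must not touch the buffer");

    // readFully crossing the end of data must fail with EOFException.
    ByteBuffer tooBig = ByteBuffer.allocate(4);
    boolean sawEof = false;
    try {
      reader.readFully(expected.length - 2, tooBig);
    } catch (EOFException e) {
      sawEof = true;
    }
    check(sawEof, "readFully past EOF must throw EOFException");
  }

  private static void check(boolean ok, String message) {
    if (!ok) {
      throw new AssertionError(message);
    }
  }
}
```

   The point of the shape is that `verify()` knows nothing about the store 
   behind the reader, so each connector would only need to supply its own 
   stream and the same test data.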




> AliyunOSS: Support vectored read API
> ------------------------------------
>
>                 Key: HADOOP-19211
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19211
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/oss
>    Affects Versions: 3.2.4, 3.3.6
>            Reporter: wujinhu
>            Assignee: wujinhu
>            Priority: Major
>              Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

