[jira] [Updated] (HADOOP-18179) Boost S3A Stream Read Performance

Steve Loughran (Jira) Fri, 21 Apr 2023 03:15:04 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-18179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Steve Loughran updated HADOOP-18179:
------------------------------------
    Summary: Boost S3A Stream Read Performance  (was: Boost S3A Stream Read 
Performance with prefetching and caching)

> Boost S3A Stream Read Performance
> ---------------------------------
>
>                 Key: HADOOP-18179
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18179
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.2
>            Reporter: Steve Loughran
>            Priority: Major
>
> Calibrate S3A input stream performance against recent applications/data 
> formats and improve where necessary.
> HADOOP-18028 is a key part of this, but there are other issues/opportunities:
> # We could add machine-parsable trace-level logging in FSDataInputStream to 
> collect stats on how stream APIs are invoked, so we can collect data from 
> real apps and analyze it.
> # Implement those APIs which some apps use (ByteBufferPositionedReadable), 
> not so much for the direct implementation as to get better information from 
> the app about its read plan.
> # The `normal` mode doesn't switch from sequential on forward seeks. Is that 
> always appropriate?
> # Choose different buffering options when doing whole-file IO vs sequential 
> vs random IO.
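
To make the forward-seek question concrete, here is a toy model of a read policy that starts sequential and may switch to random. It is a sketch only and not the S3A implementation: the class SeekPolicyModel, its methods, and the forwardSeekThreshold parameter are all hypothetical names invented for illustration. A threshold of Long.MAX_VALUE models the behaviour described above (forward seeks never switch); a finite threshold models one possible alternative.

```java
/** Toy model only; every name here is hypothetical and not part of Hadoop. */
final class SeekPolicyModel {
    enum Policy { SEQUENTIAL, RANDOM }

    private Policy policy = Policy.SEQUENTIAL; // the modelled mode starts sequential
    private long pos = 0;                      // current stream position in bytes
    // Assumption: forward jumps larger than this many bytes are treated as random IO.
    private final long forwardSeekThreshold;

    SeekPolicyModel(long forwardSeekThreshold) {
        this.forwardSeekThreshold = forwardSeekThreshold;
    }

    /** Move to target and update the policy based on the seek distance. */
    void seek(long target) {
        long delta = target - pos;
        if (delta < 0) {
            // Backward seek: switch to random.
            policy = Policy.RANDOM;
        } else if (delta > forwardSeekThreshold) {
            // Long forward seek: the hypothetical alternative raised in the issue.
            policy = Policy.RANDOM;
        }
        pos = target;
    }

    /** Model a read of len bytes by advancing the position. */
    void read(long len) {
        pos += len;
    }

    Policy policy() {
        return policy;
    }

    public static void main(String[] args) {
        SeekPolicyModel current = new SeekPolicyModel(Long.MAX_VALUE);
        current.seek(64L * 1024 * 1024);   // large forward seek
        System.out.println("no forward switch: " + current.policy());

        SeekPolicyModel alternative = new SeekPolicyModel(1024 * 1024);
        alternative.seek(64L * 1024 * 1024);
        System.out.println("with threshold:    " + alternative.policy());
    }
}
```

Whether a threshold like this is appropriate, and what value it should take, is exactly the kind of question the trace data from real applications would need to answer.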



