[ https://issues.apache.org/jira/browse/HADOOP-18179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-18179:
------------------------------------
    Summary: Boost S3A Stream Read Performance  (was: Boost S3A Stream Read Performance with prefetching and caching)

> Boost S3A Stream Read Performance
> ---------------------------------
>
>                 Key: HADOOP-18179
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18179
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.2
>            Reporter: Steve Loughran
>            Priority: Major
>
> Calibrate S3A input stream performance against recent applications/data
> formats and improve where necessary.
> HADOOP-18028 is a key part of this, but there are other issues/opportunities:
> # We could add machine-parsable trace-level logging in FSDataInputStream to
> record how the stream APIs are invoked, then collect data from real apps and
> analyze it.
> # Implement those APIs which some apps use (ByteBufferPositionedReadable),
> not so much for direct implementation as to get better information from the
> app about its read plan.
> # The `normal` mode doesn't switch from sequential on forward seeks. Is that
> always appropriate?
> # Choose different buffering options when doing whole-file IO vs sequential
> vs random.
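To make the first idea concrete, here is a minimal sketch of what machine-parsable call tracing could look like. It is not Hadoop code: TracingInputStream is a hypothetical name, it wraps a plain java.io.InputStream rather than FSDataInputStream, and it keeps events in an in-memory list where a real implementation would presumably emit them through a trace-level logger. Each call becomes one tab-separated record of operation, starting offset, and byte count.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical wrapper (not part of Hadoop) that records every read/skip
 * call as a tab-separated line: operation, start offset, byte count.
 */
final class TracingInputStream extends FilterInputStream {
    private final List<String> events = new ArrayList<>();
    private long pos = 0; // offset of the next byte, tracked locally

    TracingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        long start = pos;
        int b = super.read();
        if (b >= 0) {
            pos++;
        }
        // A count of -1 marks end of stream.
        record("read", start, b >= 0 ? 1 : -1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        long start = pos;
        int n = super.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        record("read", start, n);
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long start = pos;
        long skipped = super.skip(n);
        pos += skipped;
        record("skip", start, skipped);
        return skipped;
    }

    private void record(String op, long position, long count) {
        events.add(op + "\t" + position + "\t" + count);
    }

    /** Recorded events, in call order. */
    List<String> events() {
        return events;
    }
}
```

Wrapping a ByteArrayInputStream and issuing a bulk read, a skip, and a single-byte read yields three records that a script can split on tabs. Offsets and lengths in this form are enough to tell sequential from random access after the fact, which is the kind of data the list above asks for.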