Steve Loughran created HADOOP-15245:
---------------------------------------
Summary: S3AInputStream.skip() to use lazy seek
Key: HADOOP-15245
URL: https://issues.apache.org/jira/browse/HADOOP-15245
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.1.0
Reporter: Steve Loughran
The default skip() reads and discards all bytes, no matter how far
ahead the skip is. This is very inefficient if the skip() is being done on S3A
random IO, though exactly what to do in sequential mode is unclear.
Proposed:
* add an optimized version of S3AInputStream.skip() which does a lazy seek,
which itself will decide when to skip() vs issue a new GET.
* add some more instrumentation to measure how often this gets used
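
A rough sketch of the proposal, not the actual S3AInputStream code. The class
LazySeekSketch, the forwardLimit threshold and the two counters are all
hypothetical names made up for illustration, and a ByteArrayInputStream stands
in for the body of an S3 GET. The idea shown: skip() only moves the intended
read position, and the next read() decides whether to drain the open stream or
to reopen at the new offset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for S3AInputStream; not the real Hadoop class. */
final class LazySeekSketch extends InputStream {
    private final byte[] object;      // stands in for the remote S3 object
    private final long forwardLimit;  // assumed threshold: max bytes worth draining
    private InputStream wrapped;      // stands in for the open GET response body
    private long streamPos;           // current position of the wrapped stream
    private long nextReadPos;         // position the caller expects to read next
    long opens;                       // instrumentation: simulated GET requests
    long drainedSeeks;                // instrumentation: seeks served by draining

    LazySeekSketch(byte[] object, long forwardLimit) {
        this.object = object;
        this.forwardLimit = forwardLimit;
    }

    /** Lazy skip: only record the new target position, perform no IO. */
    @Override
    public long skip(long n) {
        if (n <= 0) {
            return 0;
        }
        long target = Math.min(nextReadPos + n, object.length);
        long skipped = target - nextReadPos;
        nextReadPos = target;
        return skipped;
    }

    /** Called before a read: drain a short forward gap, otherwise reopen. */
    private void lazySeek() throws IOException {
        if (wrapped != null && nextReadPos == streamPos) {
            return; // already positioned correctly
        }
        long diff = nextReadPos - streamPos;
        if (wrapped != null && diff > 0 && diff <= forwardLimit) {
            long remaining = diff;
            while (remaining > 0) {
                long s = wrapped.skip(remaining);
                if (s <= 0) {
                    break; // could not drain; fall through to reopen
                }
                remaining -= s;
            }
            if (remaining == 0) {
                streamPos = nextReadPos;
                drainedSeeks++;
                return;
            }
        }
        reopen();
    }

    /** Simulates issuing a new ranged GET starting at nextReadPos. */
    private void reopen() {
        int offset = (int) nextReadPos;
        wrapped = new ByteArrayInputStream(object, offset, object.length - offset);
        streamPos = nextReadPos;
        opens++;
    }

    @Override
    public int read() throws IOException {
        if (nextReadPos >= object.length) {
            return -1; // at or past end of object: no request needed
        }
        lazySeek();
        int b = wrapped.read();
        if (b >= 0) {
            streamPos++;
            nextReadPos++;
        }
        return b;
    }
}
```

In this sketch a short skip followed by a read reuses the open stream, while a
long skip costs one new simulated GET and no discarded bytes. The counters are
the kind of instrumentation that would show how often each path is taken; how
the threshold should differ between random and sequential mode is left open,
as it is in the issue.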