[
https://issues.apache.org/jira/browse/HADOOP-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-13009.
-------------------------------------
Resolution: Invalid
Fix Version/s: 2.8.0
S3A doesn't open the input stream in open(), and never has. There's a
getFileStatus() call, which is needed to determine the content length and to
fail if the file is missing. That is one HTTP connection, which is then shut down.
> add option for lazy open() on s3a
> ---------------------------------
>
> Key: HADOOP-13009
> URL: https://issues.apache.org/jira/browse/HADOOP-13009
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Fix For: 2.8.0
>
>
> After lazy-seek, I want to add a (very much non-default) lazy-open option.
> If you look at a trace of what goes on with object store access, there's
> usually a GET at offset 0 (the {{open()}} command), followed by a {{seek()}}.
> If there were a lazy option, then {{open()}} would set up the instance
> for reading but not actually talk to the object store; it'd be the first
> seek or read which would hit the service. You'd eliminate one HTTP operation
> from a read sequence, for a faster startup time, especially long-haul.
> That's a big break from the normal assumption that if a file isn't there,
> {{open()}} fails, so it'd only work with apps which did open+read, open+seek,
> or open+positioned-read back to back. By making it an option,
> people can experiment to see what happens, though full testing would need to
> do some fault injection on the first seek/read to see how code handled late
> failure.
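
To make the trade-off in the description concrete, here is a minimal sketch of
the lazy-open idea. It is not S3A code: FakeObjectStore and LazyInputStream are
hypothetical names invented for this illustration, the "store" is an in-memory
map that only counts simulated requests, and offsets are plain ints. It shows
open and seek costing no request, and a missing object surfacing only on the
first read.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Small demo driver for the hypothetical classes below. */
class LazyOpenSketch {
    public static void main(String[] args) throws IOException {
        FakeObjectStore store = new FakeObjectStore();
        store.put("data", "hello".getBytes(StandardCharsets.UTF_8));

        // "open" plus seek: nothing has been sent to the store yet.
        LazyInputStream in = new LazyInputStream(store, "data");
        in.seek(1);
        System.out.println("requests after open+seek: " + store.requests);

        // The first read issues the single GET, at the sought offset.
        System.out.println("first byte read: " + (char) in.read());
        System.out.println("requests after read: " + store.requests);

        // A missing object is only detected on the first read (late failure).
        LazyInputStream missing = new LazyInputStream(store, "no-such-key");
        try {
            missing.read();
        } catch (FileNotFoundException e) {
            System.out.println("late failure on first read: " + e.getMessage());
        }
    }
}

/** Hypothetical in-memory stand-in for an object store; counts requests. */
class FakeObjectStore {
    private final Map<String, byte[]> objects = new HashMap<>();
    int requests = 0; // number of simulated HTTP operations

    void put(String key, byte[] data) {
        objects.put(key, data);
    }

    /** Simulated ranged GET: bytes from offset to the end of the object. */
    byte[] get(String key, int offset) throws IOException {
        requests++;
        byte[] data = objects.get(key);
        if (data == null) {
            throw new FileNotFoundException(key);
        }
        return Arrays.copyOfRange(data, Math.min(offset, data.length), data.length);
    }
}

/** Hypothetical lazily opened stream: no store request until the first read. */
class LazyInputStream {
    private final FakeObjectStore store;
    private final String key;
    private int pos = 0;
    private byte[] buffer;   // null until a read triggers the GET
    private int bufferStart; // object offset of buffer[0]

    LazyInputStream(FakeObjectStore store, String key) {
        // Only records what to read; performs no I/O and no existence check.
        this.store = store;
        this.key = key;
    }

    void seek(int newPos) {
        if (newPos < 0) {
            throw new IllegalArgumentException("negative seek: " + newPos);
        }
        // Record the position and drop any buffered data; still no I/O.
        pos = newPos;
        buffer = null;
    }

    int read() throws IOException {
        if (buffer == null) {
            // First read after open or seek: this is where the store is hit,
            // and where a missing object finally fails.
            buffer = store.get(key, pos);
            bufferStart = pos;
        }
        int idx = pos - bufferStart;
        if (idx >= buffer.length) {
            return -1; // end of object
        }
        pos++;
        return buffer[idx] & 0xff;
    }
}
```

The catch block in main is the "late failure" case from the description: code
that assumes a successful open() implies the file exists would need the kind of
fault-injection testing the description mentions.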