[ 
https://issues.apache.org/jira/browse/HADOOP-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-13009.
-------------------------------------
       Resolution: Invalid
    Fix Version/s: 2.8.0

S3A doesn't open the input stream in open(); it never has. There is a
getFileStatus() call, which is needed to determine the content length and to
fail if the file is missing. That is a single HTTP connection, which is then
shut down.

> add option for lazy open() on s3a
> ---------------------------------
>
>                 Key: HADOOP-13009
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13009
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>             Fix For: 2.8.0
>
>
> After lazy-seek, I want to add a (very much non-default) lazy-open option.
> If you look at a trace of what goes on with object store access, there's
> usually a GET at offset 0 (the {{open()}} command), followed by a {{seek()}}.
> If there was a lazy-open option, then {{open()}} would set up the instance
> for reading, but not actually talk to the object store; it'd be the first
> seek or read which would hit the service. You'd eliminate one HTTP operation
> from a read sequence, for a faster startup time, especially over long-haul
> links.
> That's a big break in the normal assumption that if a file isn't there,
> {{open()}} fails, so it'd only work with apps which did open+read, open+seek,
> or open+positioned-read back to back. By making it an option, people can
> experiment to see what happens, though full testing would need to do some
> fault injection on the first seek/read to see how code handled late
> failure.
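
The trade-off described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: FakeStore, Stream and the "lazy" flag are hypothetical names invented for the example, not Hadoop or S3A APIs, and the in-memory store merely counts requests in place of real HTTP calls. It shows an eager open() paying for one metadata request and failing fast on a missing object, while a lazy open() issues no request and defers any failure to the first read.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are Hadoop or S3A APIs. */
public class LazyOpenSketch {

    /** In-memory stand-in for an object store that counts requests. */
    static class FakeStore {
        final Map<String, byte[]> objects = new HashMap<>();
        int requests = 0;

        /** One metadata request: returns the length or fails if missing. */
        long length(String key) throws IOException {
            requests++;
            byte[] data = objects.get(key);
            if (data == null) {
                throw new FileNotFoundException(key);
            }
            return data.length;
        }

        /** One data request: returns the byte at pos, or -1 past the end. */
        int byteAt(String key, long pos) throws IOException {
            requests++;
            byte[] data = objects.get(key);
            if (data == null) {
                throw new FileNotFoundException(key);
            }
            return pos < data.length ? (data[(int) pos] & 0xff) : -1;
        }
    }

    /** Minimal stream: seek() is lazy, only read() talks to the store. */
    static class Stream {
        private final FakeStore store;
        private final String key;
        private long pos = 0;

        Stream(FakeStore store, String key) {
            this.store = store;
            this.key = key;
        }

        void seek(long newPos) {
            pos = newPos; // no I/O here
        }

        int read() throws IOException {
            int b = store.byteAt(key, pos);
            if (b >= 0) {
                pos++;
            }
            return b;
        }
    }

    /** Eager open checks existence up front; lazy open does no I/O at all. */
    static Stream open(FakeStore store, String key, boolean lazy)
            throws IOException {
        if (!lazy) {
            store.length(key); // fails here if the object is missing
        }
        return new Stream(store, key);
    }

    public static void main(String[] args) throws IOException {
        FakeStore store = new FakeStore();
        store.objects.put("data", new byte[] {10, 20, 30});

        Stream eager = open(store, "data", false);
        eager.seek(1);
        eager.read();
        System.out.println("eager requests: " + store.requests);

        store.requests = 0;
        Stream lazy = open(store, "data", true);
        lazy.seek(1);
        lazy.read();
        System.out.println("lazy requests: " + store.requests);

        try {
            open(store, "missing", true).read(); // open succeeds, read fails
        } catch (FileNotFoundException e) {
            System.out.println("late failure on first read: " + e.getMessage());
        }
    }
}
```

The last case is the "late failure" the description warns about: callers that expect open() to reject a missing file would only see the error on their first read or seek.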



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
