[ https://issues.apache.org/jira/browse/HADOOP-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-13009.
-------------------------------------
       Resolution: Invalid
    Fix Version/s: 2.8.0

S3A doesn't open the input stream; it never has. There is a getFileStatus() call, which is needed to determine the content length and to fail if the file is missing. That is one HTTP connection, which is then shut down.

> add option for lazy open() on s3a
> ---------------------------------
>
>                 Key: HADOOP-13009
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13009
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>             Fix For: 2.8.0
>
>
> After lazy-seek, I want to add a (very much non-default) lazy-open option.
> If you look at a trace of what goes on with object store access, there's
> usually a GET at offset 0 (the {{open()}} command), followed by a {{seek()}}.
> If there were a lazy option, then {{open()}} would set up the instance
> for reading but not actually talk to the object store; it would be the first
> seek or read that hit the service. You'd eliminate one HTTP operation
> from a read sequence, for a faster startup time, especially long-haul.
> That's a big break from the normal assumption that if a file isn't there,
> {{open()}} fails, so it would only work with apps that do open+read,
> open+seek, or open+positioned-readable actions back to back. By making it an
> option, people can experiment to see what happens, though full testing would
> need some fault injection on the first seek/read to see how code handles
> late failure.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
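
The trade-off described in the issue (one fewer request at open time, at the cost of a late failure for missing files) can be illustrated with a small language-neutral sketch. This is not S3A code: FakeObjectStore, EagerOpenError-free helpers eager_open and lazy_open, and LazyInputStream are all hypothetical names invented for the illustration, and the "store" is just an in-memory dict that counts requests. For simplicity each read() issues one request.

```python
class FakeObjectStore:
    """Hypothetical in-memory stand-in for an object store; counts requests."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.requests = 0

    def head(self, key):
        # One request: existence check plus content length.
        self.requests += 1
        if key not in self.objects:
            raise FileNotFoundError(key)
        return len(self.objects[key])

    def get(self, key, offset=0):
        # One request: fetch the object's bytes starting at offset.
        self.requests += 1
        if key not in self.objects:
            raise FileNotFoundError(key)
        return self.objects[key][offset:]


class LazyInputStream:
    """Stream that does not contact the store until the first read()."""

    def __init__(self, store, key):
        self.store = store
        self.key = key
        self.pos = 0

    def seek(self, offset):
        # Only records the position; no request is issued.
        self.pos = offset

    def read(self, n):
        # First contact with the store; a missing object fails here.
        data = self.store.get(self.key, self.pos)[:n]
        self.pos += len(data)
        return data


def eager_open(store, key):
    """Conventional open: probe the store so a missing file fails now."""
    store.head(key)
    return LazyInputStream(store, key)


def lazy_open(store, key):
    """Lazy open: set up the stream without talking to the store."""
    return LazyInputStream(store, key)
```

With this sketch, eager_open() followed by one read() costs two requests and raises FileNotFoundError at open time for a missing key, while lazy_open() followed by one read() costs one request but only raises on that first read. That late failure is the behavior the issue suggests exercising with fault injection.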