[
https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076817#comment-16076817
]
Steve Loughran commented on HADOOP-14468:
-----------------------------------------
The general code path for file IO is one of two things.
Sequential from the start (gzip unzip, CSV, text, Avro):
{code}
FSDataInputStream instream = fs.open(path);
instream.read();
{code}
Or seek and read, either explicitly or inside a readFully() call. This is the
code path for .snappy files and for examining columnar stored data in ORC, Parquet, ...
{code}
FSDataInputStream instream = fs.open(path);
instream.seek(somewhere);
instream.read(bytes);
instream.seek(somewhereElse);
...
{code}
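To make the two patterns concrete without a Hadoop cluster, here is a minimal sketch that uses plain JDK classes on a local file. The class and method names (ReadPatterns, sequentialRead, seekAndRead) are made up for illustration, and RandomAccessFile only stands in for the seekable stream that fs.open() returns.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

/** Local-file stand-ins for the two read patterns; these are not Hadoop API calls. */
final class ReadPatterns {

    /** Pattern 1: open, then read sequentially from offset 0 until EOF. */
    static long sequentialRead(Path path) throws IOException {
        long total = 0;
        byte[] buffer = new byte[4096];
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n; // count the bytes consumed
            }
        }
        return total;
    }

    /** Pattern 2: open, seek to an offset, then read exactly `length` bytes. */
    static byte[] seekAndRead(Path path, long offset, int length) throws IOException {
        byte[] bytes = new byte[length];
        try (RandomAccessFile in = new RandomAccessFile(path.toFile(), "r")) {
            in.seek(offset);
            in.readFully(bytes); // throws EOFException if fewer bytes remain
        }
        return bytes;
    }
}
```

In both methods the first read follows the open almost immediately, which is the property the rest of this comment relies on.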
Either way, there's usually a read() call very shortly after the open, which is
when any missing file will surface, so we don't need to overreact. We just need
to make sure that the error which surfaces on a 404 on the first open of a file
is propagated up in a way which is meaningful and consistent with what people
normally expect. HADOOP-14467 looks at that.
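As a rough illustration of that point, the sketch below models a stream which only contacts the store on the first read() and reports a missing object as a FileNotFoundException naming the path. LazyObjectStream and the Map standing in for the object store are hypothetical; this is not how S3A is implemented, and the message text is an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

/** Hypothetical lazy stream: the store is only contacted on the first read(). */
final class LazyObjectStream extends InputStream {
    private final Map<String, byte[]> store; // stand-in for the object store
    private final String path;
    private byte[] data;   // null until the first read() fetches the object
    private int position;

    LazyObjectStream(Map<String, byte[]> store, String path) {
        this.store = store;
        this.path = path;
    }

    @Override
    public int read() throws IOException {
        if (data == null) {
            data = store.get(path);
            if (data == null) {
                // The equivalent of a 404: surface it as the exception callers
                // expect for a missing file, and include the path in the message.
                throw new FileNotFoundException("No such file or directory: " + path);
            }
        }
        return position < data.length ? (data[position++] & 0xff) : -1;
    }
}
```

Constructing the stream never fails here; the missing object is only reported by the first read(), and the exception type matches what code written against a local filesystem would already handle.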
Checking both S3 and the MetadataStore as a fallback for troubleshooting/monitoring
is something to consider, though.
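For discussion, here is one way the proposed switch could behave, modelled with plain Java sets instead of the real stores. StatusCheck, its Result enum and the INCONSISTENT outcome are all assumptions made for this sketch; the issue does not say what should happen when the two sources disagree.

```java
import java.util.Set;

/** Hypothetical model of the proposed option; Sets stand in for the two stores. */
final class StatusCheck {
    enum Result { FOUND, NOT_FOUND, INCONSISTENT }

    static Result getFileStatus(Set<String> metadataStore, Set<String> s3,
                                String path, boolean authoritative) {
        boolean inStore = metadataStore.contains(path);
        if (inStore && authoritative) {
            // Short circuit: trust the MetadataStore entry and skip S3.
            return Result.FOUND;
        }
        boolean inS3 = s3.contains(path);
        if (inStore && !inS3) {
            // Only reachable when not authoritative: the MetadataStore lists
            // a file that S3 does not have. Flag it for troubleshooting.
            return Result.INCONSISTENT;
        }
        return inS3 ? Result.FOUND : Result.NOT_FOUND;
    }
}
```

With authoritative set to true, a path known only to the metadata store is reported as FOUND; with false, the same path is reported as INCONSISTENT, which is the kind of signal a monitoring or troubleshooting mode could log.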
> S3Guard: make short-circuit getFileStatus() configurable
> --------------------------------------------------------
>
> Key: HADOOP-14468
> URL: https://issues.apache.org/jira/browse/HADOOP-14468
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Aaron Fabbri
> Priority: Minor
>
> Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a
> result from the MetadataStore (e.g. dynamodb) first.
> I would like to add a new parameter
> {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps
> the current behavior. When false, S3AFileSystem will check both S3 and the
> MetadataStore.
> I'm not sure yet if we want to have this behavior the same for all callers of
> getFileStatus(), or if we only want to check both S3 and MetadataStore for
> some internal callers such as open().