[ https://issues.apache.org/jira/browse/HADOOP-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357390#comment-16357390 ]
Steve Loughran commented on HADOOP-15216:
-----------------------------------------
HADOOP-13761 covers the condition where S3Guard finds the file in its
getFileStatus in the {{FileSystem.open()}} call, but when S3AInputStream
initiates the GET a 404 comes back: that FNFE should be handled with backoff too
* Maybe: special handling for that first attempt, as an FNFE on later attempts
probably means someone deleted the file
* The situation of HEAD -> 200, GET -> 404 could also arise if the GET went to
a different shard from the HEAD. So the condition could occasionally surface on
non-S3Guarded buckets too
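A minimal sketch of the backoff idea above (hypothetical names, not the actual S3A/invoker code): retry an operation that may see a transient 404 shortly after a successful HEAD, backing off exponentially, and give up after a few attempts since a persistent 404 probably means the file was deleted.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

public class TransientNotFoundRetry {

  /**
   * Invoke {@code op}, retrying with exponential backoff when it raises
   * FileNotFoundException. Early FNFEs are treated as possibly transient
   * (eventual consistency / a different shard answering the GET); once the
   * retry budget is exhausted the FNFE is rethrown to the caller.
   */
  public static <T> T retryOnFnfe(Callable<T> op, int maxAttempts, long baseDelayMs)
      throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      } catch (FileNotFoundException e) {
        if (attempt >= maxAttempts) {
          throw e;                          // give up: probably a real delete
        }
        Thread.sleep(baseDelayMs << (attempt - 1));  // exponential backoff
      }
    }
  }
}
```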
> S3AInputStream to handle reconnect on read() failure better
> -----------------------------------------------------------
>
> Key: HADOOP-15216
> URL: https://issues.apache.org/jira/browse/HADOOP-15216
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
> Priority: Major
>
> {{S3AInputStream}} handles any IOE through a close() of the stream and a single
> re-invocation of the read, with
> * no backoff
> * no abort of the HTTPS connection, which is just returned to the pool. If
> httpclient hasn't noticed the failure, the broken connection may be handed back
> to the caller on the next read
> Proposed
> * switch to invoker
> * retry policy explicitly for stream (EOF => throw, timeout => close, sleep,
> retry, etc)
> We could think about extending the fault injection to inject stream read
> failures intermittently too, though it would need something in S3AInputStream
> to (optionally) wrap the http input streams with the failing stream.
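The proposed per-exception retry policy for stream reads could be sketched roughly like this (hypothetical names and structure, not the S3A invoker API): EOF is rethrown at once, other IOEs trigger a close/abort, a sleep, and a retry while budget remains.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.SocketTimeoutException;

public class StreamReadRetryPolicy {

  public enum Action { FAIL, RETRY_AFTER_REOPEN }

  /**
   * Decide how to handle a failed read. EOFException is propagated
   * immediately (EOF => throw); timeouts and other IOEs are retried after
   * closing/aborting the connection and sleeping, until the retry budget
   * is exhausted.
   */
  public static Action onReadFailure(IOException e, int retriesSoFar, int maxRetries) {
    if (e instanceof EOFException) {
      return Action.FAIL;                 // EOF => throw to the caller
    }
    if (retriesSoFar >= maxRetries) {
      return Action.FAIL;                 // retry budget exhausted
    }
    if (e instanceof SocketTimeoutException) {
      return Action.RETRY_AFTER_REOPEN;   // timeout => close, sleep, retry
    }
    // Other IOEs: abort the connection rather than returning it to the
    // pool, then retry with a fresh one.
    return Action.RETRY_AFTER_REOPEN;
  }
}
```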