[
https://issues.apache.org/jira/browse/HADOOP-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911095#comment-17911095
]
Steve Loughran commented on HADOOP-19042:
-----------------------------------------
* what are you seeing?
* did you have openssl on?
* what was your deployment? ec2/elsewhere
* aws s3 or someone else?
* can you share a stack trace if you have one?
No work has been done on this. I think the workaround is going to be "identify
what is wrong with the network".
The error translation here is a bit crude: we do actually look for error
strings from openssl, for example. I don't see anything related to this (or
socket timeouts) going in, but if you can help kick off the PR and test against
your system, that'd be good. Network failures are always hard to replicate as
they depend on networks having problems *and* people noticing. When the
recovery logic works we do recover, but things take longer, so the bug surfaces
as a performance issue rather than a failure.
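To make the kind of error translation discussed here concrete, below is a minimal sketch, not the actual S3A code. It assumes the failure shows up as an SSLException or SocketException whose message contains "connection reset", possibly wrapped in other exceptions. The names ConnectionResetTranslation, ChannelEOFException, isConnectionReset and translate are hypothetical; ChannelEOFException is only a local stand-in for the retryable EOF-style exception mentioned in the issue.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.SocketException;
import java.util.Locale;
import javax.net.ssl.SSLException;

public class ConnectionResetTranslation {

  /** Hypothetical stand-in for a retryable "channel closed" exception. */
  static class ChannelEOFException extends EOFException {
    ChannelEOFException(String message, Throwable cause) {
      super(message);
      initCause(cause);
    }
  }

  /**
   * Walk the cause chain (bounded, in case of cycles) looking for an SSL or
   * socket exception whose message mentions a connection reset.
   * Matching on message text is an assumption, and as crude as it looks.
   */
  static boolean isConnectionReset(Throwable t) {
    int depth = 0;
    for (Throwable c = t; c != null && depth < 16; c = c.getCause(), depth++) {
      boolean networkType =
          c instanceof SSLException || c instanceof SocketException;
      String msg = c.getMessage();
      if (networkType && msg != null
          && msg.toLowerCase(Locale.ROOT).contains("connection reset")) {
        return true;
      }
    }
    return false;
  }

  /**
   * Map a connection reset to the EOF-style exception so that a caller's
   * retry logic can abort the stream and reopen it; pass anything else through.
   */
  static IOException translate(String operation, IOException e) {
    if (isConnectionReset(e)) {
      return new ChannelEOFException(operation + ": " + e.getMessage(), e);
    }
    return e;
  }

  public static void main(String[] args) {
    IOException wrapped =
        new IOException("read failed", new SSLException("Connection reset"));
    // prints "ChannelEOFException"
    System.out.println(translate("read", wrapped).getClass().getSimpleName());
    // prints "IOException"
    System.out.println(
        translate("read", new IOException("disk full")).getClass().getSimpleName());
  }
}
```

Whether string matching like this is good enough is exactly what would need testing against a system that actually sees the resets.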
> S3A: detect and recover from SSL ConnectionReset exceptions
> -----------------------------------------------------------
>
> Key: HADOOP-19042
> URL: https://issues.apache.org/jira/browse/HADOOP-19042
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0, 3.3.6
> Reporter: Steve Loughran
> Priority: Major
>
> s3a input stream doesn't recover from SSL exceptions, specifically
> ConnectionReset
> This is a variant of HADOOP-19027, except it's surfaced on an older release...
> # need to make sure the specific exception is handled by aborting the stream and
> retrying, so map it to the new HttpChannelEOFException
> # all of this needs to be backported