[
https://issues.apache.org/jira/browse/HADOOP-18927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773628#comment-17773628
]
Steve Loughran commented on HADOOP-18927:
-----------------------------------------
* treat NoRouteToHostException and BindException as unrecoverable
* PortUnreachableException is UDP-related and we shouldn't see it
* other SocketExceptions => connectivity failures; hope the operation is idempotent
Maybe we need to
# review that connectivityFailure retry policy
# have a variant, connectivityFailureNonIdempotent, for socket exceptions which
may have happened partway through an operation
# review that entire exception retry logic and see what is wrong with it
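The classification above can be sketched as a small helper. This is only an illustration of the proposed policy, not the actual Hadoop S3A code: `SocketFailureClassifier`, `FailureKind`, and `classify` are hypothetical names. One detail the sketch makes concrete: NoRouteToHostException, BindException, and PortUnreachableException all extend SocketException, so the specific checks must run before the generic one.

```java
import java.net.BindException;
import java.net.NoRouteToHostException;
import java.net.PortUnreachableException;
import java.net.SocketException;

// Hypothetical sketch of the classification proposed in the comment;
// names are illustrative, not part of hadoop-aws.
public class SocketFailureClassifier {

    enum FailureKind { UNRECOVERABLE, UNEXPECTED, CONNECTIVITY, OTHER }

    static FailureKind classify(Throwable t) {
        // Walk the cause chain: the AWS SDK usually wraps the socket
        // error in an SdkClientException, so the root cause matters.
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof NoRouteToHostException
                    || c instanceof BindException) {
                return FailureKind.UNRECOVERABLE;   // fail fast, no retry
            }
            if (c instanceof PortUnreachableException) {
                return FailureKind.UNEXPECTED;      // UDP-only; shouldn't appear
            }
            if (c instanceof SocketException) {
                return FailureKind.CONNECTIVITY;    // retry if idempotent
            }
        }
        return FailureKind.OTHER;
    }
}
```

With this ordering, a wrapped "Connection reset by peer" (a plain SocketException) classifies as CONNECTIVITY, while a NoRouteToHostException buried in the cause chain still classifies as UNRECOVERABLE.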
> S3ARetryHandler to treat SocketExceptions as connectivity failures
> ------------------------------------------------------------------
>
> Key: HADOOP-18927
> URL: https://issues.apache.org/jira/browse/HADOOP-18927
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.6
> Reporter: Steve Loughran
> Priority: Major
>
> I've got a v1 SDK stack trace where a TCP connection reset is breaking a
> large upload; that should be recoverable with retries.
> {code}
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Connection
> reset by peer: Unable to execute HTTP request: Connection reset by peer at...
> {code}
> proposed:
> * S3ARetryPolicy to map SocketException to connectivity failure
> * See if we can create a test for this, ideally under the AWS SDK.
> I'm now unsure how well we handle these IO problems: a quick
> experiment with the 3.3.5 release shows that the retry policy retries on
> whatever exception chain has an unknown host for the endpoint.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)