aswinshakil commented on PR #3977: URL: https://github.com/apache/ozone/pull/3977#issuecomment-1322487455
@sodonnel thanks for looking into this. gRPC returns one of 17 status codes. `UNAVAILABLE` is a broad code that covers at least two cases:

1. The gRPC server is down.
2. Some data was transferred before the connection broke.

We want to retry in the second case but not in the first. Since gRPC only reports the generic `UNAVAILABLE` in both cases, the gRPC internal retry policy cannot distinguish between them. To answer your question: connection refused falls under the `UNAVAILABLE` status code, so we would retry it.

If we are okay with not retrying in the second case, we can keep just `DEADLINE_EXCEEDED` in the retry policy. According to the [JIRA comment](https://issues.apache.org/jira/browse/HDDS-7187?focusedCommentId=17598216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17598216), `DEADLINE_EXCEEDED` is the only error they encountered in their testing.
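
For reference, here is a minimal sketch of how the two options differ in a gRPC service config. It is not the code from this PR: the class name, the service name `example.HypotheticalService`, and the backoff and attempt values are all assumptions made for illustration. Only the `retryableStatusCodes` list changes between the two variants. The sketch only builds the config map with the standard library; in grpc-java a map like this would typically be passed to `ManagedChannelBuilder.defaultServiceConfig(...)` together with `enableRetry()`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RetryConfigSketch {

  // Hypothetical service name, used only for illustration.
  static final String SERVICE = "example.HypotheticalService";

  /** Builds a service config map whose retry policy covers the given status codes. */
  public static Map<String, Object> buildServiceConfig(List<String> retryableCodes) {
    Map<String, Object> retryPolicy = new HashMap<>();
    // Numeric values are stored as Doubles, which is what grpc-java
    // expects in a service config map. The values themselves are assumptions.
    retryPolicy.put("maxAttempts", 3.0);
    retryPolicy.put("initialBackoff", "0.5s");
    retryPolicy.put("maxBackoff", "5s");
    retryPolicy.put("backoffMultiplier", 2.0);
    retryPolicy.put("retryableStatusCodes", retryableCodes);

    // Apply the policy to every method of the (hypothetical) service.
    Map<String, Object> name = new HashMap<>();
    name.put("service", SERVICE);

    Map<String, Object> methodConfig = new HashMap<>();
    methodConfig.put("name", Collections.singletonList(name));
    methodConfig.put("retryPolicy", retryPolicy);

    Map<String, Object> serviceConfig = new HashMap<>();
    serviceConfig.put("methodConfig", Collections.singletonList(methodConfig));
    return serviceConfig;
  }

  public static void main(String[] args) {
    // Option A: retry only on DEADLINE_EXCEEDED.
    System.out.println(buildServiceConfig(Arrays.asList("DEADLINE_EXCEEDED")));
    // Option B: also retry on UNAVAILABLE, which includes connection refused.
    System.out.println(buildServiceConfig(Arrays.asList("UNAVAILABLE", "DEADLINE_EXCEEDED")));
  }
}
```

With option B, a server that is down and a connection that broke mid-transfer are both retried, because the policy matches on the status code alone.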
