neils-dev opened a new pull request #2901: URL: https://github.com/apache/ozone/pull/2901
…oxyProvider with GrpcOmTransport invocationHandler.

## What changes were proposed in this pull request?

HA support for the S3 gateway gRPC transport between the `S3 gateway` and the `Ozone Manager`. This follows the approach of the Ozone clients, which provide HA support through the `OMFailoverProxyProvider`. For gRPC, the `OMFailoverProxyProvider` is extended by the gRPC-specific `GrpcFailoverProxyProvider`, which overrides the configuration methods but reuses all of the retry logic, including the `retryPolicy` and its actions. With this, all Ozone clients that support failover use the same underlying failover proxy.

The gRPC implementation invokes the retryPolicy whenever processing an s3 request across the gRPC channel raises an exception. Every exception from the channel is handled in `submitRequest`, which acts as the retry invocation handler and:

i. deserializes the exception generated through the channel
ii. takes action based on the retryAction returned by the retryPolicy of the `OMFailoverProxyProvider`
iii. either retries through the proxy determined by the `OMFailoverProxyProvider` or fails the request, based on the retryPolicy and the exception type

Typical s3 request processing involves the GrpcOmTransport:

- submitRequest, `org.apache.hadoop.ozone.om.protocolPB.GrpcOmTransport.submitRequest`
- on exception, deserialize the exception, `org.apache.hadoop.ozone.om.protocolPB.GrpcOmTransport.unwrapException`
- get retryAction, `org.apache.hadoop.ozone.om.protocolPB.shouldRetry`
  - `OMFailoverProxyProvider retryPolicy.shouldRetry`
- take action: fail, or retry with the proxy set by `OMFailoverProxyProvider`

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5544

## How was this patch tested?

Patch was tested through new unit tests for gRPC failover and manually with the ozonesecure-ha docker cluster.

1. Unit tests: 2 unit tests in TestS3GrpcOmTransport.java

   i. testGrpcFailoverProxy - client submits an OmRequest to the Ozone Manager server over gRPC; the server initially injects a fault by throwing NotALeaderException; the client retries through failover and passes
   ii. testGrpcFailoverProxyExhaustRetry - client submits an OmRequest to the server, the server throws NotALeaderException, and the client fails once its retries are exhausted

   `hadoop-ozone/common$ mvn -Dtest=TestS3GrpcOmTransport test`

   ```
   Running org.apache.hadoop.ozone.om.protocolPB.TestS3GrpcOmTransport
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.417 s - in org.apache.hadoop.ozone.om.protocolPB.TestS3GrpcOmTransport
   ```

2. Manual tests

   Start the cluster:
   `hadoop-ozone/dist/target/ozone-1.2.0-SNAPSHOT/compose/ozonesecure-ha$ docker-compose up -d --scale datanode=3`

   Send aws-cli bucket requests with aws credentials (in profile ozone), e.g.
   `aws s3api --profile ozone --endpoint http://localhost:9878 create-bucket --bucket bucket1`

   Stop the leader om:
   `hadoop-ozone/dist/target/ozone-1.2.0-SNAPSHOT/compose/ozonesecure-ha$ docker-compose stop om1`

   Force the s3g gRPC transport into failover: send an aws-cli bucket request and confirm that it succeeds with the two remaining HA Ozone Managers:
   `aws s3api --profile ozone --endpoint http://localhost:9878 create-bucket --bucket bucket2`

   ```
   { "Location": "http://localhost:9878/bucket2" }
   ```

   ozonesecure-ha acceptance tests:
   `hadoop-ozone/dist/target/ozone-1.2.0-SNAPSHOT/compose/ozonesecure-ha$ test.sh`
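
The retry flow described under "What changes were proposed in this pull request?" can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this patch: `GrpcFailoverSketch`, `Channel`, `FailoverProvider`, `NotLeaderException` and the two-valued `RetryDecision` are hypothetical stand-ins for the gRPC channel, the `OMFailoverProxyProvider`, the deserialized NotALeaderException and the retryAction. The round-robin failover and the failover-count limit are likewise assumptions made for the example.

```java
import java.util.List;

/** Illustrative sketch only; none of these types are the Ozone or Hadoop classes. */
public class GrpcFailoverSketch {

  /** Hypothetical stand-in for the action returned by the retry policy. */
  enum RetryDecision { FAIL, FAILOVER_AND_RETRY }

  /** Hypothetical stand-in for the exception reported by a non-leader OM. */
  static class NotLeaderException extends RuntimeException {
    NotLeaderException(String node) {
      super(node + " is not the leader");
    }
  }

  /** Hypothetical stand-in for the channel to one OM node. */
  interface Channel {
    String send(String node, String request);
  }

  /** Hypothetical stand-in for the failover proxy provider and its retry policy. */
  static class FailoverProvider {
    private final List<String> nodes;
    private final int maxFailovers;
    private int current = 0;

    FailoverProvider(List<String> nodes, int maxFailovers) {
      this.nodes = nodes;
      this.maxFailovers = maxFailovers;
    }

    String currentNode() {
      return nodes.get(current);
    }

    /** Assumption: move to the next node in round-robin order. */
    void performFailover() {
      current = (current + 1) % nodes.size();
    }

    /** Fail on unknown exceptions or when the failover budget is used up. */
    RetryDecision shouldRetry(Exception e, int failovers) {
      if (!(e instanceof NotLeaderException) || failovers >= maxFailovers) {
        return RetryDecision.FAIL;
      }
      return RetryDecision.FAILOVER_AND_RETRY;
    }
  }

  /** The submit path acting as the retry handler: send, ask the policy, fail over or fail. */
  static String submitRequest(FailoverProvider provider, Channel channel, String request) {
    int failovers = 0;
    while (true) {
      try {
        return channel.send(provider.currentNode(), request);
      } catch (RuntimeException e) {
        if (provider.shouldRetry(e, failovers) == RetryDecision.FAIL) {
          throw e;
        }
        provider.performFailover();
        failovers++;
      }
    }
  }

  public static void main(String[] args) {
    // Only om3 accepts requests; om1 and om2 report that they are not the leader.
    Channel channel = (node, request) -> {
      if (!node.equals("om3")) {
        throw new NotLeaderException(node);
      }
      return request + " handled by " + node;
    };
    FailoverProvider provider = new FailoverProvider(List.of("om1", "om2", "om3"), 5);
    System.out.println(submitRequest(provider, channel, "CreateBucket bucket2"));
  }
}
```

With a failover budget of 5 the request fails over twice and is answered by `om3`; with a budget of 1 the same call rethrows the `NotLeaderException`, which mirrors the two cases exercised by testGrpcFailoverProxy and testGrpcFailoverProxyExhaustRetry.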
