neils-dev commented on code in PR #3389:
URL: https://github.com/apache/ozone/pull/3389#discussion_r870552755


##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/GrpcOmTransport.java:
##########
@@ -177,17 +181,22 @@ public void start() throws IOException {
   public OMResponse submitRequest(OMRequest payload) throws IOException {
     OMResponse resp = null;
     boolean tryOtherHost = true;
+    int expectedFailoverCount = 0;
     ResultCodes resultCode = ResultCodes.INTERNAL_ERROR;
     while (tryOtherHost) {
       tryOtherHost = false;
+      expectedFailoverCount = syncFailoverCount.get();

Review Comment:
   Thanks @kerneltime for reviewing this PR and for your comments.  The 
`expectedFailoverCount` that is re-synced to the `syncFailoverCount` prior to 
each` submitRequest` (over Grpc from s3g to om) and then checked against the 
`syncFailoverCount `in the `shouldRetry` is to guard against another thread 
also in failover incrementing the proxy index.
   
   This is to ensure `performFailover` (of OmFailoverProxyProviderBase) " _is 
supposed to called only once in a multithreaded enviornment.  This is where the 
failover occurs.  performFailOver updates the currentProxyOmNode.._."   
   Implemented here, the `performFailover` is only applied in failover through 
`shouldRety` iff `expectedFailoverCount` is in sync with the 
`syncFailoverCount` (`expectedFailoverCount` == `syncFailoverCount`).  Should 
the `expectedFailoverCount` in `shouldRetry` method not equal the 
`syncFailoverCount` at the point in failover, it implies that another thread is 
in failover and has incremented the proxy to retry with.  In this event the` 
performFailover` is **_NOT_** called, instead the proxy is updated to the 
current proxy index (which has been incremented by another thread in failover).
   
   Please do check if the behavior of the failover with `performFailover` (in 
`shouldRetry`) through the `expectedFailoverCount` and `syncFailoverCount` is 
as we want it to be under multi-threaded environment.  
    
   ```
             if (syncFailoverCount.get() == expectedFailoverCount) {
               omFailoverProxyProvider.performFailover(null);
               syncFailoverCount.getAndIncrement();
             } else {
               LOG.warn("A failover has occurred since the start of current" +
                   " thread retry, NOT failover using current proxy");
             }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to