functioner commented on pull request #2727:
URL: https://github.com/apache/hadoop/pull/2727#issuecomment-788083854


   > > @functioner see TestRPC#testClientRpcTimeout
   > > Maybe It is expected that Send ping all the time when 
_ipc.client.rpc-timeout.ms_ less than 0.
   > > @iwasakims Do you have any ideas? Thanks.
   > 
   > @ferhui See TestRPC.java line 1419-1426 (within testClientRpcTimeout)
   > 
   > ```
   >         try {
   >           // should not time out because effective rpc-timeout is
   >           // multiple of ping interval: 1600 (= 800 * (1000 / 800 + 1))
   >           proxy.sleep(null, newSleepRequest(1300));
   >         } catch (ServiceException e) {
   >           LOG.info("got unexpected exception.", e);
   >           fail("RPC should not time out.");
   >         }
   > ```
   > 
   > According to the comment there, I think the correct behavior is to assign 
the effective rpc-timeout with multiple of ping interval.
   > In the current implementation (e.g., the constructor of 
hadoop.ipc.Client$Connection), rpc-timeout is set as a single ping interval, 
rather than multiple of ping interval.
   > Therefore, timeout is the expected behavior.
   
   @iwasakims The current codebase can pass this test, but in the wrong way. 
The current behavior is always swallowing the SocketTimeoutException in line 
548:
   ```
        /* Process timeout exception
          * if the connection is not going to be closed or 
          * the RPC is not timed out yet, send a ping.
          */
         private void handleTimeout(SocketTimeoutException e, int waiting)
             throws IOException {
           if (shouldCloseConnection.get() || !running.get() ||
               (0 < rpcTimeout && rpcTimeout <= waiting)) {            // line 
545
             throw e;
           } else {
             sendPing();                                               // line 
548
           }
         }
   ```
   According to the comment in that test case, it "should not time out because 
effective rpc-timeout is multiple of ping interval: 1600 (= 800 * (1000 / 800 + 
1))", and it doesn't mean that it shouldn't time out.
   Hence, I think the root cause of the bug is the code (line 545) does not 
enforce the semantic of `ipc.client.rpc-timeout.ms` configuration in 
[core-default.xml](https://hadoop.apache.org/docs/r3.2.2/hadoop-project-dist/hadoop-common/core-default.xml).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to