Jagane Sundar created HDFS-4646:
-----------------------------------

             Summary: createNNProxyWithClientProtocol ignores configured 
timeout value
                 Key: HDFS-4646
                 URL: https://issues.apache.org/jira/browse/HDFS-4646
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.0.3-alpha, 3.0.0, 2.0.4-alpha
         Environment: Linux
            Reporter: Jagane Sundar
            Priority: Minor
             Fix For: 3.0.0, 2.0.4-alpha


The Client RPC I/O timeout mechanism appears to be configured by two 
core-site.xml parameters:

1. A boolean ipc.client.ping
2. A numeric value ipc.ping.interval

If ipc.client.ping is true, then we send an RPC ping every ipc.ping.interval 
milliseconds.
If ipc.client.ping is false, then ipc.ping.interval becomes the socket 
timeout value.
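
As a rough illustration of the two cases above, here is a minimal Java sketch. It is not Hadoop's Client class: the class RpcTimeoutSketch, the plain Map used as a stand-in for the configuration, the default values, and the -1 sentinel for "ping enabled, no fixed socket timeout" are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the timeout selection described above.
// This is NOT Hadoop's Client class; it only models the two settings.
class RpcTimeoutSketch {
    static final String PING_KEY = "ipc.client.ping";
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";

    // Assumed defaults for this sketch; check them against your Hadoop version.
    static final boolean ASSUMED_PING_DEFAULT = true;
    static final int ASSUMED_PING_INTERVAL_DEFAULT_MS = 60000;

    // Assumed sentinel: pinging is on, so no fixed socket timeout applies.
    static final int NO_FIXED_TIMEOUT = -1;

    static boolean pingEnabled(Map<String, String> conf) {
        String v = conf.get(PING_KEY);
        return v == null ? ASSUMED_PING_DEFAULT : Boolean.parseBoolean(v);
    }

    static int pingIntervalMs(Map<String, String> conf) {
        String v = conf.get(PING_INTERVAL_KEY);
        return v == null ? ASSUMED_PING_INTERVAL_DEFAULT_MS : Integer.parseInt(v);
    }

    // With ping disabled, the ping interval doubles as the socket timeout.
    static int socketTimeoutMs(Map<String, String> conf) {
        return pingEnabled(conf) ? NO_FIXED_TIMEOUT : pingIntervalMs(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PING_KEY, "false");
        conf.put(PING_INTERVAL_KEY, "5000");
        System.out.println("socket timeout (ms): " + socketTimeoutMs(conf));
    }
}
```

The point of the sketch is only that the timeout is derived from the configuration; a caller that hard-codes 0 bypasses this selection entirely.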

The bug here is that while creating a non-HA proxy, the configured timeout 
value is ignored, and 0 is passed in. 0 is taken to mean 'wait forever', so the 
client RPC socket never times out.

Note that this bug is reproducible only in the case where the NN machine dies, 
i.e. the TCP stack with the NN IP address stops responding completely. The code 
does not take this path when you do a 'kill -9' of the NN process, since there 
is a TCP stack that is alive and sends out a TCP RST to the client, and that 
results in a socket error (not a timeout).
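
The difference between a silent peer and a reset connection can be approximated with plain java.net sockets. The sketch below is not Hadoop code and does not reproduce a dead NN machine; it assumes a local listener that never accepts or writes, so the client's read can only end through its socket timeout. The class name SilentPeerDemo and the 200 ms value are made up for this example. In java.net, a socket timeout of 0 means an infinite timeout, which is why the demo never passes 0.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: a peer that stays silent can only be detected by a timeout.
class SilentPeerDemo {
    // Returns true if the read ended with a timeout, false if it returned data or EOF.
    // Do not call with 0: java.net treats 0 as "wait forever" and the read would hang.
    static boolean readTimesOut(int soTimeoutMs) {
        // The listener never calls accept() and never writes. The OS still completes
        // the TCP handshake from the listen backlog, so the client connects fine.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(soTimeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();      // blocks: nothing is ever sent back
                return false;
            } catch (SocketTimeoutException e) {
                return true;    // the configured timeout fired
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

With a non-zero timeout the blocked read ends with a SocketTimeoutException; with 0 the same read would have nothing to wake it up, which matches the hang described above.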

The fix is to pass in the correct configured value for timeout by calling 
Client.getTimeout(conf) instead of passing in 0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
