Jagane Sundar created HDFS-4646:
-----------------------------------
Summary: createNNProxyWithClientProtocol ignores configured
timeout value
Key: HDFS-4646
URL: https://issues.apache.org/jira/browse/HDFS-4646
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.0.3-alpha, 3.0.0, 2.0.4-alpha
Environment: Linux
Reporter: Jagane Sundar
Priority: Minor
Fix For: 3.0.0, 2.0.4-alpha
The client RPC I/O timeout mechanism appears to be configured by two
core-site.xml parameters:
1. A boolean ipc.client.ping
2. A numeric value ipc.ping.interval
If ipc.client.ping is true, then we send an RPC ping every ipc.ping.interval
milliseconds.
If ipc.client.ping is false, then ipc.ping.interval becomes the socket
timeout value.
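
A minimal sketch of the rule described above, written against a plain Map
rather than Hadoop's Configuration class so that it runs standalone. The class
name, the method name and the default values are assumptions made for
illustration; this is not the code in org.apache.hadoop.ipc.Client.

```java
import java.util.Map;
import java.util.OptionalInt;

class RpcTimeoutSketch {
    // Assumed defaults, for illustration only.
    static final boolean DEFAULT_PING = true;
    static final int DEFAULT_PING_INTERVAL_MS = 60_000;

    /**
     * Returns the socket timeout in milliseconds when pings are disabled,
     * or an empty value when pings are enabled (the interval then only
     * controls how often a ping is sent).
     */
    static OptionalInt socketTimeoutMs(Map<String, String> conf) {
        boolean ping = Boolean.parseBoolean(
            conf.getOrDefault("ipc.client.ping", String.valueOf(DEFAULT_PING)));
        int intervalMs = Integer.parseInt(
            conf.getOrDefault("ipc.ping.interval",
                String.valueOf(DEFAULT_PING_INTERVAL_MS)));
        return ping ? OptionalInt.empty() : OptionalInt.of(intervalMs);
    }

    public static void main(String[] args) {
        // Pings off: the interval is used as the socket timeout.
        System.out.println(socketTimeoutMs(
            Map.of("ipc.client.ping", "false", "ipc.ping.interval", "5000")));
        // Pings on: no fixed socket timeout is derived from the interval.
        System.out.println(socketTimeoutMs(Map.of("ipc.client.ping", "true")));
    }
}
```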
The bug here is that while creating a non-HA proxy, the configured timeout
value is ignored and 0 is passed in. 0 is taken to mean 'wait forever', so the
client RPC socket never times out.
Note that this bug is reproducible only in the case where the NN machine dies,
i.e. the TCP stack with the NN IP address stops responding completely. The code
does not take this path when you do a 'kill -9' of the NN process, since there
is a TCP stack that is alive and sends out a TCP RST to the client, and that
results in a socket error (not a timeout).
The fix is to pass in the correct configured value for timeout by calling
Client.getTimeout(conf) instead of passing in 0.
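
To illustrate why the value matters, here is a rough standalone sketch using
plain java.net sockets, not the Hadoop RPC client. A loopback listener that
never sends data stands in for a peer that has gone silent; this only
approximates a dead machine. The class and method names are hypothetical. With
a non-zero read timeout the blocked read fails with SocketTimeoutException;
with setSoTimeout(0) the same read would block indefinitely, so do not call
the method with 0.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class SilentPeerSketch {
    /**
     * Connects to a listener that never sends anything and reports whether
     * the read gave up with a timeout. timeoutMs must be greater than 0.
     */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The listener never calls accept() and never writes; the connection
        // is expected to complete through the listen backlog.
        try (ServerSocket silent =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(timeoutMs); // 0 would mean "block forever"
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived, no timeout
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned after timeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```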