Make RPC to have an option to timeout
-------------------------------------
Key: HADOOP-6889
URL: https://issues.apache.org/jira/browse/HADOOP-6889
Project: Hadoop Common
Issue Type: New Feature
Components: ipc
Affects Versions: 0.22.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Fix For: 0.22.0, 0.20-append
Currently, Hadoop RPC does not time out as long as the RPC server is alive. Instead, whenever a socket timeout occurs, the RPC client sends a ping to the server; if the server is still alive, the client continues to wait rather than throwing a SocketTimeoutException. This avoids having clients retry against a busy server, which would make the server even busier. This works well when the RPC server is the NameNode.
But Hadoop RPC is also used for some client-to-DataNode communication, for example, getting a replica's length. When a client comes across a problematic DataNode, it gets stuck and cannot switch to a different DataNode. In this case, it would be better for the client to receive a timeout exception.
I plan to add a new configuration property, ipc.client.max.pings, that specifies the maximum number of pings a client may send. If no response is received after that many pings, a SocketTimeoutException is thrown. If the property is not set, the client keeps the current semantics and waits forever.
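The proposed behavior could be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop patch; the class and method names (PingPolicy, handleTimeout) and the sentinel value for "unlimited" are assumptions made for the example.

```java
import java.net.SocketTimeoutException;

// Hypothetical sketch of the bounded-ping policy described above.
// A negative maxPings preserves the current semantics: ping on every
// socket timeout and wait forever while the server is alive.
public class PingPolicy {
    private final int maxPings; // from ipc.client.max.pings; -1 = unset
    private int pingCount = 0;

    public PingPolicy(int maxPings) {
        this.maxPings = maxPings;
    }

    /**
     * Called when a socket read times out. Returns true if the client
     * should send another ping and keep waiting; throws once the
     * configured maximum number of pings has been exhausted.
     */
    public boolean handleTimeout() throws SocketTimeoutException {
        if (maxPings >= 0 && ++pingCount > maxPings) {
            throw new SocketTimeoutException(
                "no response received after " + maxPings + " pings");
        }
        return true; // ping the server and continue waiting
    }
}
```

With maxPings set, a client talking to a stuck DataNode fails after a bounded number of ping intervals instead of hanging, and can fall back to another replica.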