[ http://issues.apache.org/jira/browse/HADOOP-312?page=all ]
Devaraj Das updated HADOOP-312:
-------------------------------
Status: Patch Available (was: Open)
Affects Version/s: (was: 0.4.0)
This patch implements the following:
1) Caching of client-server connections is now optional. The default is
no caching.
2) With caching disabled, clients disconnect idle connections to a server
after a configured time. The idle time defaults to 1 second.
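The idle-disconnect behaviour in item 2 can be pictured with a minimal sketch. This is not the code in no_connection_caching.patch; the class IdleConnectionTable, its nested Connection stand-in, and the methods get, closeIdle and size are all hypothetical names chosen for illustration, and the caller passes in the current time so the logic is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch: tracks last use per server and closes idle connections. */
class IdleConnectionTable {
    /** Stand-in for a real socket connection; it only records whether it is open. */
    static class Connection {
        final String server;
        long lastUsedMillis;
        boolean open = true;

        Connection(String server, long now) {
            this.server = server;
            this.lastUsedMillis = now;
        }

        void close() {
            open = false;
        }
    }

    private final Map<String, Connection> connections = new HashMap<>();
    private final long maxIdleMillis;

    IdleConnectionTable(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Returns the open connection to the server, creating one if none exists. */
    synchronized Connection get(String server, long now) {
        Connection c = connections.get(server);
        if (c == null) {
            c = new Connection(server, now);
            connections.put(server, c);
        }
        c.lastUsedMillis = now; // every use resets the idle clock
        return c;
    }

    /** Closes and forgets connections idle for at least maxIdleMillis; returns how many were closed. */
    synchronized int closeIdle(long now) {
        int closed = 0;
        Iterator<Connection> it = connections.values().iterator();
        while (it.hasNext()) {
            Connection c = it.next();
            if (now - c.lastUsedMillis >= maxIdleMillis) {
                c.close();
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    synchronized int size() {
        return connections.size();
    }
}
```

In a sketch like this, a client would call closeIdle periodically (for example from a timer thread) with maxIdleMillis set to the configured idle time, 1000 for the 1 second default mentioned above. The next get for that server then has to set up a fresh connection.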
The performance hit with no caching is that clients are occasionally unable
to establish a connection to a server (if the server is too busy to accept
incoming connections). I have seen this with the TaskTracker -> JobTracker
protocol. When it happens, the JobTracker assumes that the TaskTracker is
lost, and all the tasks that were running on this "lost" TaskTracker are
rerun, which slows down the overall progress of the job. Of course, this also
happens when connections are cached; the difference is that the RPCs time out
instead of the connect failing.
If the above doesn't happen, the performance figures with and without caching
on a 370-node cluster are nearly the same.
> Connections should not be cached
> --------------------------------
>
> Key: HADOOP-312
> URL: http://issues.apache.org/jira/browse/HADOOP-312
> Project: Hadoop
> Issue Type: Improvement
> Components: ipc
> Reporter: Devaraj Das
> Assigned To: Devaraj Das
> Attachments: no_connection_caching.patch
>
>
> Servers and clients (clients include datanodes, tasktrackers, DFSClients &
> tasks) should not cache connections, or should cache them only for very short
> periods of time. Clients should set up & tear down connections to the servers
> every time they need to contact the servers (including for heartbeats). If a
> connection is cached, the existing connection is reused for a few subsequent
> transactions until the connection expires. The heartbeat interval should be
> longer so that many more clients (on the order of tens of thousands) can be
> accommodated within 1 heartbeat interval.