[ https://issues.apache.org/jira/browse/HADOOP-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HADOOP-2864.
--------------------------------------
    Resolution: Fixed

This has changed so much since this JIRA was filed that I'm just going to close this as stale.

> Improve the Scalability and Robustness of IPC
> ---------------------------------------------
>
>                 Key: HADOOP-2864
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2864
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.16.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>         Attachments: RPCScalabilityDesignWeb.pdf
>
>
> This jira is intended to enhance IPC's scalability and robustness.
> Currently an IPC server can easily hang due to a disk failure or garbage
> collection, during which it cannot respond to clients promptly. This has
> caused many dropped calls and delayed responses, so many running
> applications fail on timeout. On the other hand, if busy clients send a lot
> of requests to the server in a short period of time, or too many clients
> communicate with the server simultaneously, the server may be swamped by
> requests and become unresponsive.
> The proposed changes aim to
> # Provide better client/server coordination.
> #* The server should be able to throttle clients during a burst of requests.
> #* A slow client should not prevent the server from serving other clients.
> #* A temporarily hung server should not cause catastrophic failures in
> clients.
> # Client/server should detect remote side failures. Examples of failures
> include: (1) the remote host has crashed; (2) the remote host has crashed and
> then rebooted; (3) the remote process has crashed or been shut down by an
> operator.
> # Fairness. Each client should be able to make progress.



--
This message was sent by Atlassian JIRA
(v6.2#6252)