[ 
http://issues.apache.org/jira/browse/HADOOP-255?page=comments#action_12440279 ] 
            
Owen O'Malley commented on HADOOP-255:
--------------------------------------

I'm going to hijack this bug. Clearly the original context was fixed by moving 
from the rpc getMapOutput to a jetty servlet. However, we are seeing cases 
where the dfs servers have trouble keeping up with the rpc calls. 

Therefore, I propose that we define a fraction of the ipc.timeout as the 
maximum time an rpc call may wait before it is given to a handler.
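
A minimal sketch of how such a limit might look, not taken from the Hadoop 
source. The class and method names, and the fraction value used in the 
example, are hypothetical. It also assumes that a call which has waited past 
the limit is simply dropped, which the proposal above does not spell out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch; none of these names come from Hadoop's ipc package. */
class QueueAgeLimit {

    /** A queued rpc call, stamped with the time the server received it. */
    static final class QueuedCall {
        final int id;
        final long receivedAtMillis;

        QueuedCall(int id, long receivedAtMillis) {
            this.id = id;
            this.receivedAtMillis = receivedAtMillis;
        }
    }

    private final long maxQueueMillis;

    /** The limit is a fraction of the ipc timeout, e.g. 0.5 of 60000 ms. */
    QueueAgeLimit(long ipcTimeoutMillis, double fraction) {
        this.maxQueueMillis = (long) (ipcTimeoutMillis * fraction);
    }

    /** True if the call has waited in the queue longer than the limit. */
    boolean isStale(QueuedCall call, long nowMillis) {
        return nowMillis - call.receivedAtMillis > maxQueueMillis;
    }

    /**
     * Returns the next call that is still within the limit, or null if the
     * queue runs out. Assumption: stale calls are discarded, since the
     * client has most likely timed out and retried already.
     */
    QueuedCall nextFresh(Queue<QueuedCall> queue, long nowMillis) {
        QueuedCall call;
        while ((call = queue.poll()) != null) {
            if (!isStale(call, nowMillis)) {
                return call;
            }
        }
        return null;
    }
}
```

With an ipc timeout of 60000 ms and a fraction of 0.5, a call that has been 
queued for 40000 ms would be skipped and one queued for 20000 ms would still 
be handed to a handler.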

> Client Calls are not cancelled after a call timeout
> ---------------------------------------------------
>
>                 Key: HADOOP-255
>                 URL: http://issues.apache.org/jira/browse/HADOOP-255
>             Project: Hadoop
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.2.1
>         Environment: Tested on Linux 2.6
>            Reporter: Naveen Nalam
>         Assigned To: Owen O'Malley
>
> In ipc/Client.java, if a call times out, a SocketTimeoutException is thrown 
> but the Call object still exists on the queue.
> What I found was that when transferring very large amounts of data, it's 
> common for queued up calls to time out. Yet even though the caller is no 
> longer waiting, the request is still serviced on the server and the data is 
> sent to the client. The client, after receiving the full response, calls 
> callComplete(), which is a no-op since nobody is waiting.
> The problem is that the calls that time out will retry and the system gets 
> into a situation where data is being transferred around, but it's all data 
> for timed-out requests and no progress is ever made.
> My quick solution to this was to add a "boolean timedout" to the Call object 
> which I set to true whenever the queued caller times out. Then, when the 
> client starts to pull over the response data (in Connection::run), it first 
> checks whether the Call is timedout and, if so, immediately closes the 
> connection.
> I think a good fix for this is to queue requests on the client, and do a 
> single sendParam only when there is no outstanding request. This will allow 
> us to close the connection when we receive a response for a request that is 
> no longer pending, reopen the connection, and resend the next queued 
> request. I can provide a patch for this, but I've seen a lot of recent 
> activity in this area so I'd like to get some feedback first.
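
The "boolean timedout" workaround quoted above could be sketched roughly as 
follows. This is an illustration under assumptions, not the reporter's patch 
or the real ipc/Client.java: the class, the pending map, and the 
connectionOpen flag (a stand-in for closing the real socket) are all 
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the client-side check; not the real ipc.Client. */
class TimedOutCalls {

    /** A pending call; timedout is set once its caller gives up waiting. */
    static final class Call {
        final int id;
        boolean timedout;

        Call(int id) {
            this.id = id;
        }
    }

    private final Map<Integer, Call> pending = new HashMap<>();
    private boolean connectionOpen = true; // stand-in for the real socket

    /** Records a call that has been sent and is awaiting a response. */
    synchronized Call register(int id) {
        Call call = new Call(id);
        pending.put(id, call);
        return call;
    }

    /** Invoked when the waiting caller hits its timeout. */
    synchronized void markTimedOut(Call call) {
        call.timedout = true;
    }

    /**
     * Invoked when a response for callId starts to arrive, before its body
     * is read. Returns true if the body should be read. If nobody is
     * waiting for it, the connection is closed instead of transferring
     * data that would be thrown away.
     */
    synchronized boolean onResponseStart(int callId) {
        Call call = pending.remove(callId);
        if (call == null || call.timedout) {
            connectionOpen = false;
            return false;
        }
        return true;
    }

    synchronized boolean isConnectionOpen() {
        return connectionOpen;
    }
}
```

The check runs before the response body is read, so a timed-out call costs a 
reconnect rather than a full transfer of data nobody will use.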

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
