[ 
https://issues.apache.org/jira/browse/HADOOP-13404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388747#comment-15388747
 ] 

Peter Shi commented on HADOOP-13404:
------------------------------------

I think there are two possible solutions:

1) Add a ping response to the RPC server and check for it on the client side. 
This needs both client-side and server-side changes, which may cause 
compatibility issues.
2) Add a client-side thread that scans the outstanding calls on each 
connection and fails a call with a timeout exception if it has not received a 
response for a long time. This is a client-side-only solution.
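
To make option 2 concrete, here is a rough sketch of the idea, not the actual 
Hadoop IPC client code. The class and method names (PendingCallScanner, 
register, complete, scan, start) are hypothetical, and it assumes each 
in-flight call can be represented by a CompletableFuture keyed by call id.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the table of in-flight calls on one connection. */
class PendingCallScanner {
    /** One outstanding call: when it was sent and where its result goes. */
    static final class PendingCall {
        final long sentAtMillis;
        final CompletableFuture<byte[]> response = new CompletableFuture<>();

        PendingCall(long sentAtMillis) {
            this.sentAtMillis = sentAtMillis;
        }
    }

    private final Map<Integer, PendingCall> calls = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    PendingCallScanner(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Record a call at send time; the caller waits on the returned future. */
    CompletableFuture<byte[]> register(int callId, long nowMillis) {
        PendingCall call = new PendingCall(nowMillis);
        calls.put(callId, call);
        return call.response;
    }

    /** Invoked when the server's response for callId arrives. */
    void complete(int callId, byte[] payload) {
        PendingCall call = calls.remove(callId);
        if (call != null) {
            call.response.complete(payload);
        }
    }

    /** Fail every call that has waited at least timeoutMillis; returns the count. */
    int scan(long nowMillis) {
        int expired = 0;
        for (Map.Entry<Integer, PendingCall> entry : calls.entrySet()) {
            PendingCall call = entry.getValue();
            boolean overdue = nowMillis - call.sentAtMillis >= timeoutMillis;
            // remove(key, value) guards against racing with complete().
            if (overdue && calls.remove(entry.getKey(), call)) {
                call.response.completeExceptionally(
                    new TimeoutException("no response for call " + entry.getKey()));
                expired++;
            }
        }
        return expired;
    }

    /** Run scan() periodically on a background thread owned by the caller. */
    ScheduledFuture<?> start(ScheduledExecutorService executor, long periodMillis) {
        return executor.scheduleWithFixedDelay(
            () -> scan(System.currentTimeMillis()),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }
}
```

With something like this, a caller blocked on the future gets an 
ExecutionException wrapping the TimeoutException instead of waiting forever, 
and no server-side change is needed. When start() is used, register() should 
be passed System.currentTimeMillis() so both use the same clock.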

> RPC call hangs when server side CPU overloaded
> ----------------------------------------------
>
>                 Key: HADOOP-13404
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13404
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Peter Shi
>
> In our reliability test, we inject a fault on the NameNode that consumes 
> 100% of the CPU. After the fault is injected, all requests on existing 
> connections hang forever instead of timing out. New connections fail over 
> to the other NameNode in an HA deployment.
> There is no timeout mechanism for calls on an established connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

