[jira] [Commented] (HDFS-10781) libhdfs++: redefine NN timeout to be "time without a response"

Anatoli Shein (JIRA) Tue, 23 Aug 2016 09:30:53 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433143#comment-15433143
 ]


Anatoli Shein commented on HDFS-10781:
--------------------------------------

The current hack we are using in the code is:

{code}
// TODO: HDFS-9539 - until then we increase the time-out to allow all
// recursive async calls to finish.
options.rpc_timeout = std::numeric_limits<int>::max();  // needs <limits>
{code}

This is done in all of the recursive examples and tools.

> libhdfs++: redefine NN timeout to be "time without a response"
> --------------------------------------------------------------
>
>                 Key: HDFS-10781
>                 URL: https://issues.apache.org/jira/browse/HDFS-10781
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Bob Hansen
>
> In the find tool, we submit a zillion requests to the NameNode 
> asynchronously.  As the queue on the NameNode grows, the response time for 
> each individual message will increase.  In the find tool, we were eventually 
> getting timeouts on requests, even though the NN was responding as fast as 
> its little feet could carry it.
> I propose that we should redefine timeouts to be on a per-connection basis 
> rather than per-request.  If a client has an outstanding request to the NN 
> but hasn't gotten a response back within n msec, it should declare the 
> connection dead and retry.  As long as the NameNode is being responsive to 
> the best of its ability and providing data, we will not declare the link dead.
> One potential violation of the Principle of Least Astonishment here is that 
> any particular request from a client can no longer be depended on to get a 
> positive or negative response within a fixed amount of time, but I think 
> that may be a good trade to make.
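
A minimal sketch of the per-connection rule proposed in the description above, to make the "time without a response" semantics concrete. This is not libhdfs++ code: the class name ConnectionIdleTimer and its methods are hypothetical, and the caller is assumed to pass in the current time so the logic can be exercised without real sleeps. The idea is that any response from the NameNode restarts the clock, so a long queue of outstanding requests does not by itself trip the timeout.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of libhdfs++): tracks "time without a
// response" for one connection instead of a deadline per request.
class ConnectionIdleTimer {
public:
  using Clock = std::chrono::steady_clock;

  explicit ConnectionIdleTimer(std::chrono::milliseconds limit)
      : limit_(limit) {}

  // A request was sent. The idle clock starts only if nothing was
  // outstanding before; otherwise the existing clock keeps running.
  void OnRequestSent(Clock::time_point now) {
    if (outstanding_ == 0) last_progress_ = now;
    ++outstanding_;
  }

  // Any response shows the server is still working through its queue,
  // so the idle clock restarts for the whole connection.
  void OnResponseReceived(Clock::time_point now) {
    if (outstanding_ > 0) --outstanding_;
    last_progress_ = now;
  }

  // The connection is considered dead only if requests are outstanding
  // and no response at all has arrived for at least limit_.
  bool IsDead(Clock::time_point now) const {
    return outstanding_ > 0 && (now - last_progress_) >= limit_;
  }

  std::size_t outstanding() const { return outstanding_; }

private:
  std::chrono::milliseconds limit_;
  Clock::time_point last_progress_{};
  std::size_t outstanding_ = 0;
};
```

With this rule, a request queued behind many others may wait far longer than the limit, as long as responses keep arriving on the connection. That is the trade-off the description mentions: no single request has a fixed upper bound on its completion time.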



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

