[ https://issues.apache.org/jira/browse/HDFS-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433143#comment-15433143 ]
Anatoli Shein commented on HDFS-10781:
--------------------------------------
The current hack we are using in the code is:
{code}
// TODO: HDFS-9539 - until then we increase the timeout to allow all recursive
// async calls to finish
options.rpc_timeout = std::numeric_limits<int>::max();
{code}
This is done in all examples and tools that are recursive.
> libhdfs++: redefine NN timeout to be "time without a response"
> --------------------------------------------------------------
>
> Key: HDFS-10781
> URL: https://issues.apache.org/jira/browse/HDFS-10781
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Bob Hansen
>
> In the find tool, we submit a zillion requests to the NameNode
> asynchronously. As the queue on the NameNode grows, the response time for
> each individual message will increase. In the find tool, we were eventually
> getting timeouts on requests, even though the NN was responding as fast as
> its little feet could carry it.
> I propose that we should redefine timeouts to be on a per-connection basis
> rather than per-request. If a client has an outstanding request to the NN
> but hasn't gotten a response back within n msec, it should declare the
> connection dead and retry. As long as the NameNode is being responsive to
> the best of its ability and providing data, we will not declare the link dead.
> One potential for Failure of Least Astonishment here is that it will mean any
> particular request from a client cannot be depended on to get a positive or
> negative response within a fixed amount of time, but I think that may be a
> good trade to make.
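
To make the proposed semantics concrete, here is a minimal sketch of a per-connection "time without a response" check. It is not libhdfs++ code: the class ConnectionIdleTimer and its method names are hypothetical, and the caller is assumed to pass in the current time so the logic stays independent of any event loop.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of libhdfs++: tracks "time without a
// response" for one NameNode connection instead of timing each request.
class ConnectionIdleTimer {
 public:
  using Clock = std::chrono::steady_clock;

  ConnectionIdleTimer(std::chrono::milliseconds timeout, Clock::time_point now)
      : timeout_(timeout), last_activity_(now) {
    assert(timeout.count() > 0);
  }

  // A request was sent. The idle clock starts only when the connection goes
  // from having no outstanding requests to having one.
  void OnRequestSent(Clock::time_point now) {
    if (outstanding_ == 0) last_activity_ = now;
    ++outstanding_;
  }

  // Any response shows the NameNode is still making progress, so it resets
  // the idle clock for every request still waiting on this connection.
  void OnResponseReceived(Clock::time_point now) {
    assert(outstanding_ > 0);
    --outstanding_;
    last_activity_ = now;
  }

  // The connection is declared dead only if requests are outstanding and
  // nothing has come back for at least the configured timeout.
  bool IsDead(Clock::time_point now) const {
    return outstanding_ > 0 && now - last_activity_ >= timeout_;
  }

 private:
  std::chrono::milliseconds timeout_;
  Clock::time_point last_activity_;
  int outstanding_ = 0;
};
```

With a 100 ms timeout and two requests sent at time 0, a response at 90 ms keeps the second request alive past the 100 ms mark. A per-request timer would have failed it there, which is the behavior the find tool was running into.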