Bob Hansen created HDFS-10781:
---------------------------------

             Summary: libhdfs++: redefine NN timeout to be "time without a response"
                 Key: HDFS-10781
                 URL: https://issues.apache.org/jira/browse/HDFS-10781
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Bob Hansen


In the find tool, we submit a zillion requests to the NameNode asynchronously.  
As the queue on the NameNode grows, the response time for each individual 
message increases.  In the find tool, we were eventually getting timeouts 
on requests, even though the NN was responding as fast as its little feet 
could carry it.

I propose that we redefine timeouts to be on a per-connection basis rather 
than per-request.  If a client has an outstanding request to the NN but hasn't 
received any response on that connection within n msec, it should declare the 
connection dead and retry.  As long as the NameNode is being responsive to the 
best of its ability and providing data, we will not declare the link dead.
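
To make the proposed semantics concrete, here is a minimal sketch of a "time without a response" check. It is not libhdfs++ code: the class name ConnectionIdleTimer and its methods are hypothetical, and the caller is assumed to pass in the current time so that the logic stays independent of any particular event loop or timer facility.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of libhdfs++): tracks how long a connection
// has gone without ANY response while at least one request is outstanding.
class ConnectionIdleTimer {
 public:
  using Clock = std::chrono::steady_clock;
  using TimePoint = Clock::time_point;
  using Duration = Clock::duration;

  explicit ConnectionIdleTimer(Duration timeout) : timeout_(timeout) {}

  // Call when a request is written to the connection.
  void OnRequestSent(TimePoint now) {
    // The idle clock only starts when the connection goes from having no
    // outstanding requests to having one; later sends do not reset it.
    if (pending_ == 0) last_activity_ = now;
    ++pending_;
  }

  // Call when any response arrives, regardless of which request it answers.
  void OnResponseReceived(TimePoint now) {
    assert(pending_ > 0);
    --pending_;
    last_activity_ = now;  // the NameNode is making progress
  }

  // True if requests are outstanding and nothing has come back for at
  // least `timeout`. A connection with nothing outstanding is never dead.
  bool IsDead(TimePoint now) const {
    return pending_ > 0 && (now - last_activity_) >= timeout_;
  }

  std::size_t pending() const { return pending_; }

 private:
  Duration timeout_;
  TimePoint last_activity_{};
  std::size_t pending_ = 0;
};
```

With this shape, a deep queue of requests does not trip the timeout as long as responses keep arriving less than n msec apart; only a connection that goes silent while requests are outstanding is declared dead and retried.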

One potential violation of the Principle of Least Astonishment here is that 
any particular request from a client can no longer be depended on to get a 
positive or negative response within a fixed amount of time, but I think that 
may be a good trade to make.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
