[ 
https://issues.apache.org/jira/browse/HDFS-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626489#comment-16626489
 ] 

Konstantin Shvachko commented on HDFS-13873:
--------------------------------------------

So the main idea is to come up with an estimate of how long it will take for 
the Observer to catch up to the client's state, and reject the request if the 
estimated time is larger than a set threshold. So there are two main questions 
to answer:
# How to estimate the catching-up time?
# How to choose the threshold value?

Here are some thoughts:
# I propose to measure the rate of processing edits during the startup time. We 
can assume that on startup the Observer processes edits at the maximum rate, 
because it is the only thing it is doing. So we can take it as an upper bound 
of its processing power.
#* The upper bound should work, because if the Observer is busy and falls 
behind, it will reject reads, redirecting them to the active NN or another 
Observer, which will free its resources to process more edits. Sort of a 
self-throttling mechanism.
#* One corner case is when the cluster is new (empty image): then we cannot 
calculate the rate, so let's set some reasonable default value. I would say 
10,000 tx/sec.
#* For calculating the rate we should use existing {{FSEditLog}} variables 
{{numTransactions}} and {{totalTimeTransactions}}, if possible.
# The threshold should be based on the client timeout: there is no point in 
waiting if we know the client will time out. So if the client timeout is 
20 sec, then the threshold should be ~15 sec. Ideally it should be the time 
remaining until the client times out (= timeout - effectiveQueueTime), but I 
don't know if we can measure that.
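
To make the two points above concrete, here is a minimal sketch of the estimate and the rejection check. It is only an illustration under stated assumptions: {{ObserverLagEstimator}} and all of its methods are hypothetical names, not existing HDFS code; the startup transaction count and elapsed time are passed in as plain numbers (in milliseconds) rather than read from {{FSEditLog}}; and the lag is assumed to be the difference between the client's last seen transaction id and the Observer's last applied one.

```java
/** Hypothetical helper for illustration only; not part of HDFS. */
final class ObserverLagEstimator {
  /** Assumed fallback for a new cluster (empty image): 10,000 tx/sec. */
  static final double DEFAULT_TX_PER_SEC = 10_000.0;

  /** Upper bound of the edit processing rate, measured at startup. */
  private final double maxTxPerSec;

  /**
   * @param startupTxCount number of edits applied during startup
   * @param startupTimeMs  time spent applying them, in milliseconds
   */
  ObserverLagEstimator(long startupTxCount, long startupTimeMs) {
    // If nothing was applied at startup there is no rate to measure,
    // so fall back to the default value.
    this.maxTxPerSec = (startupTxCount > 0 && startupTimeMs > 0)
        ? startupTxCount * 1000.0 / startupTimeMs
        : DEFAULT_TX_PER_SEC;
  }

  double maxTxPerSec() {
    return maxTxPerSec;
  }

  /** Estimated time in ms for the Observer to reach the client's state. */
  long estimateCatchUpMs(long clientTxId, long observerTxId) {
    // A client that is not ahead of the Observer needs no catch-up time.
    long lag = Math.max(0L, clientTxId - observerTxId);
    return (long) Math.ceil(lag * 1000.0 / maxTxPerSec);
  }

  /** Reject when the estimated catch-up time exceeds the threshold. */
  boolean shouldReject(long clientTxId, long observerTxId, long thresholdMs) {
    return estimateCatchUpMs(clientTxId, observerTxId) > thresholdMs;
  }
}
```

For example, if startup applied 500,000 edits in 10 sec, the measured rate is 50,000 tx/sec. With a 15 sec threshold, a client that is 1,000,000 transactions ahead would be rejected (estimated 20 sec), while one that is 500,000 ahead would be served (estimated 10 sec). Using the remaining client time (timeout - effectiveQueueTime) instead of a fixed value would only change what is passed as {{thresholdMs}}.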

> ObserverNode should reject read requests when it is too far behind.
> -------------------------------------------------------------------
>
>                 Key: HDFS-13873
>                 URL: https://issues.apache.org/jira/browse/HDFS-13873
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client, namenode
>    Affects Versions: HDFS-12943
>            Reporter: Konstantin Shvachko
>            Assignee: Chao Sun
>            Priority: Major
>
> Add a server-side threshold for ObserverNode to reject read requests when it 
> is too far behind.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
