[
https://issues.apache.org/jira/browse/HDFS-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626489#comment-16626489
]
Konstantin Shvachko commented on HDFS-13873:
--------------------------------------------
So the main idea is to come up with an estimate of how long it will take for
the Observer to catch up to the client's state, and reject the request if the
estimated time is larger than a set threshold. So there are two main questions
to answer:
# How to estimate the catching-up time?
# How to choose the threshold value?
Here are some thoughts:
# I propose to measure the rate of processing edits during the startup time. We
can assume that on startup the Observer processes edits at the maximum rate,
because it is the only thing it is doing. So we can take it as an upper bound
of its processing power.
#* The upper bound should work, because if the Observer is busy and falls
behind, it will reject reads, redirecting them to the active NN or another
Observer, which will free its resources to process more edits. Sort of a
self-throttling mechanism.
#* One corner case is when the cluster is new (empty image): then we cannot
calculate the rate, so let's set some reasonable default. I would say 10,000
tx/sec.
#* For calculating the rate we should use existing {{FSEditLog}} variables
{{numTransactions}} and {{totalTimeTransactions}}, if possible.
# The threshold should be based on the client timeout - there is no reason to
wait if we know the client will time out. So if the client timeout is 20 sec,
then the threshold should be ~15 sec. Ideally it should be the time remaining
until the client times out (= timeout - effectiveQueueTime), but I don't know
if we can measure that.
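To make the two points above concrete, here is a minimal sketch of the check, not a patch. The class {{ObserverLagEstimator}} and all of its method and parameter names are hypothetical and do not exist in Hadoop. It assumes the startup transaction count and elapsed time are already available (for example, derived from the {{FSEditLog}} counters mentioned above) and that the lag is the difference between the client's state id and the Observer's last applied transaction id.

```java
/**
 * Hypothetical helper, not part of Hadoop: estimates how long the Observer
 * needs to catch up to a client's state and decides whether to reject a read.
 */
final class ObserverLagEstimator {
    /** Fallback rate for a new cluster (empty image), as suggested above. */
    static final double DEFAULT_TX_PER_SEC = 10_000.0;

    private final double txPerSec;
    private final long thresholdMillis;

    /**
     * @param startupTxCount    edits applied during startup (assumed input)
     * @param startupTimeMillis time spent applying them (assumed input)
     * @param thresholdMillis   e.g. ~15,000 for a 20 sec client timeout
     */
    public ObserverLagEstimator(long startupTxCount, long startupTimeMillis,
                                long thresholdMillis) {
        // Use the measured startup rate as the upper bound on processing
        // power; fall back to the default when nothing could be measured.
        this.txPerSec = (startupTxCount > 0 && startupTimeMillis > 0)
            ? startupTxCount * 1000.0 / startupTimeMillis
            : DEFAULT_TX_PER_SEC;
        this.thresholdMillis = thresholdMillis;
    }

    public double txPerSec() {
        return txPerSec;
    }

    /** Estimated catch-up time = transactions behind / processing rate. */
    public long estimateCatchUpMillis(long clientStateId, long observerStateId) {
        long lag = Math.max(0L, clientStateId - observerStateId);
        return (long) Math.ceil(lag * 1000.0 / txPerSec);
    }

    /** Reject when the estimate exceeds the threshold. */
    public boolean shouldReject(long clientStateId, long observerStateId) {
        return estimateCatchUpMillis(clientStateId, observerStateId)
            > thresholdMillis;
    }

    public static void main(String[] args) {
        // 2,000,000 edits applied in 100 sec at startup; 15 sec threshold.
        ObserverLagEstimator est =
            new ObserverLagEstimator(2_000_000L, 100_000L, 15_000L);
        System.out.println("rate tx/sec: " + est.txPerSec());
        System.out.println("catch-up ms: "
            + est.estimateCatchUpMillis(1_400_000L, 1_000_000L));
        System.out.println("reject: "
            + est.shouldReject(1_400_000L, 1_000_000L));
    }
}
```

The threshold here is a fixed value passed in by the caller. Using the time remaining until the client times out (timeout - effectiveQueueTime) would need a per-request threshold, which this sketch does not attempt.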
> ObserverNode should reject read requests when it is too far behind.
> -------------------------------------------------------------------
>
> Key: HDFS-13873
> URL: https://issues.apache.org/jira/browse/HDFS-13873
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client, namenode
> Affects Versions: HDFS-12943
> Reporter: Konstantin Shvachko
> Assignee: Chao Sun
> Priority: Major
>
> Add a server-side threshold for ObserverNode to reject read requests when it
> is too far behind.