[
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310837#comment-15310837
]
Xiaoyu Yao edited comment on HDFS-9924 at 6/1/16 6:31 PM:
----------------------------------------------------------
[~daryn], thanks for the valuable feedback. [~kihwal] also mentioned a similar
issue
[here|https://issues.apache.org/jira/browse/HADOOP-12916?focusedCommentId=15277342&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15277342],
but I wasn't able to get clarification on it. The FSN/FSD locking issue is a
very good point. I tried to find metrics/logs about it but there weren't any. I
will open a separate ticket to add metrics and WARN/DEBUG logs for long lock
holds on the namenode, similar to the slow write/network WARN logs and metrics
we have on the datanode.
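Roughly, the lock instrumentation would be something like the sketch below; the WARN threshold and class name are illustrative assumptions, not an existing Hadoop config or class:

{code:java}
// Sketch only: time how long the namesystem write lock is held and log a
// WARN when it exceeds an assumed threshold, DEBUG otherwise.
import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class InstrumentedLockSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(InstrumentedLockSketch.class);
  private static final long WARN_THRESHOLD_MS = 1000;   // assumed threshold

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long writeLockHeldSince;

  public void writeLock() {
    lock.writeLock().lock();
    writeLockHeldSince = System.currentTimeMillis();
  }

  public void writeUnlock(String opName) {
    long heldMs = System.currentTimeMillis() - writeLockHeldSince;
    lock.writeLock().unlock();
    if (heldMs > WARN_THRESHOLD_MS) {
      LOG.warn("Namesystem write lock held for {} ms by {}", heldMs, opName);
    } else {
      LOG.debug("Namesystem write lock held for {} ms by {}", heldMs, opName);
    }
  }
}
{code}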
As you mentioned above, the priority level is assigned by the scheduler. As
part of HADOOP-12916, we separated the scheduler from the call queue and made
it pluggable so that priority assignment can be customized for different
workloads. For the mixed write-intensive and read workload example, I agree
that the DecayRpcScheduler, which uses call rate to determine priority, may not
be a good choice. We have thought about adding a different scheduler that
combines the weight of an RPC call with its call rate, but it is tricky to
assign weights. For example, getContentSummary on a directory with millions of
files/dirs and on a directory with only a few files/dirs will not have the same
impact on the NN.
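Roughly, such a weight-plus-rate scheduler could look like the following minimal sketch. It does not implement Hadoop's actual RpcScheduler interface, and the per-method weights, decay factor, and priority cut-offs are illustrative assumptions only:

{code:java}
// Illustrative only: combine a static per-method weight with a decayed
// per-user cost to pick a priority level.
import java.util.HashMap;
import java.util.Map;

public class WeightedDecaySchedulerSketch {
  private static final double DECAY = 0.5;          // halve history each sweep
  private final Map<String, Double> decayedCost = new HashMap<>();
  private final Map<String, Integer> methodWeight = new HashMap<>();

  public WeightedDecaySchedulerSketch() {
    // The same method can be cheap or expensive (e.g. getContentSummary on a
    // huge vs. tiny directory), which is exactly why static weights are tricky.
    methodWeight.put("getContentSummary", 100);
    methodWeight.put("getFileInfo", 1);
  }

  /** Record one call and return the caller's accumulated (decayed) cost. */
  public synchronized double addCall(String user, String method) {
    double cost = methodWeight.getOrDefault(method, 1);
    return decayedCost.merge(user, cost, Double::sum);
  }

  /** Map accumulated cost to a priority level: 0 = highest, 3 = lowest. */
  public synchronized int getPriorityLevel(String user) {
    double cost = decayedCost.getOrDefault(user, 0.0);
    if (cost < 100) return 0;
    if (cost < 1_000) return 1;
    if (cost < 10_000) return 2;
    return 3;
  }

  /** Called periodically to age out old activity. */
  public synchronized void decaySweep() {
    decayedCost.replaceAll((user, cost) -> cost * DECAY);
  }
}
{code}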
Backoff based on response time lets all users stop overloading the namenode
when high-priority RPC calls experience longer-than-normal end-to-end delay.
User2/User3/User4 (low priority based on call rate) get much wider
response-time thresholds before backing off. In this case, User1 will be backed
off first, because it exceeds its relatively smaller response-time threshold,
which gets the namenode out of the state where other users cannot use it
"fairly".
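A rough sketch of that response-time check, assuming a simple exponential moving average per priority level; the thresholds and smoothing factor are illustrative, not actual defaults:

{code:java}
// Each priority level gets a response-time threshold; when the recent average
// end-to-end time for that level exceeds its threshold, new calls at that
// level are told to back off.
public class ResponseTimeBackoffSketch {
  // Index = priority level (0 = highest priority, smallest threshold).
  private final long[] thresholdMillis = {10, 20, 40, 80};
  private final double[] avgResponseMillis = new double[4];
  private static final double ALPHA = 0.2;   // exponential moving average weight

  /** Record an observed end-to-end response time for a priority level. */
  public synchronized void addResponseTime(int level, long elapsedMillis) {
    avgResponseMillis[level] =
        ALPHA * elapsedMillis + (1 - ALPHA) * avgResponseMillis[level];
  }

  /** A call at this level backs off when its level's average is too slow. */
  public synchronized boolean shouldBackOff(int level) {
    return avgResponseMillis[level] > thresholdMillis[level];
  }
}
{code}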
We are also proposing a scheduler that offers better namenode resource
management via YARN integration in HADOOP-13128. I would appreciate it if you
could share your thoughts and comments on that proposal as well. Thanks!
> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
> Key: HDFS-9924
> URL: https://issues.apache.org/jira/browse/HDFS-9924
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Xiaobing Zhou
> Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked
> until the method returns. It is very slow if a client makes a large number
> of independent calls in a single thread since each call has to wait until the
> previous call is finished. It is inefficient if a client needs to create a
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is
> not blocked. The methods in the new API immediately return a Java Future
> object. The return value can be obtained by the usual Future.get() method.
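To illustrate the usage shape the description above proposes, here is a hedged sketch; the AsyncFs interface below is a hypothetical stand-in for illustration, not the API in the attached design:

{code:java}
// Issue many independent calls without blocking, then collect results via
// Future.get(), instead of waiting for each call before starting the next.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;

public class AsyncUsageSketch {
  interface AsyncFs {                                 // hypothetical async facade
    Future<Boolean> rename(String src, String dst);
  }

  static void renameAll(AsyncFs fs, List<String> srcs) throws Exception {
    List<Future<Boolean>> pending = new ArrayList<>();
    for (String src : srcs) {
      pending.add(fs.rename(src, src + ".done"));     // returns immediately
    }
    for (Future<Boolean> f : pending) {
      f.get();                                        // block only when collecting
    }
  }
}
{code}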