[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310837#comment-15310837
 ] 

Xiaoyu Yao edited comment on HDFS-9924 at 6/1/16 6:31 PM:
----------------------------------------------------------

[~daryn], thanks for the valuable feedback. [~kihwal] also mentioned a similar 
issue 
[here|https://issues.apache.org/jira/browse/HADOOP-12916?focusedCommentId=15277342&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15277342],
 but I wasn't able to get clarification on it. The FSN/FSD locking issue is a 
very good point. I looked for metrics or logs about it but found none. I will 
open a separate ticket to add more metrics and WARN/DEBUG logs for long 
lock-holding operations on the namenode, similar to the WARN logs and metrics 
we already have for slow writes and slow network on the datanode.

As you mentioned above, the priority level is assigned by the scheduler. As 
part of HADOOP-12916, we separated the scheduler from the call queue and made 
it pluggable so that priority assignment can be customized for different 
workloads. For the example of a mixed write-intensive and read workload, I 
agree that DecayRpcScheduler, which uses call rate to determine priority, may 
not be a good choice. We have thought of adding a different scheduler that 
combines the weight of an RPC call with its rate, but assigning the weight is 
tricky. For example, getContentSummary on a directory with millions of 
files/dirs and on a directory with only a few files/dirs won't have the same 
impact on the NN.
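
To make the weight-plus-rate idea concrete, here is a minimal sketch. It is 
not an existing Hadoop class: the class name, the per-operation weights and 
the share-to-level mapping are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of Hadoop: priority is derived from each
// user's share of the total weighted cost instead of the raw call count.
class WeightedCostSketch {
    // Assumed static per-operation weights. Choosing these is the hard part:
    // one fixed weight for getContentSummary cannot reflect directory size.
    private final Map<String, Integer> opWeight = new HashMap<>();
    private final Map<String, Long> costByUser = new HashMap<>();
    private long totalCost = 0;
    private final int numLevels;

    WeightedCostSketch(int numLevels) {
        this.numLevels = numLevels;
        opWeight.put("getFileInfo", 1);
        opWeight.put("create", 5);
        opWeight.put("getContentSummary", 50);
    }

    // Charge one call of the given operation to the user.
    void record(String user, String op) {
        int weight = opWeight.getOrDefault(op, 1);
        costByUser.merge(user, (long) weight, Long::sum);
        totalCost += weight;
    }

    // Level 0 is the highest priority; a larger cost share gives a lower priority.
    int priorityOf(String user) {
        if (totalCost == 0) {
            return 0;
        }
        double share = (double) costByUser.getOrDefault(user, 0L) / totalCost;
        int level = (int) (share * numLevels);
        return Math.min(level, numLevels - 1);
    }
}
```

With the same number of calls, a user issuing only getContentSummary lands in 
a lower priority level than a user issuing only getFileInfo. The sketch still 
charges the same cost for a tiny directory and a huge one, which is exactly 
the difficulty described above.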

Backoff based on response time lets all users stop overloading the namenode 
when high-priority RPC calls experience longer-than-normal end-to-end delay. 
User2/User3/User4 (low priority based on call rate) will have a much wider 
response time threshold for backing off. In this case, User 1 will be backed 
off first, because it breaks the relatively smaller response time threshold, 
which gets the namenode out of the state in which other users cannot use it 
"fairly". 
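
A rough sketch of per-priority response time thresholds follows. It only 
illustrates the idea and is not the actual scheduler code: the class name, the 
threshold values and the simple moving average are assumptions.

```java
// Hypothetical sketch, not part of Hadoop: each priority level has its own
// response time threshold, and a level backs off once its average response
// time exceeds that threshold.
class ResponseTimeBackoffSketch {
    private static final double ALPHA = 0.5; // assumed smoothing factor
    private final long[] thresholdMs;        // index 0 = highest priority
    private final double[] avgResponseMs;    // running average per level

    ResponseTimeBackoffSketch(long[] thresholdMs) {
        this.thresholdMs = thresholdMs.clone();
        this.avgResponseMs = new double[thresholdMs.length];
    }

    // Fold one observed end-to-end response time into the level's average.
    void update(int level, double observedMs) {
        avgResponseMs[level] = (1 - ALPHA) * avgResponseMs[level] + ALPHA * observedMs;
    }

    // True when callers at this level should be asked to back off.
    boolean shouldBackOff(int level) {
        return avgResponseMs[level] > thresholdMs[level];
    }
}
```

With thresholds of, say, 100 ms for level 0 and 400 ms for level 1, the same 
observed delay trips the smaller threshold first, so the level with the 
smaller threshold is backed off before the level with the wider one.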

We are also proposing a scheduler that offers better namenode resource 
management via YARN integration in HADOOP-13128. I would appreciate it if you 
could share your thoughts and comments on the proposal there as well. Thanks!




> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.
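
From the caller's side, the Future-based pattern described above can be 
sketched with plain java.util.concurrent. This is not the proposed HDFS API: 
blockingRename is a hypothetical stand-in for a blocking call, and the thread 
pool only simulates calls that return immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the caller's view: issue every call first,
// then collect the results with Future.get().
class AsyncCallSketch {
    // Stand-in for a blocking call; "succeeds" when src and dst differ.
    static boolean blockingRename(String src, String dst) {
        return !src.equals(dst);
    }

    static List<Boolean> renameAll(List<String[]> pairs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Issue all calls without waiting for any of them.
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String[] pair : pairs) {
                Callable<Boolean> task = () -> blockingRename(pair[0], pair[1]);
                futures.add(pool.submit(task));
            }
            // Only now block, once per result, in submission order.
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point is the call shape: independent calls no longer wait on each other, 
and the caller decides when to block on each result.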


