[jira] [Comment Edited] (HADOOP-10410) Support ioprio_set in NativeIO

Haohui Mai (JIRA) Thu, 20 Mar 2014 20:20:07 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942724#comment-13942724
 ]


Haohui Mai edited comment on HADOOP-10410 at 3/21/14 3:17 AM:
--------------------------------------------------------------

[~andrew.wang], thanks for the explanation. I think Liang's immediate 
requirements can be addressed with external utilities, so I'll drill down on 
the longer-term design below.

For your requirement (i.e. differentiated services for different users), this 
approach assumes that each request is handled by a single thread throughout its 
life cycle.

However, getting this right seems nontrivial under the current implementation. 
Each request is indeed assigned to a DataXceiver at the very beginning, but the 
DN creates new threads for sending and receiving data, and kills threads when 
it has to recover the pipeline. To get it right you have to carefully track the 
request across those threads. This complexity is one of the reasons HDFS-5270 
has been put on hold.
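
To make the tracking problem concrete, here is a minimal sketch of carrying a 
per-request priority tag into helper threads. It is not DataNode code: the 
class {{RequestIoContext}} and its methods are hypothetical names invented for 
illustration, and the integer tag stands in for whatever priority information 
a request would carry.

```java
// Hypothetical helper, not part of Hadoop: carries a request's priority tag
// from the thread that accepted the request into threads spawned on its behalf.
final class RequestIoContext {
    private static final ThreadLocal<Integer> PRIORITY = new ThreadLocal<>();

    // Tag the current thread with the priority of the request it is serving.
    static void set(int priority) { PRIORITY.set(priority); }

    // Returns the tag of the current thread, or null if it was never tagged.
    static Integer current() { return PRIORITY.get(); }

    static void clear() { PRIORITY.remove(); }

    // Captures the caller's tag now and installs it around task.run() later,
    // on whichever thread ends up executing the task.
    static Runnable propagate(Runnable task) {
        final Integer captured = PRIORITY.get();
        return () -> {
            Integer previous = PRIORITY.get();
            if (captured == null) PRIORITY.remove(); else PRIORITY.set(captured);
            try {
                task.run();
            } finally {
                // Restore whatever the executing thread had before.
                if (previous == null) PRIORITY.remove(); else PRIORITY.set(previous);
            }
        };
    }
}
```

A thread started with a plain Runnable sees no tag, while one started with 
{{RequestIoContext.propagate(...)}} sees the tag of the thread that created it. 
Every place that spawns or replaces a thread would have to do this 
consistently, which is where the complexity comes from.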

I believe that in practice you'll need {{IOPRIO_CLASS_RT}}. Skipping 
{{IOPRIO_CLASS_RT}} leaves you only two choices: {{IOPRIO_CLASS_BE}} and 
{{IOPRIO_CLASS_IDLE}}. I'm skeptical that the DN should ever set its I/O 
priority to {{IOPRIO_CLASS_IDLE}}, since other processes on the system could 
then block the DN's I/O.
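
For reference, the sketch below shows how Linux packs a scheduling class and a 
priority level into the single integer passed to ioprio_set(2). The constant 
values mirror the kernel's ioprio definitions; the class {{IoPrioValue}} and 
its method names are hypothetical, and a real binding would still need native 
code to issue the system call.

```java
// Hypothetical helper: encodes and decodes the value given to ioprio_set(2).
final class IoPrioValue {
    // Scheduling classes as defined by the Linux kernel.
    static final int IOPRIO_CLASS_NONE = 0;
    static final int IOPRIO_CLASS_RT = 1;   // real-time
    static final int IOPRIO_CLASS_BE = 2;   // best-effort
    static final int IOPRIO_CLASS_IDLE = 3; // served only when the disk is idle

    // The class occupies the bits above this shift; the level sits below it.
    static final int IOPRIO_CLASS_SHIFT = 13;

    // Levels range from 0 (highest) to 7 (lowest) and are meaningful for the
    // RT and BE classes.
    static int encode(int ioClass, int level) {
        if (ioClass < IOPRIO_CLASS_NONE || ioClass > IOPRIO_CLASS_IDLE) {
            throw new IllegalArgumentException("unknown I/O class: " + ioClass);
        }
        if (level < 0 || level > 7) {
            throw new IllegalArgumentException("level out of range: " + level);
        }
        return (ioClass << IOPRIO_CLASS_SHIFT) | level;
    }

    static int classOf(int value) {
        return value >>> IOPRIO_CLASS_SHIFT;
    }

    static int levelOf(int value) {
        return value & ((1 << IOPRIO_CLASS_SHIFT) - 1);
    }
}
```

For example, {{IoPrioValue.encode(IoPrioValue.IOPRIO_CLASS_BE, 4)}} yields 
16388, i.e. (2 << 13) | 4. Note that requesting {{IOPRIO_CLASS_RT}} requires 
elevated privileges on Linux.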

Although adding the {{ioprio_\*}} calls gives you something to experiment with, 
I'm skeptical that in the long term this is the right API in the right place. 
The experience of building large-scale databases shows that scheduling with 
application-level information is critical for performance [1], and that mapping 
scheduling requirements onto OS primitives is occasionally suboptimal [2]. 
It's unclear to me how to build a clean set of APIs for applications directly 
on top of the {{ioprio_\*}} calls.

Coming back to your original requirement, I propose creating an I/O thread 
pool and associating multiple QoS queues with it. Essentially, the DN schedules 
its own I/O requests. This not only allows flexible scheduling policies, but 
also solves the throttling issues we have today. At that point, adding 
{{ioprio_\*}} to adjust the priorities of the I/O threads would make much more 
sense.
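
A minimal sketch of what such a pool could look like follows. It is only an 
illustration under stated assumptions: the class {{QosIoScheduler}} and its 
methods are hypothetical, and strict priority (queue 0 is always served first) 
is just one possible policy chosen to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: a pool of I/O threads fed by several QoS queues.
final class QosIoScheduler {
    private final List<Queue<Runnable>> queues = new ArrayList<>();
    private final List<Thread> workers = new ArrayList<>();
    private final Object lock = new Object();
    private boolean shutdown = false;

    // One FIFO queue per QoS level; level 0 is the most urgent.
    QosIoScheduler(int levels) {
        for (int i = 0; i < levels; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    void submit(int level, Runnable io) {
        synchronized (lock) {
            if (shutdown) throw new IllegalStateException("scheduler is shut down");
            queues.get(level).add(io);
            lock.notify(); // wake one idle worker
        }
    }

    void start(int threads) {
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(this::workerLoop, "qos-io-" + i);
            workers.add(t);
            t.start();
        }
    }

    // Stops accepting work, lets workers drain the queues, then joins them.
    void shutdownAndWait() {
        synchronized (lock) {
            shutdown = true;
            lock.notifyAll();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Strict priority: scan queues from level 0 upward and take the first task.
    // Returns null only when all queues are empty and shutdown was requested.
    private Runnable next() throws InterruptedException {
        synchronized (lock) {
            while (true) {
                for (Queue<Runnable> q : queues) {
                    Runnable r = q.poll();
                    if (r != null) return r;
                }
                if (shutdown) return null;
                lock.wait();
            }
        }
    }

    private void workerLoop() {
        try {
            Runnable r;
            while ((r = next()) != null) {
                r.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the selection policy lives in one method ({{next()}}), it could be 
swapped for a weighted or deadline-based policy without touching the callers. 
Strict priority can starve the lower levels, so a real design would likely 
need something fairer.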

References:

1. Michael Stonebraker. Operating System Support for Database Management. 
CACM, July 1981.


> Support ioprio_set in NativeIO
> ------------------------------
>
>                 Key: HADOOP-10410
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10410
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: native
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HADOOP-10410.txt
>
>
> It would be better to HBase application if HDFS layer provide a fine-grained 
> IO request priority. Most of modern kernel should support ioprio_set system 
> call now.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
