[
https://issues.apache.org/jira/browse/HADOOP-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942724#comment-13942724
]
Haohui Mai edited comment on HADOOP-10410 at 3/21/14 3:17 AM:
--------------------------------------------------------------
[~andrew.wang], thanks for the explanation. I think Liang's immediate
requirements can be addressed with external utilities, so I'll focus on the
longer-term design below.
For your requirement (i.e. differentiated services for different users), this
approach assumes that each request is handled by a single thread for its entire
life cycle.
However, that seems nontrivial to get right under the current implementation.
Each request is indeed assigned to a DataXceiver at the very beginning, but the
DN creates new threads for sending and receiving data, and kills threads if it
has to recover the pipeline. To get it right you have to carefully track the
request across those threads. This complexity is one of the reasons HDFS-5270
was put on hold.
I believe that in practice you'll need {{IOPRIO_CLASS_RT}}. Skipping
{{IOPRIO_CLASS_RT}} leaves you only two choices, {{IOPRIO_CLASS_BE}} and
{{IOPRIO_CLASS_IDLE}}. I'm skeptical that the DN should set its I/O priority to
{{IOPRIO_CLASS_IDLE}} at all, since other processes on the system could then
block the DN's I/O.
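To make the class discussion concrete, here is a minimal sketch of how Linux packs an I/O priority class and a level into the single integer that {{ioprio_set}} takes. The class numbers and the 13-bit shift follow the kernel's ioprio interface; the Java class and method names are hypothetical and are not part of NativeIO or the attached patch.

```java
/** Hypothetical helper: packs a Linux I/O priority class and level into one int. */
final class IoPrio {
    // Class numbers as defined by the Linux ioprio interface.
    static final int IOPRIO_CLASS_NONE = 0;
    static final int IOPRIO_CLASS_RT = 1;
    static final int IOPRIO_CLASS_BE = 2;
    static final int IOPRIO_CLASS_IDLE = 3;

    // The class lives in the bits above this shift, the level in the bits below.
    private static final int IOPRIO_CLASS_SHIFT = 13;

    private IoPrio() {}

    /** Mirrors the kernel's IOPRIO_PRIO_VALUE macro; RT and BE use levels 0 (highest) to 7. */
    static int prioValue(int ioClass, int level) {
        if (level < 0 || level > 7) {
            throw new IllegalArgumentException("level must be in [0, 7]: " + level);
        }
        return (ioClass << IOPRIO_CLASS_SHIFT) | level;
    }

    /** Extracts the class part of a packed priority value. */
    static int prioClass(int value) {
        return value >>> IOPRIO_CLASS_SHIFT;
    }

    /** Extracts the level part of a packed priority value. */
    static int prioLevel(int value) {
        return value & ((1 << IOPRIO_CLASS_SHIFT) - 1);
    }

    public static void main(String[] args) {
        int be4 = prioValue(IOPRIO_CLASS_BE, 4);
        System.out.println("class=" + prioClass(be4) + " level=" + prioLevel(be4));
    }
}
```

The level only matters for RT and BE. IDLE has no levels, which is part of why it is a blunt tool for differentiating users.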
Although adding the {{ioprio_\*}} calls gives you something to experiment with,
I'm skeptical that this is the right API at the right place in the long term.
The experience of building large-scale databases shows that scheduling with
application-level information is critical for performance [1], and that mapping
the scheduling requirements onto OS primitives is occasionally suboptimal [2].
It's unclear to me how to build a clean set of APIs for applications directly
on top of the {{ioprio_\*}} calls.
Coming back to your original requirement, I propose creating an I/O thread
pool and associating multiple QoS queues with it. Essentially, the DN schedules
its own I/O requests. This not only allows flexible scheduling policies but
also solves the throttling issues we have today. At that point, adding
{{ioprio_\*}} to adjust the priorities of the I/O threads would make much more
sense.
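A rough sketch of what I have in mind, under several assumptions: the class name, the three QoS levels, and the strict-priority policy are all hypothetical and chosen only for illustration. It is not DN code. Worker threads always drain the highest non-empty queue first, so a real policy would need something like weighting or aging to keep the low queue from starving.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch: a fixed pool of I/O threads fed by one FIFO queue per QoS level. */
final class QosIoScheduler implements AutoCloseable {
    /** Illustrative levels only; declaration order is priority order. */
    enum QosLevel { HIGH, NORMAL, LOW }

    // EnumMap iterates in enum declaration order, i.e. HIGH first.
    private final Map<QosLevel, Queue<Runnable>> queues = new EnumMap<>(QosLevel.class);
    private final List<Thread> workers = new ArrayList<>();
    private boolean closed; // guarded by "this"

    QosIoScheduler(int threads) {
        for (QosLevel level : QosLevel.values()) {
            queues.put(level, new ArrayDeque<>());
        }
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(this::workerLoop, "qos-io-" + i);
            t.start();
            workers.add(t);
        }
    }

    /** Enqueues an I/O request on the queue for its QoS level. */
    synchronized void submit(QosLevel level, Runnable io) {
        if (closed) {
            throw new IllegalStateException("scheduler is closed");
        }
        queues.get(level).add(io);
        notify(); // wake one idle worker
    }

    /** Blocks until a request is available; returns null once closed and drained. */
    private synchronized Runnable take() throws InterruptedException {
        while (true) {
            for (Queue<Runnable> q : queues.values()) {
                Runnable next = q.poll();
                if (next != null) {
                    return next;
                }
            }
            if (closed) {
                return null;
            }
            wait();
        }
    }

    private void workerLoop() {
        try {
            Runnable io;
            while ((io = take()) != null) {
                try {
                    io.run();
                } catch (RuntimeException e) {
                    // A failed request must not kill the I/O thread.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting requests, lets queued ones finish, then joins the workers. */
    @Override
    public void close() {
        synchronized (this) {
            closed = true;
            notifyAll();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        try (QosIoScheduler scheduler = new QosIoScheduler(2)) {
            scheduler.submit(QosLevel.LOW, () -> System.out.println("background scan"));
            scheduler.submit(QosLevel.HIGH, () -> System.out.println("latency-sensitive read"));
        }
    }
}
```

Because requests are bound to a queue rather than to a thread, the scheduler does not depend on one thread owning a request for its whole life cycle. Throttling also becomes a matter of how fast the workers pull from each queue.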
References:
1. Michael Stonebraker. Operating System Support for Database Management.
Communications of the ACM, July 1981.
> Support ioprio_set in NativeIO
> ------------------------------
>
> Key: HADOOP-10410
> URL: https://issues.apache.org/jira/browse/HADOOP-10410
> Project: Hadoop Common
> Issue Type: New Feature
> Components: native
> Affects Versions: 3.0.0, 2.4.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HADOOP-10410.txt
>
>
> HBase applications would benefit if the HDFS layer provided fine-grained
> I/O request priorities. Most modern kernels should support the ioprio_set
> system call by now.