[
https://issues.apache.org/jira/browse/HDFS-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596712#comment-14596712
]
Haohui Mai commented on HDFS-8617:
----------------------------------
bq. You can read my related SoCC paper here:
http://umbrant.com/papers/socc12-cake.pdf . I experimented with ioprio about 3
years ago as part of this work, and didn't get positive results. We needed
application-level throttling.
As you mentioned in the evaluation, there are adverse effects on throughput.
I agree that application-level throttling can be useful. The proposed solution,
however, relies on magic numbers to work. My concern is how to choose those
magic numbers. Is good performance repeatable? Does it generalize to other
configurations? It looks to me that the answer to both questions is currently
no. The proposed solution appears to lower the utilization of the cluster (at
the cost of making {{checkDir()}} really slow) in order to meet the SLOs.
bq. The key issue though, as both Colin and I have mentioned, is that there is
queuing both in the OS and on disk. ioprio only affects OS-level queuing, and
disk-level queuing can be quite substantial. Not sure how much more needs to be
said.
Point taken. Unfortunately, without performance benchmarks and numbers these
statements are purely speculative. For example, what do you mean by
substantial? The NCQ holds 32 entries, whereas the OS-level I/O queue can hold
hundreds or thousands. I would really appreciate it if you could run some
performance benchmarks and share the numbers.
My concern with the proposal is that the parameter cannot be tuned
automatically w.r.t. cluster configurations and loads. It has to be dynamic. In
the longer term it makes a lot of sense to tune these parameters based on the
length of the I/O queue, avg. processing time, etc. As a first step, I think it
would be very helpful to simply correlate these parameters with simple metrics
like the number of transceiver threads.
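A rough sketch of what that first step could look like, purely for
illustration: the check budget is scaled linearly between an idle rate and a
floor, depending on how many transfer threads are active. The class name, the
linear policy, and all three constructor parameters are assumptions made for
this example; none of them come from the patch or from existing HDFS code.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of HDFS): derives the rate of directory
 * checks from the number of active transfer threads.
 */
public class AdaptiveCheckThrottle {
    private final int maxOpsPerSec; // assumed budget when the node is idle
    private final int minOpsPerSec; // assumed floor so the check still progresses
    private final int maxThreads;   // assumed thread count treated as "fully loaded"

    public AdaptiveCheckThrottle(int maxOpsPerSec, int minOpsPerSec, int maxThreads) {
        if (minOpsPerSec <= 0 || maxOpsPerSec < minOpsPerSec || maxThreads <= 0) {
            throw new IllegalArgumentException("invalid throttle bounds");
        }
        this.maxOpsPerSec = maxOpsPerSec;
        this.minOpsPerSec = minOpsPerSec;
        this.maxThreads = maxThreads;
    }

    /** Scales the budget down linearly as the number of active threads grows. */
    public int opsPerSec(int activeThreads) {
        int clamped = Math.max(0, Math.min(activeThreads, maxThreads));
        double idleFraction = 1.0 - (double) clamped / maxThreads;
        return minOpsPerSec
            + (int) Math.round((maxOpsPerSec - minOpsPerSec) * idleFraction);
    }

    /** Nanoseconds to pause between two consecutive checks at the given load. */
    public long delayNanos(int activeThreads) {
        return TimeUnit.SECONDS.toNanos(1) / opsPerSec(activeThreads);
    }
}
```

A checker loop could pause for {{delayNanos(n)}} between directories, where
{{n}} is the current thread count however the DataNode chooses to expose it.
The bounds are still constants, but they describe a range rather than a single
fixed rate, and the linear policy could later be replaced by one driven by I/O
queue length or average processing time.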
> Throttle DiskChecker#checkDirs() speed.
> ---------------------------------------
>
> Key: HDFS-8617
> URL: https://issues.apache.org/jira/browse/HDFS-8617
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: HDFS
> Affects Versions: 2.7.0
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-8617.000.patch
>
>
> As described in HDFS-8564, {{DiskChecker.checkDirs(finalizedDir)}} is
> causing excessive I/Os because {{finalizedDirs}} might have up to 64K
> sub-directories (HDFS-6482).
> This patch proposes to limit the rate of IO operations in
> {{DiskChecker.checkDirs()}}.