[
https://issues.apache.org/jira/browse/HDFS-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592656#comment-14592656
]
Haohui Mai commented on HDFS-8617:
----------------------------------
Thanks for the comments. I think we all agree that it is worth trying to run
{{checkDirs()}} in a background thread that has low I/O priority. I have a few
quick questions regarding the comments:
bq. NCQ means there's substantial queuing on the disk itself, which isn't
affected by ioprios.
I might be missing something. How does NCQ become a factor if the OS has not
yet pushed the I/O to the disks? In production we have observed that the OS
disk I/O queue can hold as many as 1000 entries, which is roughly 30 times the
capacity of the NCQ queue (32). With that many requests queued at the OS
level, there is a very high chance that the OS scheduler will do a good job of
scheduling them.
bq. Given a typical IOPS is about 100 for HDD, throttling it to 50 or less
calls per second should consume less than 1/2 IOPS. On Ext3/4 this can be better
I'm unsure whether this is the right math. I just checked the code. It looks
like {{checkDir()}} mostly performs read-only operations on the metadata of the
underlying filesystem. The metadata can be fully cached, so the parameter can
be way off (and for SSDs the parameter needs to be recalculated). That comes
back to the point that it is difficult to determine the right parameter for
various configurations. The difficulty of finding the right parameter leads me
to believe that throttling is the wrong approach here.
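To make the background-check idea concrete, here is a rough sketch of what I
have in mind. It is not the actual {{DiskChecker}} code: the class
{{BackgroundDirCheck}} and its methods are hypothetical names, and the checks
are only an approximation of the read-only metadata operations mentioned above.
Note that Java has no standard API for I/O priority, so the sketch only lowers
the thread's CPU priority; an actual low-I/O-priority thread would, I assume,
need something OS-specific such as {{ioprio_set(2)}} on Linux.

```java
import java.io.File;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the Hadoop DiskChecker implementation. */
public class BackgroundDirCheck {

  /** Read-only metadata checks on one directory. Returns null if it looks healthy. */
  static String checkDir(File dir) {
    if (!dir.isDirectory()) return "not a directory: " + dir;
    if (!dir.canRead()) return "not readable: " + dir;
    if (!dir.canWrite()) return "not writable: " + dir;
    if (!dir.canExecute()) return "not executable: " + dir;
    return null;
  }

  /** Checks dir and all sub-directories. Returns the first failure, or null. */
  static String checkDirs(File dir) {
    String failure = checkDir(dir);
    if (failure != null) return failure;
    File[] children = dir.listFiles();
    if (children == null) return "cannot list: " + dir;
    for (File child : children) {
      if (child.isDirectory()) {
        failure = checkDirs(child);
        if (failure != null) return failure;
      }
    }
    return null;
  }

  /** Single daemon thread, so checks never run concurrently with each other. */
  static ExecutorService newBackgroundChecker() {
    return Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "dir-checker");
      t.setDaemon(true);
      // CPU scheduling hint only. This does NOT set an I/O priority.
      t.setPriority(Thread.MIN_PRIORITY);
      return t;
    });
  }

  public static void main(String[] args) throws Exception {
    File root = new File(args.length > 0 ? args[0] : ".");
    ExecutorService checker = newBackgroundChecker();
    Callable<String> task = () -> checkDirs(root);
    Future<String> result = checker.submit(task);
    String failure = result.get(); // a real caller would not block here
    System.out.println(failure == null ? "OK" : failure);
    checker.shutdown();
  }
}
```

The point of the sketch is that no rate parameter appears anywhere: the check
runs at whatever pace the scheduler gives the background thread.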
> Throttle DiskChecker#checkDirs() speed.
> ---------------------------------------
>
> Key: HDFS-8617
> URL: https://issues.apache.org/jira/browse/HDFS-8617
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: HDFS
> Affects Versions: 2.7.0
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-8617.000.patch
>
>
> As described in HDFS-8564, {{DiskChecker.checkDirs(finalizedDir)}} is
> causing excessive I/Os because {{finalizedDirs}} might have up to 64K
> sub-directories (HDFS-6482).
> This patch proposes to limit the rate of IO operations in
> {{DiskChecker.checkDirs()}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)