[
https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033796#comment-16033796
]
Arpit Agarwal commented on HDFS-11789:
--------------------------------------
Thanks for contributing this improvement [~hanishakoneru]. A few comments:
# For maintaining stats, one more option is RollingAverages, which uses
MutableRatesWithAggregation and is optimized for multithreaded updates.
# Instead of floating point arithmetic here:
{code}
sampleRangeMax = (int) ((double) conf.getScrMetricsSamplingPercentage()
/ 100 * Integer.MAX_VALUE);
{code}
we could use integer arithmetic:
{code}
sampleRangeMax = (Integer.MAX_VALUE / 100) *
conf.getScrMetricsSamplingPercentage();
{code}
Also, we should clamp the value returned by getScrMetricsSamplingPercentage to
\[0, 100\] in case the administrator misconfigures it.
# We should add isolated test cases for BlockReaderIoProvider and
BlockReaderLocalMetrics if possible.
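To illustrate the second point, here is a minimal sketch of the integer-only computation combined with the clamp. The class ScrSamplingSketch and its methods are hypothetical names for illustration and are not taken from the patch; in particular, shouldSample is only an assumption about how sampleRangeMax might be consumed.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration; not part of the HDFS-11789 patch. */
final class ScrSamplingSketch {

  /** Clamp a configured percentage into [0, 100]. */
  static int clampPercentage(int configured) {
    return Math.max(0, Math.min(100, configured));
  }

  /**
   * Integer-only threshold. Dividing before multiplying keeps the product
   * within int range for any percentage in [0, 100]. Without the clamp, a
   * value above 100 would overflow int.
   */
  static int sampleRangeMax(int configuredPercentage) {
    return (Integer.MAX_VALUE / 100) * clampPercentage(configuredPercentage);
  }

  /**
   * Assumed usage: draw a value in [0, Integer.MAX_VALUE) and sample the
   * read when the draw falls below the threshold.
   */
  static boolean shouldSample(int sampleRangeMax) {
    return ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE) < sampleRangeMax;
  }

  public static void main(String[] args) {
    System.out.println(sampleRangeMax(10));   // in-range value
    System.out.println(sampleRangeMax(250));  // clamped to 100
    System.out.println(sampleRangeMax(-5));   // clamped to 0
  }
}
```

One trade-off of dividing first: because Integer.MAX_VALUE / 100 truncates, a setting of 100 yields a threshold slightly below Integer.MAX_VALUE, so a very small fraction of reads would still go unsampled under the assumed shouldSample check.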
> Maintain Short-Circuit Read Statistics
> --------------------------------------
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch
>
>
> If disk or controller hardware is faulty then short-circuit read requests
> can stall indefinitely while reading from the file descriptor. Currently
> there is no way to detect when short-circuit read requests are slow or
> blocked.
> This Jira proposes that each BlockReaderLocal maintain read statistics while
> it is active by measuring the time taken for a pre-determined fraction of
> read requests. These per-reader stats can be aggregated into global stats
> when the reader is closed. The aggregate statistics can be exposed via JMX.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)