[ https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052519#comment-16052519 ]
Arpit Agarwal commented on HDFS-11789:
--------------------------------------
Thanks for updating the patch [~hanishakoneru]. A few more comments:
# In the BlockReaderLocal constructor, {{metrics}} is passed to
BlockReaderIoProvider before it is potentially initialized, so the provider
can end up holding a null reference.
{code}
this.blockReaderIoProvider = new BlockReaderIoProvider(
    builder.shortCircuitConf, metrics, timer);
if (builder.shortCircuitConf.isScrMetricsEnabled()) {
  metricsInitializationLock.lock();
  if (!isMetricsEnabled) {
    metrics = BlockReaderLocalMetrics.create();
    isMetricsEnabled = true;
  }
{code}
# Typo - {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}} (presumably meant
to be {{SHORT_CIRCUIT_READ_LATENCY_METRIC_REGISTERED_NAME}}).
# Looks like we can eliminate {{BlockReaderLocal#isMetricsEnabled}} and
instead just check whether {{metrics}} is null.
# Thanks for adding TestBlockReaderIoProvider. One more related test - ensure
{{addShortCircuitReadLatency}} is not invoked when the read takes less than the
threshold.
# Also, I think we can make BlockReaderIoProvider a static utility class
since it doesn't maintain any mutable state. But the current approach is also
fine since the allocated object is lightweight.
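For comments 1 and 3, a minimal sketch of the ordering being suggested: resolve {{metrics}} first, then hand it to the provider, and let a null reference stand in for the separate boolean flag. The classes below ({{SketchMetrics}}, {{SketchIoProvider}}, {{SketchReader}}) are hypothetical stand-ins, not the classes in the patch, and the sketch assumes the metrics object is a static field shared across readers.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for BlockReaderLocalMetrics; not the Hadoop class. */
final class SketchMetrics {
  private final AtomicLong samples = new AtomicLong();

  static SketchMetrics create() {
    return new SketchMetrics();
  }

  void addShortCircuitReadLatency(long latencyMs) {
    samples.incrementAndGet();
  }

  long sampleCount() {
    return samples.get();
  }
}

/** Hypothetical stand-in for BlockReaderIoProvider. */
final class SketchIoProvider {
  // The reference is captured once, at construction time, so it must
  // already be initialized by the caller.
  private final SketchMetrics metrics;

  SketchIoProvider(SketchMetrics metrics) {
    this.metrics = metrics;
  }

  boolean hasMetrics() {
    return metrics != null;
  }
}

/** Hypothetical stand-in for BlockReaderLocal. */
final class SketchReader {
  private static final ReentrantLock METRICS_INIT_LOCK = new ReentrantLock();

  // Assumption: shared across readers. A null value means "not created yet",
  // which replaces a separate isMetricsEnabled flag.
  private static SketchMetrics metrics;

  final SketchIoProvider ioProvider;

  SketchReader(boolean scrMetricsEnabled) {
    SketchMetrics resolved = null;
    if (scrMetricsEnabled) {
      METRICS_INIT_LOCK.lock();
      try {
        if (metrics == null) {
          metrics = SketchMetrics.create();
        }
        resolved = metrics;
      } finally {
        // Always release the lock, even if create() throws.
        METRICS_INIT_LOCK.unlock();
      }
    }
    // The provider is built only after metrics has been resolved.
    this.ioProvider = new SketchIoProvider(resolved);
  }
}
```

With this ordering, {{new SketchReader(true).ioProvider.hasMetrics()}} is true, while a reader created with metrics disabled hands the provider a null reference.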
TestBlockReaderLocalMetrics is a nicely written unit test!
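The extra test suggested in comment 4 could look roughly like the sketch below. It uses a manually advanced clock so the test never sleeps. Everything here is hypothetical: {{CountingMetrics}}, {{FakeTimer}}, {{ThresholdIoProvider}} and {{timedRead}} are illustrative names, and the sketch assumes a read counts as slow only when its latency is strictly above the threshold.

```java
import java.util.function.LongSupplier;

/** Hypothetical metrics sink that only counts invocations. */
final class CountingMetrics {
  int latencyCalls = 0;

  void addShortCircuitReadLatency(long latencyMs) {
    latencyCalls++;
  }
}

/** Hypothetical clock that moves only when the test advances it. */
final class FakeTimer implements LongSupplier {
  private long nowMs = 0;

  void advance(long ms) {
    nowMs += ms;
  }

  @Override
  public long getAsLong() {
    return nowMs;
  }
}

/** Hypothetical provider: times a read and reports it only if it was slow. */
final class ThresholdIoProvider {
  private final CountingMetrics metrics;
  private final LongSupplier timer;
  private final long slowReadThresholdMs;

  ThresholdIoProvider(CountingMetrics metrics, LongSupplier timer,
      long slowReadThresholdMs) {
    this.metrics = metrics;
    this.timer = timer;
    this.slowReadThresholdMs = slowReadThresholdMs;
  }

  void timedRead(Runnable read) {
    long beginMs = timer.getAsLong();
    read.run();
    long latencyMs = timer.getAsLong() - beginMs;
    // Assumption: strictly greater than the threshold counts as slow.
    if (metrics != null && latencyMs > slowReadThresholdMs) {
      metrics.addShortCircuitReadLatency(latencyMs);
    }
  }
}
```

A test would then run one read that advances the fake clock by less than the threshold and check that {{latencyCalls}} is still 0, followed by one read above the threshold to confirm the counter moves to 1.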
> Maintain Short-Circuit Read Statistics
> --------------------------------------
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch, HDFS-11789.002.patch,
> HDFS-11789.003.patch
>
>
> If disk or controller hardware is faulty, short-circuit read requests
> can stall indefinitely while reading from the file descriptor. Currently
> there is no way to detect when short-circuit read requests are slow or
> blocked.
> This Jira proposes that each BlockReaderLocal maintain read statistics while
> it is active by measuring the time taken for a pre-determined fraction of
> read requests. These per-reader stats can be aggregated into global stats
> when the reader is closed. The aggregate statistics can be exposed via JMX.