[
https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048517#comment-16048517
]
Arpit Agarwal edited comment on HDFS-11789 at 6/13/17 11:37 PM:
----------------------------------------------------------------
Thanks for updating the patch [~hanishakoneru]. A few comments:
# Typos in {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}}.
# The test-only method {{getShortCircuitReadRollingAverages}} should be tagged
with {{\@VisibleForTesting}}. Also it should be package-private if possible.
# I think we can eliminate the METRICS_ENABLED config key. We can enable the
metric when the sampling percentage is > 0.
# Let's avoid static initialization of {{BlockReaderLocal#metrics}}. One issue
is that adding the overhead of a MutableRollingAverages object (and its rates
roller thread) for all clients, most of which may will never enable SCR
statistics. I think your original approach of initializing it within a lock on
construction was fine, the overhead of that lock compared to the overhead of
cloning a file descriptor via system calls should be minimal.
Still reviewing the tests.
was (Author: arpitagarwal):
Thanks for updating the patch [~hanishakoneru]. A few comments:
# Typos in {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}}.
# The test-only method {{getShortCircuitReadRollingAverages}} should be tagged
with {{\@VisibleForTesting}}. Also it should be package-private if possible.
# I think we can eliminate the METRICS_SAMPLING_PERCENTAGE_KEY config key. We
can enable the metric when the sampling percentage is > 0.
# Let's avoid static initialization of {{BlockReaderLocal#metrics}}. One issue
is that adding the overhead of a MutableRollingAverages object (and its rates
roller thread) for all clients, most of which may will never enable SCR
statistics. I think your original approach of initializing it within a lock on
construction was fine, the overhead of that lock compared to the overhead of
cloning a file descriptor via system calls should be minimal.
Still reviewing the tests.
> Maintain Short-Circuit Read Statistics
> --------------------------------------
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch, HDFS-11789.002.patch
>
>
> If a disk or controller hardware is faulty then short-circuit read requests
> can stall indefinitely while reading from the file descriptor. Currently
> there is no way to detect when short-circuit read requests are slow or
> blocked.
> This Jira proposes that each BlockReaderLocal maintain read statistics while
> it is active by measuring the time taken for a pre-determined fraction of
> read requests. These per-reader stats can be aggregated into global stats
> when the reader is closed. The aggregate statistics can be exposed via JMX.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]