[
https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463720#comment-15463720
]
Chris Douglas commented on HDFS-10742:
--------------------------------------
bq. Removed per-method measurement and aggregate stats to minimize overhead for
cheap operations
Recording stack traces, etc. may be useful. I raised those overheads for two
reasons. First, that diagnostic data should only be built when necessary.
Previous versions of the patch generated diagnostic data that was ultimately
discarded. Second, minutiae like "GC-friendliness", whether a wrapper class on
{{Lock}} adds dispatch overhead, class loading overhead, etc. when this JIRA
logs critical sections longer than *300ms* are irrelevant, literally by orders
of magnitude.
I'll ask again after the motivation. Is the intent to identify subsystems that
do too much work (and/or I/O) in a critical section? To measure where the DN is
spending cycles? In some of these cases, stack traces could be very helpful. As
long as this telemetry can be tuned by configuration, it shouldn't create new
problems for operators. To that point, instead of reusing the {{FsDatasetImpl}}
log, this may want to use its own (so disabling/redirecting it doesn't affect
other, critical information written to that log).
bq. Was your concern was with exposing AutoCloseableLock in hadoop utils or
just an overall objection to this idiom even if it's not exposed outside the DN.
I have two objections, one to the idiom and one to the design.
{{AutoCloseableLock}} is a new abstraction that doesn't elide any details of
the thing it obscures. Every path for {{Lock}} objects is possible, so it
doesn't make it easier to reason about lock correctness. It creates a special
type for the _simplest_ possible case. This case is so simple, we have
automated tooling to detect it. Further, instead of every Java developer
looking at a {{Lock}} and scanning for well-worn expectations, we introduce an
abstraction that requires investigation. No advantage beyond developer
convenience has been raised, and it is a demonstrably _inconvenient_
abstraction.
If the interface were limited so it could _only_ be used in the resource idiom
(as [proposed|https://s.apache.org/Eyaw] earlier), then a developer could know
that the {{Lock}} acquisition was limited to scope. As semantic sugar it would
still be pointless, because Java programmers also know how to use
{{synchronized}}.
This could be redeemed by changing the design. If the {{AutoCloseableLock}}
constructor accepted a {{Lock}} instance and cut its API to scope-limited
changes, then telemetry like this could be added as implementations of the
{{Lock}} type with all the perceived advantages of ACL. Devs can reason about
the simpler API of ACL (scope-based acquisition), but still monitor all
acquisitions as required by this JIRA (impractical to implement using
{{synchronized}}).
I remain skeptical of {{AutoCloseableLock}}, since one could just use the
{{Lock}} in interface and existing tooling to implement the same. But if its
virtues are obvious to everyone but me I won't stand in the way of including it.
> Measurement of lock held time in FsDatasetImpl
> ----------------------------------------------
>
> Key: HDFS-10742
> URL: https://issues.apache.org/jira/browse/HDFS-10742
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.0.0-alpha2
> Reporter: Chen Liang
> Assignee: Chen Liang
> Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch,
> HDFS-10742.003.patch, HDFS-10742.004.patch, HDFS-10742.005.patch,
> HDFS-10742.006.patch, HDFS-10742.007.patch, HDFS-10742.008.patch,
> HDFS-10742.009.patch, HDFS-10742.010.patch, HDFS-10742.011.patch
>
>
> This JIRA proposes to measure the time the of lock of {{FsDatasetImpl}} is
> held by a thread. Doing so will allow us to measure lock statistics.
> This can be done by extending the {{AutoCloseableLock}} lock object in
> {{FsDatasetImpl}}. In the future we can also consider replacing the lock with
> a read-write lock.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]