SiyaoIsHiding opened a new pull request, #1916:
URL: https://github.com/apache/cassandra-java-driver/pull/1916
The memory leak comes from Micrometer gauge initialization.
I tested with the following `application.conf`, which enables all node- and
session-level metrics, and the memory leak is gone.
```
datastax-java-driver.advanced.metrics {
  session.enabled = [
    # The number and rate of bytes sent for the entire session (exposed as a Meter).
    bytes-sent,
    # The number and rate of bytes received for the entire session (exposed as a Meter).
    bytes-received,
    # The number of nodes to which the driver has at least one active connection (exposed as a
    # Gauge<Integer>).
    connected-nodes,
    # The throughput and latency percentiles of CQL requests (exposed as a Timer).
    #
    # This corresponds to the overall duration of the session.execute() call, including any
    # retry.
    cql-requests,
    # The number of CQL requests that timed out -- that is, the session.execute() call failed
    # with a DriverTimeoutException (exposed as a Counter).
    cql-client-timeouts,
    # The size of the driver-side cache of CQL prepared statements.
    #
    # The cache uses weak values eviction, so this represents the number of PreparedStatement
    # instances that your application has created, and is still holding a reference to. Note
    # that the returned value is approximate.
    cql-prepared-cache-size,
    # How long requests are being throttled (exposed as a Timer).
    #
    # This is the time between the start of the session.execute() call, and the moment when
    # the throttler allows the request to proceed.
    throttling.delay,
    # The size of the throttling queue (exposed as a Gauge<Integer>).
    #
    # This is the number of requests that the throttler is currently delaying in order to
    # preserve its SLA. This metric only works with the built-in concurrency- and rate-based
    # throttlers; in other cases, it will always be 0.
    throttling.queue-size,
    # The number of times a request was rejected with a RequestThrottlingException (exposed as
    # a Counter)
    throttling.errors,
    # The throughput and latency percentiles of DSE continuous CQL requests (exposed as a
    # Timer).
    #
    # This metric is a session-level metric and corresponds to the overall duration of the
    # session.executeContinuously() call, including any retry.
    #
    # Note that this metric is analogous to the OSS driver's 'cql-requests' metrics, but for
    # continuous paging requests only. Continuous paging requests do not update the
    # 'cql-requests' metric, because they are usually much longer. Only the following metrics
    # are updated during a continuous paging request:
    #
    # - At node level: all the usual metrics available for normal CQL requests, such as
    #   'cql-messages' and error-related metrics (but these are only updated for the first
    #   page of results);
    # - At session level: only 'continuous-cql-requests' is updated (this metric).
    continuous-cql-requests,
    # The throughput and latency percentiles of Graph requests (exposed as a Timer).
    #
    # This metric is a session-level metric and corresponds to the overall duration of the
    # session.execute(GraphStatement) call, including any retry.
    graph-requests,
    # The number of graph requests that timed out -- that is, the
    # session.execute(GraphStatement) call failed with a DriverTimeoutException (exposed as a
    # Counter).
    #
    # Note that this metric is analogous to the OSS driver's 'cql-client-timeouts' metrics, but
    # for Graph requests only.
    graph-client-timeouts
  ]
  node.enabled = [
    # The number of connections open to this node for regular requests (exposed as a
    # Gauge<Integer>).
    #
    # This includes the control connection (which uses at most one extra connection to a
    # random node in the cluster).
    pool.open-connections,
    # The number of stream ids available on the connections to this node (exposed as a
    # Gauge<Integer>).
    #
    # Stream ids are used to multiplex requests on each connection, so this is an indication
    # of how many more requests the node could handle concurrently before becoming saturated
    # (note that this is a driver-side only consideration, there might be other limitations on
    # the server that prevent reaching that theoretical limit).
    pool.available-streams,
    # The number of requests currently executing on the connections to this node (exposed as a
    # Gauge<Integer>). This includes orphaned streams.
    pool.in-flight,
    # The number of "orphaned" stream ids on the connections to this node (exposed as a
    # Gauge<Integer>).
    #
    # See the description of the connection.max-orphan-requests option for more details.
    pool.orphaned-streams,
    # The number and rate of bytes sent to this node (exposed as a Meter).
    bytes-sent,
    # The number and rate of bytes received from this node (exposed as a Meter).
    bytes-received,
    # The throughput and latency percentiles of individual CQL messages sent to this node as
    # part of an overall request (exposed as a Timer).
    #
    # Note that this does not necessarily correspond to the overall duration of the
    # session.execute() call, since the driver might query multiple nodes because of retries
    # and speculative executions. Therefore a single "request" (as seen from a client of the
    # driver) can be composed of more than one of the "messages" measured by this metric.
    #
    # Therefore this metric is intended as an insight into the performance of this particular
    # node. For statistics on overall request completion, use the session-level cql-requests.
    cql-messages,
    # The number of times the driver failed to send a request to this node (exposed as a
    # Counter).
    #
    # In those cases we know the request didn't even reach the coordinator, so they are retried
    # on the next node automatically (without going through the retry policy).
    errors.request.unsent,
    # The number of times a request was aborted before the driver even received a response
    # from this node (exposed as a Counter).
    #
    # This can happen in two cases: if the connection was closed due to an external event
    # (such as a network error or heartbeat failure); or if there was an unexpected error
    # while decoding the response (this can only be a driver bug).
    errors.request.aborted,
    # The number of times this node replied with a WRITE_TIMEOUT error (exposed as a Counter).
    #
    # Whether this error is rethrown directly to the client, retried or ignored is determined
    # by the RetryPolicy.
    errors.request.write-timeouts,
    # The number of times this node replied with a READ_TIMEOUT error (exposed as a Counter).
    #
    # Whether this error is rethrown directly to the client, retried or ignored is determined
    # by the RetryPolicy.
    errors.request.read-timeouts,
    # The number of times this node replied with an UNAVAILABLE error (exposed as a Counter).
    #
    # Whether this error is rethrown directly to the client, retried or ignored is determined
    # by the RetryPolicy.
    errors.request.unavailables,
    # The number of times this node replied with an error that doesn't fall under other
    # 'errors.*' metrics (exposed as a Counter).
    errors.request.others,
    # The total number of errors on this node that caused the RetryPolicy to trigger a retry
    # (exposed as a Counter).
    #
    # This is a sum of all the other retries.* metrics.
    retries.total,
    # The number of errors on this node that caused the RetryPolicy to trigger a retry, broken
    # down by error type (exposed as Counters).
    retries.aborted,
    retries.read-timeout,
    retries.write-timeout,
    retries.unavailable,
    retries.other,
    # The total number of errors on this node that were ignored by the RetryPolicy (exposed as
    # a Counter).
    #
    # This is a sum of all the other ignores.* metrics.
    ignores.total,
    # The number of errors on this node that were ignored by the RetryPolicy, broken down by
    # error type (exposed as Counters).
    ignores.aborted,
    ignores.read-timeout,
    ignores.write-timeout,
    ignores.unavailable,
    ignores.other,
    # The number of speculative executions triggered by a slow response from this node
    # (exposed as a Counter).
    speculative-executions,
    # The number of errors encountered while trying to establish a connection to this node
    # (exposed as a Counter).
    #
    # Connection errors are not a fatal issue for the driver; failed connections will be
    # retried periodically according to the reconnection policy. You can choose whether or not
    # to log those errors at WARN level with the connection.warn-on-init-error option.
    #
    # Authentication errors are not included in this counter; they are tracked separately in
    # errors.connection.auth.
    errors.connection.init,
    # The number of authentication errors encountered while trying to establish a connection
    # to this node (exposed as a Counter).
    # Authentication errors are also logged at WARN level.
    errors.connection.auth,
    # The throughput and latency percentiles of individual graph messages sent to this node as
    # part of an overall request (exposed as a Timer).
    #
    # Note that this does not necessarily correspond to the overall duration of the
    # session.execute() call, since the driver might query multiple nodes because of retries
    # and speculative executions. Therefore a single "request" (as seen from a client of the
    # driver) can be composed of more than one of the "messages" measured by this metric.
    #
    # Therefore this metric is intended as an insight into the performance of this particular
    # node. For statistics on overall request completion, use the session-level graph-requests.
    graph-messages,
  ]
  factory.class = MicrometerMetricsFactory
}
```
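
The description above attributes the leak to gauge initialization but does not spell out the mechanism. As a rough mental model only, here is a minimal Java sketch of one common way a gauge registration can keep objects alive: the registry stores a supplier that captures its owner, and nothing removes the entry on close. `ToyRegistry`, `ToySession` and `gaugesLeftAfter` are hypothetical names invented for this illustration; they are not Micrometer or driver APIs, and this may not be the exact cause fixed in this PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class GaugeLeakSketch {

  /** Hypothetical stand-in for a metrics registry; NOT Micrometer's API. */
  static final class ToyRegistry {
    private final Map<String, Supplier<Number>> gauges = new ConcurrentHashMap<>();

    void registerGauge(String name, Supplier<Number> supplier) {
      gauges.put(name, supplier);
    }

    void remove(String name) {
      gauges.remove(name);
    }

    int size() {
      return gauges.size();
    }
  }

  /** Hypothetical stand-in for a session that registers one gauge when it is created. */
  static final class ToySession implements AutoCloseable {
    private final String gaugeName;
    private final ToyRegistry registry;
    private final boolean unregisterOnClose;
    private final byte[] state = new byte[1024];

    ToySession(int id, ToyRegistry registry, boolean unregisterOnClose) {
      this.gaugeName = "session-" + id + ".connected-nodes";
      this.registry = registry;
      this.unregisterOnClose = unregisterOnClose;
      // The lambda reads an instance field, so it captures `this`: as long as the
      // registry keeps the supplier, the whole session stays strongly reachable.
      registry.registerGauge(gaugeName, () -> state.length);
    }

    @Override
    public void close() {
      if (unregisterOnClose) {
        registry.remove(gaugeName);
      }
    }
  }

  /** Opens and closes `sessions` toy sessions; returns how many gauges the registry still holds. */
  static int gaugesLeftAfter(int sessions, boolean unregisterOnClose) {
    ToyRegistry registry = new ToyRegistry();
    for (int i = 0; i < sessions; i++) {
      try (ToySession s = new ToySession(i, registry, unregisterOnClose)) {
        // A real application would run queries here.
      }
    }
    return registry.size();
  }

  public static void main(String[] args) {
    System.out.println("without cleanup: " + gaugesLeftAfter(100, false)); // prints 100
    System.out.println("with cleanup: " + gaugesLeftAfter(100, true)); // prints 0
  }
}
```

In the toy model, every entry left in the registry pins one closed session and its state, which is the kind of growth a heap dump would show when all gauges in the configuration above are enabled.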