[
https://issues.apache.org/jira/browse/CASSSIDECAR-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18028998#comment-18028998
]
Stefan Miklosovic commented on CASSSIDECAR-354:
-----------------------------------------------
[~frankgh] any insights?
> cassinstancesdown/up metrics not updating when instances go down
> ----------------------------------------------------------------
>
> Key: CASSSIDECAR-354
> URL: https://issues.apache.org/jira/browse/CASSSIDECAR-354
> Project: Sidecar for Apache Cassandra
> Issue Type: Bug
> Components: Observability
> Reporter: Carl Sandland
> Priority: Major
>
> When stopping a Cassandra instance, Sidecar is not updating these metrics
> correctly: the onFailure() block that does the updates is never called,
> because exceptions are swallowed in the delegate. Swallowing exceptions
> doesn't seem to work well with promise chains.
> My expectations were:
> Assume a simple sidecar config with one attached cassandra instance, all
> started up and running happily: cassinstancesdown = 0, cassinstancesup = 1.
> Then manually stop cassandra: cassinstancesdown = 1, cassinstancesup = 0.
> Instead, I was seeing a constant cassinstancesdown = 0, cassinstancesup = 1.
> Specifically, the code here:
> {code:java}
> private Future<Void> healthCheck(InstanceMetadata instanceMetadata,
>                                  AtomicInteger instanceDown)
> {
>     return internalPool
>            .runBlocking(() -> instanceMetadata.delegate().healthCheck(),
>                         false)
>            .onFailure(cause -> {
>                instanceDown.incrementAndGet();
>                LOGGER.error("Unable to complete health check on instance={}",
>                             instanceMetadata.id(), cause);
>            });
> }
> {code}
> The metric is updated in the onFailure() handler, yet the exceptions that
> would trigger a failure (such as not being able to connect) are swallowed by
> the delegate's (CassandraAdapterDelegate) healthCheck() call.
> I experimented by re-throwing the exceptions in the delegate, and the metric
> started tracking correctly. There is quite a lot of state change in the
> delegate's exception handlers, so I didn't feel comfortable 'throwing' a
> simplistic PR out.
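
For illustration only, a minimal sketch of the failure mode described above. It is not Sidecar code: it uses java.util.concurrent.CompletableFuture as a stand-in for the Future chain in the quoted snippet, and every name in it (SwallowedFailureDemo, swallowingHealthCheck, rethrowingHealthCheck, countDown) is hypothetical. It only shows that a failure callback cannot fire when the task catches its own exception.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class SwallowedFailureDemo {
    // Hypothetical delegate that catches its own exception, so the
    // asynchronous task completes normally from the caller's point of view.
    static void swallowingHealthCheck() {
        try {
            throw new IllegalStateException("connection refused");
        } catch (RuntimeException e) {
            // State would be updated here; the exception never leaves this method.
        }
    }

    // Hypothetical delegate that lets the exception propagate to the caller.
    static void rethrowingHealthCheck() {
        throw new IllegalStateException("connection refused");
    }

    // Runs the check asynchronously and counts a failure only when the
    // future completes exceptionally (the analogue of onFailure()).
    static int countDown(Runnable healthCheck) {
        AtomicInteger instanceDown = new AtomicInteger();
        CompletableFuture.runAsync(healthCheck)
            .whenComplete((ignored, cause) -> {
                if (cause != null) {
                    instanceDown.incrementAndGet();
                }
            })
            .exceptionally(cause -> null) // recover so join() does not throw
            .join();
        return instanceDown.get();
    }

    public static void main(String[] args) {
        System.out.println("swallowing: down=" + countDown(SwallowedFailureDemo::swallowingHealthCheck));
        System.out.println("rethrowing: down=" + countDown(SwallowedFailureDemo::rethrowingHealthCheck));
    }
}
```

Under these assumptions the swallowing variant reports down=0 and the rethrowing variant reports down=1, which matches the behaviour observed in the report. Whether re-throwing is the right fix for the real delegate depends on the state handling mentioned above.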