[
https://issues.apache.org/jira/browse/CASSANDRA-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752279#comment-17752279
]
Jacek Lewandowski commented on CASSANDRA-18580:
-----------------------------------------------
I've added cache metrics for {{AccordStateCache}}, both global and
per-instance, so they apply individually to {{Command}} and {{CommandsForKey}}.
I also added a simple meter for progress log size. I have a question about what
exactly is needed for "Size of CommandsForKey": do you need a histogram of the
serialized size of loaded objects? Should that be recorded in
{{AccordStateCache}}?
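To make the question concrete, here is a minimal sketch of what a per-instance stats holder with a serialized-size histogram might look like. It is an illustration under stated assumptions, not the actual {{AccordStateCache}} code: the class name {{CacheStatsSketch}}, its methods, and the power-of-two bucketing are all hypothetical, and it assumes the serialized size is known at load time (i.e. on a cache miss).

```java
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class CacheStatsSketch {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();
    // Bucket i counts loaded objects whose serialized size is in [2^i, 2^(i+1));
    // sizes of 0 or 1 byte (and any negative input) fall into bucket 0.
    private final AtomicLongArray sizeBuckets = new AtomicLongArray(64);

    void recordHit() {
        hits.increment();
    }

    // Assumption: a miss triggers a load, and the serialized size is known then.
    void recordMiss(long serializedSizeBytes) {
        misses.increment();
        sizeBuckets.incrementAndGet(bucketFor(serializedSizeBytes));
    }

    static int bucketFor(long sizeBytes) {
        if (sizeBytes <= 1) return 0;
        // Index of the highest set bit, i.e. floor(log2(sizeBytes)).
        return 63 - Long.numberOfLeadingZeros(sizeBytes);
    }

    double hitRate() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }

    long loadedInBucket(int bucket) {
        return sizeBuckets.get(bucket);
    }
}
```

One instance per cached type ({{Command}}, {{CommandsForKey}}) plus one shared global instance would match the global/per-instance split described above; whether the histogram belongs in the cache or elsewhere is exactly the open question.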
> Baseline Metrics for Accord Transactions
> ----------------------------------------
>
> Key: CASSANDRA-18580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18580
> Project: Cassandra
> Issue Type: Improvement
> Components: Accord, Observability/JMX, Observability/Metrics
> Reporter: Caleb Rackliffe
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Based on some conversations w/ [~benedict] and [~dcapwell], this is the
> initial set of metrics that seem both feasible to implement and useful as we
> monitor the health of a cluster performing Accord transactions:
> 1.) Basic latency metrics for transactions up to the point of COMMIT and rate
> metrics for preemption, failure, and timeouts at the coordinator.
> This has already been implemented and split into read and write-specific
> metrics. Our position for now is that metrics around preemption should be
> useful in place of a more difficult-to-define metric around how many
> transactions are completed via recovery.
> 2.) Global cache stats/metrics (i.e. aggregated for all command stores)
> We could, at some point, build metrics scoped to a specific {{CommandStore}},
> but they might be awkward in MBean/JMX space, as command stores would have to
> be identified by ID or key range…the latter possibly being able to change
> across epochs. (An alternative would be just publishing command
> store-specific stats on-demand to a virtual table instead.)
> 3.) Something like a decaying histogram of the number of dependencies per
> transaction (or per partial transaction).
> If this is getting worse over time, that would be useful to know and could
> help us detect that contention is increasing. We should be able to hook this up
> to {{ProgressLog}} notifications. Recording for PartialDeps/PartialTxn (which
> ProgressLog gives us at pre-accept) seems acceptable, given this is a
> directional metric.
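
For illustration, item 3 could be sketched roughly as below: a histogram of dependency counts whose bucket weights decay exponentially, so that recent transactions dominate. This is a simplified stand-in written for this note, not the reservoir Cassandra actually uses and not the {{ProgressLog}} API; the class name {{DecayingDepsHistogramSketch}}, the half-life parameter, and the explicit {{nowSeconds}} argument (passed in to keep the example deterministic) are all assumptions.

```java
// Hypothetical illustration only; none of these names exist in Cassandra or Accord.
final class DecayingDepsHistogramSketch {
    // buckets[i] holds the decayed weight of transactions with i dependencies;
    // the last bucket collects everything at or above maxDeps.
    private final double[] buckets;
    private final double halfLifeSeconds;
    private long lastDecaySeconds;

    DecayingDepsHistogramSketch(int maxDeps, double halfLifeSeconds, long nowSeconds) {
        this.buckets = new double[maxDeps + 1];
        this.halfLifeSeconds = halfLifeSeconds;
        this.lastDecaySeconds = nowSeconds;
    }

    // Assumption: called once per (partial) transaction, e.g. from a pre-accept hook.
    synchronized void record(int depCount, long nowSeconds) {
        decayTo(nowSeconds);
        int idx = Math.min(Math.max(depCount, 0), buckets.length - 1);
        buckets[idx] += 1.0;
    }

    // Mean dependency count, weighted toward recent observations.
    synchronized double weightedMean(long nowSeconds) {
        decayTo(nowSeconds);
        double total = 0.0;
        double sum = 0.0;
        for (int i = 0; i < buckets.length; i++) {
            total += buckets[i];
            sum += i * buckets[i];
        }
        return total == 0.0 ? 0.0 : sum / total;
    }

    // Halve every bucket's weight for each half-life that has elapsed.
    private void decayTo(long nowSeconds) {
        long elapsed = nowSeconds - lastDecaySeconds;
        if (elapsed <= 0) return;
        double factor = Math.pow(0.5, elapsed / halfLifeSeconds);
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] *= factor;
        }
        lastDecaySeconds = nowSeconds;
    }
}
```

A rising weighted mean over time would be the directional signal described above; because older observations fade, the value tracks current contention rather than the lifetime average.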
--
This message was sent by Atlassian Jira
(v8.20.10#820010)