[jira] [Commented] (CASSANDRA-18580) Baseline Metrics for Accord Transactions

Jacek Lewandowski (Jira) Tue, 08 Aug 2023 23:31:04 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752279#comment-17752279
 ]


Jacek Lewandowski commented on CASSANDRA-18580:
-----------------------------------------------

I've added cache metrics for {{AccordStateCache}}, both global and 
per-instance, so they are reported separately for {{Command}} and 
{{CommandsForKey}}. I also added a simple meter for progress log size. What 
exactly is needed for "Size of CommandsForKey"? Do you need a histogram of the 
serialized size of loaded objects? Should that be recorded in 
{{AccordStateCache}}?
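
To illustrate what "global and per-instance" cache metrics could look like, here is a minimal sketch using only the JDK. It is not the actual {{AccordStateCache}} code: the class {{CacheStatsSketch}}, its methods, and the use of plain string names for instances are all assumptions made for illustration. Each lookup is counted twice, once in the scope of the instance that served it and once in the global aggregate.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; not the real AccordStateCache metrics implementation. */
public class CacheStatsSketch {

    /** Hit/miss counters for a single scope (one instance, or the global aggregate). */
    public static final class Stats {
        private final LongAdder hits = new LongAdder();
        private final LongAdder misses = new LongAdder();

        public long hits() { return hits.sum(); }
        public long misses() { return misses.sum(); }
        public long requests() { return hits() + misses(); }

        /** Fraction of requests served from the cache; 0.0 when nothing was recorded. */
        public double hitRate() {
            long total = requests();
            return total == 0 ? 0.0 : (double) hits() / total;
        }
    }

    private final Stats global = new Stats();
    private final Map<String, Stats> perInstance = new ConcurrentHashMap<>();

    /** Records one lookup against both the named instance and the global scope. */
    public void record(String instance, boolean hit) {
        Stats scoped = perInstance.computeIfAbsent(instance, k -> new Stats());
        (hit ? scoped.hits : scoped.misses).increment();
        (hit ? global.hits : global.misses).increment();
    }

    public Stats global() { return global; }

    /** Returns the counters for one instance, creating an empty scope if needed. */
    public Stats instance(String name) {
        return perInstance.computeIfAbsent(name, k -> new Stats());
    }
}
```

With this shape, a call such as {{record("CommandsForKey", true)}} updates both scopes, so the global numbers are always the sum of the per-instance ones.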


> Baseline Metrics for Accord Transactions
> ----------------------------------------
>
>                 Key: CASSANDRA-18580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18580
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Accord, Observability/JMX, Observability/Metrics
>            Reporter: Caleb Rackliffe
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Based on some conversations w/ [~benedict] and [~dcapwell], this is the 
> initial set of metrics that seem both feasible to implement and useful as we 
> monitor the health of a cluster performing Accord transactions:
> 1.) Basic latency metrics for transactions up to the point of COMMIT and rate 
> metrics for preemption, failure, and timeouts at the coordinator.
> This has already been implemented and split into read and write-specific 
> metrics. Our position for now is that metrics around preemption should be 
> useful in place of a more difficult-to-define metric around how many 
> transactions are completed via recovery.
> 2.) Global cache stats/metrics (i.e. aggregated for all command stores)
> We could, at some point, build metrics scoped to a specific {{CommandStore}}, 
> but they might be awkward in MBean/JMX space, as command stores would have to 
> be identified by ID or key range…the latter possibly being able to change 
> across epochs. (An alternative would be just publishing command 
> store-specific stats on-demand to a virtual table instead.)
> 3.) Something like a decaying histogram of the number of dependencies per 
> transaction (or per partial transaction).
> If this number grows over time, that would be a useful signal that 
> contention is increasing. We should be able to hook this up to 
> {{ProgressLog}} notifications. Recording for PartialDeps/PartialTxn (which 
> ProgressLog gives us at pre-accept) seems acceptable, given this is a 
> directional metric.
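
The decaying histogram in item 3 could be sketched roughly as follows. This is an illustration only: the class {{DecayingDepsHistogramSketch}} and its parameters are hypothetical, and it uses a simplified per-sample exponential decay (every existing bucket weight is multiplied by a fixed factor on each update) rather than the time-based decaying reservoir a metrics library would normally provide. It shows the directional behaviour described above: recent dependency counts dominate the mean and quantiles.

```java
/** Hypothetical sketch of a decaying histogram of dependency counts per transaction. */
public class DecayingDepsHistogramSketch {
    // weights[i] is the decayed weight of samples with i dependencies;
    // the last bucket also absorbs every count above maxDeps.
    private final double[] weights;
    private final double decay;

    /**
     * @param maxDeps largest dependency count tracked in its own bucket
     * @param decay   factor in (0, 1] applied to all old samples on each update
     */
    public DecayingDepsHistogramSketch(int maxDeps, double decay) {
        if (maxDeps < 1 || decay <= 0.0 || decay > 1.0)
            throw new IllegalArgumentException("maxDeps must be >= 1 and decay in (0, 1]");
        this.weights = new double[maxDeps + 1];
        this.decay = decay;
    }

    /** Ages all existing samples, then adds the new one with weight 1. */
    public synchronized void record(int depCount) {
        for (int i = 0; i < weights.length; i++)
            weights[i] *= decay;
        int bucket = Math.min(Math.max(depCount, 0), weights.length - 1);
        weights[bucket] += 1.0;
    }

    /** Weighted mean dependency count; 0.0 when nothing was recorded. */
    public synchronized double mean() {
        double total = 0.0, weighted = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            weighted += i * weights[i];
        }
        return total == 0.0 ? 0.0 : weighted / total;
    }

    /** Smallest bucket whose cumulative weight reaches fraction q of the total. */
    public synchronized int quantile(double q) {
        double total = 0.0;
        for (double w : weights)
            total += w;
        if (total == 0.0)
            return 0;
        double target = q * total, cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (cumulative >= target)
                return i;
        }
        return weights.length - 1;
    }
}
```

In this sketch, {{record}} would be called from wherever the dependency count becomes known, for example a progress-log notification at pre-accept; that wiring is not shown and would depend on the real {{ProgressLog}} interface.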



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

