Caleb Rackliffe created CASSANDRA-16701:
-------------------------------------------
Summary: Data Points for the CommitLog's WaitingOnCommit Metric Should Describe Single Mutations
Key: CASSANDRA-16701
URL: https://issues.apache.org/jira/browse/CASSANDRA-16701
Project: Cassandra
Issue Type: Improvement
Components: Local/Commit Log, Observability/JMX
Reporter: Caleb Rackliffe
The metrics we have around the {{CommitLog}} are less useful than they could be
when investigating the performance of local writes.
1.) We have no way to know how long the actual flush to disk takes in
isolation, i.e. separate from the signaling apparatus between mutation threads
and the sync thread. We should add a metric for this.
2.) The {{WaitingOnCommit}} metric can have multiple data points recorded for a
single mutation, which is awkward when we’re trying to break down the latency of
a local write (total time for CL add + Memtable put, etc.). More specifically, a
mutation thread waits for the sync thread to catch up to the position of its
mutation, but it can be woken by a sync that has not yet reached that position,
in which case it waits again. A new data point is recorded for the metric each
time this happens. We should move the scope of metric recording up a level so
that there is a 1-1 relationship between it and {{WriteLatency}} in
{{TableMetrics}} (which covers row cache updates and the Memtable put).
{noformat}
void waitForSync(int position, Timer waitingOnCommit)
{
    while (lastSyncedOffset < position)
    {
        // A new timer context is started on every pass through this loop,
        // so a single mutation can record more than one data point.
        WaitQueue.Signal signal = waitingOnCommit != null ?
                                  syncComplete.register(waitingOnCommit.time()) :
                                  syncComplete.register();
        if (lastSyncedOffset < position)
            signal.awaitUninterruptibly();
        else
            signal.cancel();
    }
}
{noformat}
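To make the proposed change in scope concrete, here is a minimal, self-contained sketch. It is not the Cassandra implementation: {{RecordingTimer}}, {{FakeSegment}}, {{awaitNextSync}}, {{waitForSyncPerWakeup}} and {{waitForSyncPerMutation}} are hypothetical stand-ins, and the "sync thread" is simulated by advancing the synced offset by a fixed step on each wake-up. It only contrasts recording one data point per wake-up with recording one data point per mutation, and it assumes the per-mutation variant records a sample even when no wait turns out to be necessary.

```java
import java.util.ArrayList;
import java.util.List;

public class WaitingOnCommitSketch
{
    /** Hypothetical stand-in for a metrics timer: collects durations in nanoseconds. */
    static final class RecordingTimer
    {
        final List<Long> samples = new ArrayList<>();

        void update(long nanos)
        {
            samples.add(nanos);
        }
    }

    /** Hypothetical stand-in for a commit log segment with a simulated sync thread. */
    static final class FakeSegment
    {
        int lastSyncedOffset = 0;
        final int syncStep;

        FakeSegment(int syncStep)
        {
            this.syncStep = syncStep;
        }

        /** Simulates being woken by one sync, which advances the synced offset. */
        void awaitNextSync()
        {
            lastSyncedOffset += syncStep;
        }

        /** Shape of the behavior described in the issue: one data point per wake-up. */
        void waitForSyncPerWakeup(int position, RecordingTimer timer)
        {
            while (lastSyncedOffset < position)
            {
                long start = System.nanoTime();
                awaitNextSync();
                timer.update(System.nanoTime() - start);
            }
        }

        /** Untimed wait loop; the caller owns the measurement. */
        void waitForSync(int position)
        {
            while (lastSyncedOffset < position)
                awaitNextSync();
        }
    }

    /** Proposed shape: time the whole wait once, so each mutation yields one data point. */
    static void waitForSyncPerMutation(FakeSegment segment, int position, RecordingTimer timer)
    {
        long start = System.nanoTime();
        segment.waitForSync(position);
        timer.update(System.nanoTime() - start);
    }

    public static void main(String[] args)
    {
        RecordingTimer perWakeup = new RecordingTimer();
        new FakeSegment(10).waitForSyncPerWakeup(25, perWakeup);

        RecordingTimer perMutation = new RecordingTimer();
        waitForSyncPerMutation(new FakeSegment(10), 25, perMutation);

        // Offsets advance 0 -> 10 -> 20 -> 30, i.e. three wake-ups for position 25.
        System.out.println(perWakeup.samples.size() + " vs " + perMutation.samples.size()); // prints "3 vs 1"
    }
}
```

With the timing moved to the caller, the number of samples no longer depends on how many syncs it takes to reach the mutation's position, which is what would allow the metric to line up 1-1 with a per-write latency.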