[ https://issues.apache.org/jira/browse/CASSANDRA-17175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458607#comment-17458607 ]
Stefan Miklosovic commented on CASSANDRA-17175:
-----------------------------------------------
Thanks for the patch, [~cam1982]. I have created a PR to continue the work and
have gone through the patch at a high level. I fixed some formatting issues and
improved the clarity of the code. The patch no longer applied to the current
trunk, so I fixed that too.
[https://github.com/apache/cassandra/pull/1360/files]
Before this can be merged, we also need some (unit) tests, and I do not see
any. I'll try to come up with something if you do not write them in the
meantime.
> More detailed latency metrics
> -----------------------------
>
> Key: CASSANDRA-17175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17175
> Project: Cassandra
> Issue Type: New Feature
> Components: Observability/Metrics
> Reporter: Cameron Zemek
> Priority: Normal
> Fix For: 4.x
>
> Attachments: request_latency_metric.patch
>
>
> There is a disconnect between the latency clients experience and the latency
> reported by Cassandra. For example, read latency only measures the latency of
> the StorageProxy::readRows call.
> None of the time spent sitting in the Native Transport queue is measured.
> Neither is any of the time for writing the response back to the channel.
> Dispatcher processRequest keeps track of when it first starts processing the
> request, but as best I can tell this is only used for tracking timeouts.
> It would be useful for tracking down the cause of high client latency if
> there were more detailed Cassandra metrics around it.
> I have attached a patch that adds latency tracking higher in the call stack,
> starting the timer before the request is put into the Native Transport
> Request (NTR) executor. The patch gives 3 different metrics per Request type:
> delay - time from when the request is submitted to the NTR pool until it
> calls processRequest
> process - time spent in the Dispatcher processRequest call
> total - time from when the request is first submitted to the NTR pool until
> the response has been flushed
>
> This patch may not be the cleanest or best way of doing this, but hopefully
> it gives an idea of what I think would be a useful addition to help operators
> diagnose latency issues.
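
For illustration only, the three measurements described above (delay, process,
total) could be sketched roughly as follows. This is not the code from the
attached patch or the PR. The class RequestLatencySketch, the LatencyRecorder
helper, and the submit signature are all hypothetical names, and a plain
java.util.concurrent executor stands in for the NTR pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from Cassandra or the patch.
final class RequestLatencySketch {

    /** Minimal stand-in for a latency metric: sums nanoseconds and counts samples. */
    static final class LatencyRecorder {
        final LongAdder totalNanos = new LongAdder();
        final LongAdder count = new LongAdder();

        void record(long nanos) {
            totalNanos.add(nanos);
            count.increment();
        }
    }

    final LatencyRecorder delay = new LatencyRecorder();   // queued -> processing starts
    final LatencyRecorder process = new LatencyRecorder(); // time inside processRequest
    final LatencyRecorder total = new LatencyRecorder();   // queued -> response flushed

    /** Takes the first timestamp before the task is handed to the executor. */
    <T> Future<?> submit(ExecutorService pool, Supplier<T> processRequest, Consumer<T> flush) {
        final long enqueuedAt = System.nanoTime();
        return pool.submit(() -> {
            long startedAt = System.nanoTime();
            delay.record(startedAt - enqueuedAt);

            T response = processRequest.get();
            process.record(System.nanoTime() - startedAt);

            flush.accept(response);
            total.record(System.nanoTime() - enqueuedAt);
        });
    }

    public static void main(String[] args) throws Exception {
        RequestLatencySketch metrics = new RequestLatencySketch();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The supplier simulates request processing; the consumer simulates the flush.
            metrics.submit(pool, () -> "response", r -> { }).get();
        } finally {
            pool.shutdown();
        }
        System.out.println("delay ns:   " + metrics.delay.totalNanos.sum());
        System.out.println("process ns: " + metrics.process.totalNanos.sum());
        System.out.println("total ns:   " + metrics.total.totalNanos.sum());
    }
}
```

Because the first timestamp is taken before the task enters the executor, time
spent waiting in the queue shows up in both delay and total, which is the gap
the issue description points out in the existing read latency metric.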