[ 
https://issues.apache.org/jira/browse/CASSANDRA-17175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458607#comment-17458607
 ] 

Stefan Miklosovic commented on CASSANDRA-17175:
-----------------------------------------------

Thanks for the patch, [~cam1982]. I have created a PR for further work and 
have gone through the patch at a high level. I fixed some formatting issues 
and improved the clarity of the code. The patch also no longer applied to 
current trunk, so I fixed that too.

[https://github.com/apache/cassandra/pull/1360/files]

Before this can be merged, we also need some (unit) tests, and I did not find 
any. I'll try to come up with something if you do not write them in the 
meantime.

> More detailed latency metrics
> -----------------------------
>
>                 Key: CASSANDRA-17175
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17175
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Observability/Metrics
>            Reporter: Cameron Zemek
>            Priority: Normal
>             Fix For: 4.x
>
>         Attachments: request_latency_metric.patch
>
>
> There is a disconnect between the latency clients experience and the latency 
> reported by Cassandra. For example, read latency only measures the latency of 
> the StorageProxy::readRows call.
> None of the time spent sitting in the Native Transport queue is measured. 
> Neither is any of the time for writing the response back to the channel.
> Dispatcher processRequest keeps track of when it first starts processing the 
> request, but as best I can tell this is only used for tracking timeouts.
> It would be useful for tracking down the cause of high client latency if 
> there were more detailed Cassandra metrics around it.
> I have attached a patch that adds latency tracking higher in the call stack, 
> starting the timer before the request is put into the Native Transport 
> Request executor. The patch gives 3 different metrics per Request type:
> delay - time from when the request is submitted to the NTR pool until it 
> calls processRequest
> process - time spent in the Dispatcher processRequest call
> total - time from when the request is first submitted to the NTR pool until 
> the response has been flushed
>  
> This patch may not be the cleanest or best way of doing this, but hopefully 
> it gives an idea of what I think would be a useful addition to help 
> operators diagnose latency issues.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
