[ 
https://issues.apache.org/jira/browse/CASSANDRA-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298279#comment-15298279
 ] 

Nitsan Wakart commented on CASSANDRA-11853:
-------------------------------------------

Addressed now:
- Code style (apologies where I missed it; I try not to wholesale format for 
patches, which makes me miss the occasional bracket)
- Added dependency to build
- Added comments
- Synchronized the consumer starting point and set that as the rate limiter 
start reference point to minimize observed initial latency.

"If the rate limit is set too high, such that stress can't keep up with the 
expected rate, the results will make no sense. The actual start time will be 
way after the limiters calculated start time."
If the rate limit is too high, the results should reflect the failure IMO. 
Asking for 100K per sec when the cluster can only serve 1K would lead to a 
breached SLA in real life, and the reported latencies will reflect that. The 
results make perfect sense in that they show failure.

For clarification, the 'latency' figure reported for throttled runs is now 
response time rather than service time. The hdr log contains data for 
response/service/wait time following this definition:

Let each operation have an intended start time (Ti) (e.g. based on throttle), 
an actual start time (Ta), and an end time (Te). Then for each operation we can 
define response time (Te - Ti), service time (Te - Ta) and wait time (Ta - Ti). 
When no intended time is available (e.g. throughput run) we can set it to the 
actual start time, in which case response time is the same as service time and 
wait time is zero.
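
To make the definition concrete, here is a minimal Java sketch of the three 
figures. It is not the code from the patch: the class name OpTimings, its 
methods, and the intendedStartNanos helper are hypothetical and only 
illustrate the arithmetic, assuming nanosecond timestamps (e.g. from 
System.nanoTime()) and a fixed-rate throttle.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical holder for one operation's timestamps, all in nanoseconds.
final class OpTimings {
    final long intendedStart; // Ti: when the throttle says the op should start
    final long actualStart;   // Ta: when the op was actually issued
    final long end;           // Te: when the op completed

    OpTimings(long intendedStart, long actualStart, long end) {
        this.intendedStart = intendedStart;
        this.actualStart = actualStart;
        this.end = end;
    }

    // No intended time (e.g. throughput run): use the actual start as Ti,
    // so response time equals service time and wait time is zero.
    static OpTimings unthrottled(long actualStart, long end) {
        return new OpTimings(actualStart, actualStart, end);
    }

    long responseTime() { return end - intendedStart; }         // Te - Ti
    long serviceTime()  { return end - actualStart; }           // Te - Ta
    long waitTime()     { return actualStart - intendedStart; } // Ta - Ti

    // Assumed fixed-rate schedule: op number opIndex (0-based) is intended
    // to start opIndex / opsPerSec seconds after the run's reference point.
    static long intendedStartNanos(long startNanos, long opIndex, long opsPerSec) {
        return startNanos + opIndex * TimeUnit.SECONDS.toNanos(1) / opsPerSec;
    }
}
```

For example, at 1000 ops/sec the fourth operation (index 3) is intended to 
start 3ms into the run. If it is actually issued at 5ms and completes at 6ms, 
the service time is 1ms, the wait time is 2ms, and the response time is 3ms. 
When the cluster cannot keep up with the requested rate, wait time keeps 
growing and the response time shows it, while the service time alone would not.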

We could have the summary report the service time alongside the response time, 
but that may break downstream reporting.

> Improve Cassandra-Stress latency measurement
> --------------------------------------------
>
>                 Key: CASSANDRA-11853
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11853
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Nitsan Wakart
>            Assignee: Nitsan Wakart
>             Fix For: 3.x
>
>
> Currently CS reports latency using a sampling latency container and reporting 
> service time (as opposed to response time from intended schedule) leading to 
> coordinated omission.
> Fixed here:
> https://github.com/nitsanw/cassandra/tree/co-correction



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
