[jira] [Commented] (CASSANDRA-6199) Improve Stress Tool

Pavel Yaskevich (JIRA) Mon, 23 Dec 2013 13:02:22 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855926#comment-13855926 ]


Pavel Yaskevich commented on CASSANDRA-6199:
--------------------------------------------

bq. This was a conscious decision, as I prefer the brevity of the current 
expression (2 vs 5 LOC), and it's a cost incurred only ~once, but this is 
something I vary my position on, so I'm sanguine about changing it.

What I mentioned is the correct and preferred way to place an element into a 
concurrent map, because putIfAbsent already does all the work for you.
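For readers without the patch at hand, the idiom under discussion can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name, the COUNTERS map and the counterFor helper are all hypothetical. It relies only on the documented behaviour of ConcurrentMap.putIfAbsent, which returns the existing value if the key is already mapped and null if the new value was installed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class PutIfAbsentSketch {
    // Hypothetical registry of named counters; not taken from the stress tool.
    private static final ConcurrentMap<String, AtomicLong> COUNTERS =
            new ConcurrentHashMap<>();

    static AtomicLong counterFor(String name) {
        AtomicLong counter = COUNTERS.get(name);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            // putIfAbsent returns the value already mapped to the key,
            // or null if our fresh value was the one installed.
            AtomicLong raced = COUNTERS.putIfAbsent(name, fresh);
            counter = (raced != null) ? raced : fresh;
        }
        return counter;
    }

    public static void main(String[] args) {
        counterFor("reads").incrementAndGet();
        counterFor("reads").incrementAndGet();
        System.out.println(counterFor("reads").get());
    }
}
```

Whichever thread loses the race discards its fresh instance and uses the one already in the map, so all callers end up sharing a single value per key without any external locking.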

> Improve Stress Tool
> -------------------
>
>                 Key: CASSANDRA-6199
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6199
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>         Attachments: new.read.latency.svg, new.read.rate.distribution.svg, 
> new.write.latency.svg, new.write.rate.distribution.svg, old.read.latency.svg, 
> old.read.rate.distribution.svg, old.write.latency.svg, 
> old.write.rate.distribution.svg, ops.read.svg, ops.write.svg
>
>
> The stress tool could do with sprucing up. The following is a list of 
> essential improvements and things that would be nice to have.
> Essential:
> - Reduce variability of results, especially start/end tails. Do not trash 
> first/last 10% of readings
> - Reduce contention/overhead in stress to increase overall throughput
> - Short warm-up period, which is ignored for summary (or summarised 
> separately), though prints progress as usual. Potentially automatic detection 
> of rate levelling.
> - Better configurability and defaults for data generation - current column 
> generation populates columns with the same value for every row, which is very 
> easily compressible. Possibly introduce partial random data generator 
> (possibly dictionary-based random data generator)
> Nice to have:
> - Calculate and print stdev and mean
> - Add batched sequential access mode (where a single thread performs 
> batch-size sequential requests before selecting another random key) to test 
> how key proximity affects performance
> - Auto-mode which attempts to establish the maximum throughput rate, by 
> varying the thread count (or otherwise gating the number of parallel 
> requests) for some period, then configures rate limit or thread count to test 
> performance at e.g. 30%, 50%, 70%, 90%, 120%, 150% and unconstrained.
> - Auto-mode could have a target variance ratio for mean throughput and/or 
> latency, and completes a test once this target is hit for x intervals
> - Fix key representation so independent of number of keys (possibly switch to 
> 10 digit hex), and don't use String.format().getBytes() to construct it 
> (expensive)
> Also, remove the skip-key setting, as it is currently ignored. Unless 
> somebody knows the reason for it.
> - Fix latency stats
> - Read/write mode, with configurable recency-of-reads distribution
> - Add new exponential/extreme value distribution for value size, column count 
> and recency-of-reads
> - Support more than 2^31 keys
> - Support multiple concurrent stress inserts via key-offset parameter or 
> similar
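
The key-representation item in the list above (a fixed-width key, "possibly 10 digit hex", built without String.format().getBytes()) could look something like the sketch below. It is only an illustration under stated assumptions: the class name, KEY_WIDTH, the keyFor helper and the choice of upper-case digits are hypothetical and not taken from the stress tool.

```java
import java.nio.charset.StandardCharsets;

public class HexKeySketch {
    // Assumed digit alphabet; upper-case is an arbitrary choice for this sketch.
    private static final byte[] HEX =
            "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);

    // 10 hex digits cover 40 bits, so the width does not depend on the key count.
    static final int KEY_WIDTH = 10;

    static byte[] keyFor(long index) {
        byte[] key = new byte[KEY_WIDTH];
        // Fill from the least significant digit backwards, zero-padding on the left.
        for (int i = KEY_WIDTH - 1; i >= 0; i--) {
            key[i] = HEX[(int) (index & 0xF)];
            index >>>= 4;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(new String(keyFor(255L), StandardCharsets.US_ASCII));
    }
}
```

Writing the digits straight into a byte array avoids the format-string parsing and intermediate String allocation of String.format(), and a long index leaves room for more than 2^31 keys.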



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
