[ 
https://issues.apache.org/jira/browse/CASSANDRA-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805651#comment-13805651
 ] 

Benedict commented on CASSANDRA-6199:
-------------------------------------

I have put a branch up with a completely new stress tool here:

https://github.com/belliottsmith/cassandra/tree/iss-6199-stress

Some things to expect:

- Reduced variability of results
- No start/end tail truncation
- Short warmup period (50K operations) with a pause to get the JVM warm
- Low garbage production in the stress process
- New data and key generators, including random sentence generation from a
frequency-based word list, and exponential and extreme value key distributions
- Ability to configure the distribution for column count and size
- Mixed mode permitting arbitrary ratios of any of the basic operations
- Automatic mode that ramps up the thread count until the cluster is saturated,
pushes a bit further, then reports a summary
- Automatic cessation of a test based on the standard error of the mean (i.e.
instead of asking for n operations, ask for e.g. stderr < 0.01, with at least
n samples)
- Reports latencies per period, plus for the entire run, as opposed to a
running tally of latencies
- Calculates op rate accurately
- Supports huge keys, plus arbitrary key ranges; keys do not depend on the
number of operations performed, so reads can be run with a different op count
from writes
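
The stderr-based stopping rule in the list above can be sketched roughly as
follows. This is a minimal Python illustration, not the code in the branch:
the function names, the choice to treat the 0.01 threshold as relative to the
mean, and the max_samples safety cap are all assumptions made for the example.

```python
import math


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    # Unbiased sample variance (divides by n - 1), so n must be at least 2.
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(variance / n)


def run_until_stable(measure, max_rel_stderr=0.01, min_samples=30,
                     max_samples=10_000):
    """Collect measurements until the relative stderr drops below a target.

    `measure` is a hypothetical callable returning one reading, for example
    the op rate observed over one reporting interval.
    """
    samples = []
    while len(samples) < max_samples:
        samples.append(measure())
        if len(samples) < min_samples:
            continue  # always take at least min_samples readings
        mean = sum(samples) / len(samples)
        # Assumption: the threshold is relative to the mean, not absolute.
        if mean > 0 and standard_error(samples) / mean < max_rel_stderr:
            break
    return samples
```

Recomputing the statistics on every iteration keeps the sketch short; a real
tool would more likely maintain running sums so each check is constant time.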

There is also a new command-line syntax to handle all of the complexity the
tool now supports, along with a legacy support mode. I've tested the mapping,
but it may need a little more testing to make sure I've caught any potential
nooks and crannies.

Would be great to have some people test it out to see if anything needs 
changing.

> Improve Stress Tool
> -------------------
>
>                 Key: CASSANDRA-6199
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6199
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>
> The stress tool could do with sprucing up. The following is a list of 
> essential improvements and things that would be nice to have.
> Essential:
> - Reduce variability of results, especially start/end tails. Do not trash 
> first/last 10% of readings
> - Reduce contention/overhead in stress to increase overall throughput
> - Short warm-up period, which is ignored for summary (or summarised 
> separately), though prints progress as usual. Potentially automatic detection 
> of rate levelling.
> - Better configurability and defaults for data generation - current column 
> generation populates columns with the same value for every row, which is very 
> easily compressible. Possibly introduce partial random data generator 
> (possibly dictionary-based random data generator)
> Nice to have:
> - Calculate and print stdev and mean
> - Add batched sequential access mode (where a single thread performs 
> batch-size sequential requests before selecting another random key) to test 
> how key proximity affects performance
> - Auto-mode which attempts to establish the maximum throughput rate, by 
> varying the thread count (or otherwise gating the number of parallel 
> requests) for some period, then configures rate limit or thread count to test 
> performance at e.g. 30%, 50%, 70%, 90%, 120%, 150% and unconstrained.
> - Auto-mode could have a target variance ratio for mean throughput and/or 
> latency, and completes a test once this target is hit for x intervals
> - Fix key representation so independent of number of keys (possibly switch to 
> 10 digit hex), and don't use String.format().getBytes() to construct it 
> (expensive)
> Also, remove the skip-key setting, as it is currently ignored, unless 
> somebody knows the reason for it.
> - Fix latency stats
> - Read/write mode, with configurable recency-of-reads distribution
> - Add new exponential/extreme value distribution for value size, column count 
> and recency-of-reads
> - Support more than 2^31 keys
> - Support multiple concurrent stress inserts via key-offset parameter or 
> similar
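
As an illustration of the key-representation item in the description above,
here is a minimal Python sketch of fixed-width hex keys built from a lookup
table rather than a format call. The names key_bytes and HEX_DIGITS and the
default width of 10 are assumptions for the example; the branch may construct
keys differently.

```python
HEX_DIGITS = b"0123456789abcdef"


def key_bytes(index, width=10):
    """Encode a non-negative key index as fixed-width lowercase hex bytes.

    The width is fixed up front, so the encoding of a given index does not
    depend on how many keys or operations a run uses. Ten hex digits cover
    2**40 distinct keys, which is more than 2**31.
    """
    if index < 0 or index >= 16 ** width:
        raise ValueError("index does not fit in the requested width")
    out = bytearray(width)
    # Fill from the least significant digit backwards, 4 bits at a time.
    for pos in range(width - 1, -1, -1):
        out[pos] = HEX_DIGITS[index & 0xF]
        index >>= 4
    return bytes(out)


print(key_bytes(255))  # b'00000000ff'
```

Because the encoding is independent of the run length, a read test and a
write test can address the same keys even when their op counts differ.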



--
This message was sent by Atlassian JIRA
(v6.1#6144)
