[ 
https://issues.apache.org/jira/browse/CASSANDRA-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727218#comment-14727218
 ] 

Benedict commented on CASSANDRA-8986:
-------------------------------------

I think that, for consistency of testing and for realism of read workloads, we 
need stress to be able to build sstables directly and serve them to each node 
when bootstrapping a test cluster that has compactions disabled. We can then 
specify the distribution of data amongst these sstables, so that we produce 
data that looks like what a real cluster with live data might produce. 
Currently it is very hard to get a consistent state across different versions 
of a cluster for performing read tests, and replicating a realistic DTCS 
cluster state (for instance) is hard since we don't run for days, weeks or 
months (and artificially lowering the windows doesn't necessarily give us a 
realistic cluster state).
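
To make "specify the distribution of data amongst these sstables" concrete, 
here is a minimal sketch of one way the split could be computed. Everything in 
it is an assumption for illustration: split_rows and the weight list are 
hypothetical names, not existing cassandra-stress options, and actually 
writing the sstables would be a separate step.

```python
def split_rows(total_rows, weights):
    """Split total_rows across sstables in proportion to integer weights.

    Uses the largest-remainder method so the counts always sum to
    total_rows and the result is identical on every run.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not weights or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative with a positive sum")

    scale = sum(weights)
    # Integer arithmetic only, so there is no floating-point drift.
    counts = [total_rows * w // scale for w in weights]
    remainders = [total_rows * w % scale for w in weights]

    # Hand out the rows lost to rounding, largest remainder first.
    leftover = total_rows - sum(counts)
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical DTCS-like layout: each older sstable holds twice as much data.
dtcs_like = split_rows(1000, [1, 2, 4, 8])
```

Because the split is a pure function of its inputs, the same layout could be 
rebuilt for each version of the cluster under test.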

I don't propose that this all be delivered as part of the rewrite, but I note 
it here to make certain it is considered at the same time, hopefully to be 
delivered soon after.

> Major cassandra-stress refactor
> -------------------------------
>
>                 Key: CASSANDRA-8986
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8986
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>
> We need a tool for both stressing _and_ validating more complex workloads 
> than stress currently supports. Stress needs a raft of changes, and I think 
> it would be easier to deliver many of these as a single major endeavour, 
> which I think is justifiable given its audience. The rough behaviours I want 
> stress to support are:
> * Ability to know exactly how many rows it will produce, for any clustering 
> prefix, without generating those prefixes
> * Ability to generate an amount of data proportional to the amount it will 
> produce to the server (or consume from the server), rather than proportional 
> to the variation in clustering columns
> * Ability to reliably produce near identical behaviour each run
> * Ability to understand complex overlays of operation types (LWT, Delete, 
> Expiry; perhaps not all implemented immediately, but with the framework for 
> supporting them easily)
> * Ability to (with minimal internal state) understand the complete cluster 
> state through overlays of multiple procedural generations
> * Ability to understand the in-flight state of in-progress operations (i.e. 
> if we're applying a delete, understand that the delete may or may not have 
> been applied, for potentially multiple conflicting in-flight operations)
>
> I think the necessary changes to support this would give us the _functional_ 
> base to support all the functionality I can currently envisage stress 
> needing. Before embarking on this (which I may attempt very soon), it would 
> be helpful to get input from others on features missing from stress that I 
> haven't covered here and that we will certainly want in the future. They can 
> then be factored into the overall design, hopefully avoiding another refactor 
> one year from now: each refactor is more complex than the last, and each 
> carries a higher sunk cost. [~jbellis] [~iamaleksey] [~slebresne] [~tjake] 
> [~enigmacurry] [~aweisberg] [~blambov] [~jshook] ... and @everyone else :) 
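
Two of the behaviours in the quoted description (knowing the row count for any 
clustering prefix without generating it, and near-identical behaviour on each 
run) can be illustrated with a small sketch. It is not the proposed design: it 
assumes a fixed, hypothetical fan-out per clustering level (FANOUT) and 
derives each value from a hash of the seed and the position, and all names are 
made up for the example.

```python
import hashlib
from math import prod

# Hypothetical schema: three clustering columns with a fixed number of
# distinct children at each level.
FANOUT = (3, 4, 5)


def rows_under(prefix):
    """Number of rows below a clustering prefix, without enumerating them."""
    if len(prefix) > len(FANOUT):
        raise ValueError("prefix is deeper than the clustering depth")
    # Each remaining level multiplies the row count by its fan-out.
    return prod(FANOUT[len(prefix):])


def value_at(seed, path):
    """Deterministic 64-bit value for the clustering component at `path`."""
    digest = hashlib.sha256(repr((seed, tuple(path))).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def generate(seed, prefix=()):
    """Enumerate the rows below `prefix`; used here only to check the count."""
    depth = len(prefix)
    if depth == len(FANOUT):
        # A full path of child indexes identifies exactly one row.
        yield tuple(value_at(seed, prefix[: i + 1]) for i in range(depth))
        return
    for child in range(FANOUT[depth]):
        yield from generate(seed, prefix + (child,))
```

Here a prefix is a tuple of child indexes, so rows_under((1,)) is the product 
of the remaining fan-outs, and generate(seed) yields the same rows on every 
run because each value depends only on the seed and the position. A fan-out 
that varies per prefix would need a different counting scheme.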



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
