[
https://issues.apache.org/jira/browse/CASSANDRA-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727218#comment-14727218
]
Benedict commented on CASSANDRA-8986:
-------------------------------------
I think for consistency of testing, and for realism of read workloads, we need
stress to be able to build sstables directly and serve them to each node when
bootstrapping a test cluster that has compaction disabled. We can then
specify the distribution of data amongst these sstables, so that we produce
data that looks like what a real cluster with live data might produce. Currently
it is very hard to get a consistent state across different versions of a cluster
for performing read tests, and replicating a realistic DTCS cluster state (for
instance) is hard since we don't run for days, weeks or months (and
artificially shrinking the windows doesn't necessarily give us a realistic
cluster state).
I don't propose that this all be delivered as part of the rewrite, but I note
it here to make certain it is considered at the same time, hopefully to be
delivered soon after.
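
As an illustration only of what "specify the distribution of data amongst
these sstables" could mean, here is a minimal Python sketch. It is not part of
cassandra-stress: the names sstable_for and plan_sstables, the weights list and
the seed are all hypothetical. Each partition key is hashed together with a
seed, so the same key always lands in the same sstable bucket on every run and
on every cluster version.

```python
import bisect
import hashlib
from itertools import accumulate


def sstable_for(key, weights, seed=0):
    """Return the index of the sstable bucket a partition key belongs to.

    Hypothetical helper: `weights` gives the relative share of data each
    sstable should hold. The mapping depends only on (seed, key), so it needs
    no stored state and is identical across runs.
    """
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    # 48 bits fit exactly in a float, so u is always strictly below 1.0.
    u = int.from_bytes(digest[:6], "big") / 2**48
    cumulative = list(accumulate(weights))
    idx = bisect.bisect_right(cumulative, u * cumulative[-1])
    return min(idx, len(weights) - 1)  # guard against float rounding


def plan_sstables(keys, weights, seed=0):
    """Group partition keys into one list per sstable bucket."""
    buckets = [[] for _ in weights]
    for key in keys:
        buckets[sstable_for(key, weights, seed)].append(key)
    return buckets


if __name__ == "__main__":
    keys = [f"pk{i}" for i in range(1000)]
    # Assumed shape: one large old sstable and three smaller, newer ones.
    plan = plan_sstables(keys, weights=[8, 4, 2, 1], seed=42)
    print([len(bucket) for bucket in plan])
```

Hashing rather than drawing from a running random generator means the result
does not depend on the order in which keys are visited, which is one way to get
the same starting state for every version under test.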
> Major cassandra-stress refactor
> -------------------------------
>
> Key: CASSANDRA-8986
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8986
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Benedict
> Assignee: Benedict
>
> We need a tool for both stressing _and_ validating more complex workloads
> than stress currently supports. Stress needs a raft of changes, and I think
> it would be easier to deliver many of these as a single major endeavour,
> which I think is justifiable given its audience. The rough behaviours I want
> stress to support are:
> * Ability to know exactly how many rows it will produce, for any clustering
> prefix, without generating those prefixes
> * Ability to generate an amount of data proportional to the amount it will
> send to the server (or consume from the server), rather than proportional
> to the variation in clustering columns
> * Ability to reliably produce near-identical behaviour each run
> * Ability to understand complex overlays of operation types (LWT, Delete,
> Expiry; perhaps not all implemented immediately, but at least the framework
> for supporting them easily)
> * Ability to (with minimal internal state) understand the complete cluster
> state through overlays of multiple procedural generations
> * Ability to understand the in-flight state of in-progress operations (i.e.
> if we're applying a delete, understand that the delete may or may not have
> been applied, for potentially multiple conflicting in-flight operations)
> I think the necessary changes to support this would give us the _functional_
> base to support all the functionality I can currently envisage stress
> needing. Before embarking on this (which I may attempt very soon), it would
> be helpful to get input from others as to features missing from stress that I
> haven't covered here that we will certainly want in the future, so that they
> can be factored into the overall design and hopefully avoid another refactor
> one year from now, as its complexity is scaling each time, and each time it
> is a higher sunk cost. [~jbellis] [~iamaleksey] [~slebresne] [~tjake]
> [~enigmacurry] [~aweisberg] [~blambov] [~jshook] ... and @everyone else :)
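
To make the first two behaviours in the quoted description concrete, here is a
minimal Python sketch of the simplest possible case. It assumes every
clustering level has a fixed number of distinct values (a fixed fan-out), which
is a simplification chosen for illustration and not the design proposed in the
ticket; the function names are hypothetical. Under that assumption the row
count below any clustering prefix is a product, and any single row can be
derived from its index without generating the rows before it.

```python
from math import prod


def rows_under_prefix(fanouts, prefix):
    """Count rows below a clustering prefix without enumerating them.

    fanouts[i] is the (assumed fixed) number of distinct values at
    clustering level i; `prefix` is a tuple of values for the leading levels.
    """
    if len(prefix) > len(fanouts):
        raise ValueError("prefix is longer than the clustering key")
    return prod(fanouts[len(prefix):])


def clustering_for_index(fanouts, index):
    """Map a row index to its clustering tuple (mixed-radix decoding)."""
    if not 0 <= index < prod(fanouts):
        raise IndexError("row index out of range")
    values = []
    for fanout in reversed(fanouts):
        index, value = divmod(index, fanout)
        values.append(value)
    return tuple(reversed(values))


if __name__ == "__main__":
    fanouts = [10, 5, 2]  # three clustering columns, assumed cardinalities
    print(rows_under_prefix(fanouts, ()))      # whole partition
    print(rows_under_prefix(fanouts, (3,)))    # one first-level prefix
    print(clustering_for_index(fanouts, 13))   # a single row, on demand
```

Because each row is a pure function of its index, the work done is
proportional to the rows actually sent or read, not to the size of the
clustering space.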