[ 
https://issues.apache.org/jira/browse/CASSANDRA-8654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314058#comment-14314058
 ] 

Benedict commented on CASSANDRA-8654:
-------------------------------------

I'm leaning toward making this an extension of the current stress feature 
set. Stress can already cope with schema population and querying, and we 
need to add validation to it anyway. To begin with we could perhaps just 
run a validating stress workload in parallel with some chaos generation that 
should in no way affect the results produced, and see what happens. We will 
have to think about what features need to be added to stress to maximise the 
utility, of course. Things like LWT will need to be supported, as will function 
calls etc. We will probably want to introduce some randomness to the kind of 
behaviour stress undertakes as well (e.g. introduce non-client actions, 
including sleeping, flushing, repair, etc., with some random incidence), but 
these can perhaps be introduced as an overlay. We will also want to think about 
creating some more deterministic partitions, so that edge cases can more 
directly be exercised. The advantage, of course, is that as we improve it we 
also expand the use cases stress can validate for users. The main guts 
of the client are also already there and ready to go.
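
As a rough illustration of the "overlay" idea above, the sketch below interleaves randomly chosen non-client actions into a stream of client operations with a configurable incidence. It is written in Python purely for brevity and implies nothing about the language of the harness itself. The function name `overlay`, the action names, and the seeding scheme are all assumptions made for this example; none of them are existing stress options, and actually triggering a flush or repair is out of scope.

```python
import random

# Hypothetical action names, used only for illustration.
NON_CLIENT_ACTIONS = ("sleep", "flush", "repair")


def overlay(client_ops, incidence, seed):
    """Yield client ops, preceded by a random non-client action
    with probability `incidence`."""
    rng = random.Random(seed)  # seeded so a failing schedule can be replayed
    for op in client_ops:
        if rng.random() < incidence:
            yield ("non_client", rng.choice(NON_CLIENT_ACTIONS))
        yield ("client", op)


if __name__ == "__main__":
    for step in overlay(["write k1", "read k1", "write k2"], 0.5, seed=42):
        print(step)
```

Because the overlay only wraps the operation stream, the client workload itself is unchanged and keeps its order, and reusing the same seed reproduces the same schedule.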

Originally I thought the featureset would be too different, but really it's not 
_so_ different. Inside stress we already track an idea of what the partition 
looks like on the server; it's just not as complete as it could be. We're also 
not in any way trying to create failure cases, but randomness may be enough 
to start, and over time we can introduce tweaks to that randomness that favour 
breakage.
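
To make the "track what the partition should look like" idea concrete, here is a minimal sketch of a client-side model that records writes and deletes and then compares itself with rows read back from the server. It is in Python for brevity only. The class `ExpectedPartition` and its methods are hypothetical and are not how stress represents partitions today; `observed` stands in for whatever a real read would return, simplified here to a mapping from clustering key to value.

```python
from dataclasses import dataclass, field


@dataclass
class ExpectedPartition:
    """Client-side record of what one partition should contain (hypothetical)."""
    rows: dict = field(default_factory=dict)  # clustering key -> value

    def apply_write(self, clustering, value):
        self.rows[clustering] = value

    def apply_delete(self, clustering):
        self.rows.pop(clustering, None)

    def diff(self, observed):
        """Return a list of mismatches against rows read back from the server."""
        problems = []
        for ck, want in self.rows.items():
            if ck not in observed:
                problems.append(("missing", ck, want, None))
            elif observed[ck] != want:
                problems.append(("wrong_value", ck, want, observed[ck]))
        # Rows the server returned that the model never wrote (or has deleted).
        for ck in sorted(observed.keys() - self.rows.keys()):
            problems.append(("unexpected", ck, None, observed[ck]))
        return problems
```

An empty result from `diff` means the server agrees with the model. Validating only a sample of partitions, rather than all of them, would amount to calling `diff` on a randomly chosen subset.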

I'm not super keen on the idea of two parallel clusters; for one, this halves 
our bandwidth (we could instead run two parallel validations, which means twice 
the chance of hitting something useful), and for another it means we only 
really catch regressions. I'm also not so keen on a Python client, because I'd 
like to see as much validation as possible performed in a given time horizon. 
One of the reasons we moved away from Python for stress was that it simply 
wasn't exercising the cluster enough.

> Data validation test
> --------------------
>
>                 Key: CASSANDRA-8654
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8654
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Russ Hatch
>            Assignee: Russ Hatch
>
> There was a recent discussion about the utility of data validation testing.
> The goal here would be a harness of some kind that can mix operations and 
> track its own notion of what the DB state should look like, and verify it in  
> detail, or perhaps a sampling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
