[ 
https://issues.apache.org/jira/browse/CASSANDRA-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587740#comment-14587740
 ] 

Stefania commented on CASSANDRA-9102:
-------------------------------------

[~aweisberg], [~philipthompson], this is ready for review. Would one of you be 
happy to review it, or shall I try to find someone else?

I've modified the existing availability tests to reuse the same cluster: for a 
given set of live nodes, we test all combinations of read/write consistency 
levels, as opposed to restarting the nodes for each combination. This saves a 
considerable amount of time.
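
As a rough illustration of iterating over all read/write combinations for one 
set of live nodes, here is a minimal sketch. It is not the code in the pull 
request: the list of levels, the {{required_replicas}} helper and the 
single data center model are all assumptions made for the example.

```python
import itertools

# Hypothetical subset of consistency levels; the real test may cover more.
LEVELS = ["ONE", "TWO", "THREE", "QUORUM", "ALL"]


def required_replicas(level, rf):
    """Replicas that must respond for `level`, given replication factor `rf`."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError("unknown consistency level: %s" % level)


def expected_availability(rf, nodes_alive):
    """Map each (write_cl, read_cl) pair to whether both should succeed.

    Simplified model (assumption): one data center, every node is a replica.
    """
    results = {}
    for write_cl, read_cl in itertools.product(LEVELS, repeat=2):
        results[(write_cl, read_cl)] = (
            required_replicas(write_cl, rf) <= nodes_alive
            and required_replicas(read_cl, rf) <= nodes_alive
        )
    return results
```

With this model, the cluster is started once per set of live nodes and the 
whole table of combinations is checked against it, rather than restarting 
nodes for every pair.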

I've added a new set of tests that check the actual accuracy of the data. We 
insert into, update and delete from a regular table, and we update a table with 
a counter column. For the regular table we also use LWTs. We perform these 
operations one session at a time with different write consistency levels. 
Depending on the consistency level, some nodes will have stale data since the 
previous session would have written to another node. Then we check what we read 
back from each node. Here R and W are the number of replicas required by the 
read and write consistency levels and N is the replication factor. When 
R + W > N, we expect to read the latest value from all nodes, no stale data 
allowed. Otherwise (R + W <= N), we check that we get the latest value from at 
least W nodes. Is there anything else we could check? This is done for both 
single and multi data center clusters. 
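
The read-back rule above can be sketched as follows. This is only an 
illustration, not the code in the pull request: the function names are 
hypothetical, and it assumes one read per node and as many nodes as replicas.

```python
def min_nodes_with_latest(r, w, n):
    """Lower bound on the number of nodes that must return the latest value.

    r, w: replicas required by the read and write consistency levels.
    n: replication factor.
    """
    if r + w > n:
        # Read and write replica sets must overlap: no stale reads allowed.
        return n
    # Otherwise only the replicas that acknowledged the write are guaranteed.
    return w


def check_reads(values_read, latest, r, w, n):
    """Return True if enough nodes returned `latest` (one value per node)."""
    fresh = sum(1 for value in values_read if value == latest)
    return fresh >= min_nodes_with_latest(r, w, n)
```

For example, with N = 3 and QUORUM for both reads and writes (R = W = 2), 
every node must return the latest value, whereas with R = W = 1 a single 
up-to-date node is enough to pass the check.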

I thought about sending different data in parallel, as opposed to sequentially 
as we are doing at the moment, but then we would not know what value to 
expect back since it would depend on the ordering of writes. I believe we are 
going to have Jepsen tests for more advanced consistency checks?

I also made the accuracy tests run in parallel because they were taking a very 
long time to run. The total duration of consistency_test.py on Jenkins is 
currently 10 minutes. On my box the modified version takes approximately 17.5 
minutes, of which 4 minutes are for the availability test, 8 minutes for the 
accuracy test and 5.5 minutes for the existing tests that did not change 
(except for a fix to a small problem in {{short_read_test()}}).

We can drop some combinations or reduce the number of partitions to save more 
time in the accuracy test, but if we use too few partitions we may not be able 
to find problems. At the moment we only test 50 partitions.

The pull request is [here|https://github.com/riptano/cassandra-dtest/pull/328].

> Consistency levels such as non-local quorum need better tests
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9102
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9102
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Ariel Weisberg
>            Assignee: Stefania
>
> We didn't catch unit testing for this functionality. There is dtest 
> consistency_test but it doesn't cover non-local functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
