I've taken CASSANDRA-13194, CASSANDRA-13506, CASSANDRA-13515, and
CASSANDRA-13372 to start.

On May 10, 2017 at 12:44:47 PM, Ariel Weisberg (ar...@weisberg.ws) wrote:

Hi,  

The dev list murdered my rich text formatted email. Here it is  
reformatted as plain text.  

The unit tests are looking pretty reliable right now. There is a long  
tail of infrequently failing tests but it's not bad and almost all  
builds succeed in the current build environment. In CircleCI it seems  
like unit tests might be a little less reliable, but still usable.  

The dtests, on the other hand, aren't producing clean builds yet. There
is also a pretty diverse set of failing tests.

I did a bit of triaging of the flaky dtests. I started by cataloging
everything, but the long tail of flaky dtests is very long indeed, so I
narrowed the focus to just the most frequently failing tests for now.
See https://goo.gl/b96CdO

I created a spreadsheet with some of the failing tests. It includes
links to JIRA, the last time each test was seen failing, and how many
failures I found in Apache Jenkins across the 3 dtest builds. There are
a lot of failures not listed; there would be 50+ entries if I cataloged
each one.

There are two hard failing tests, but both are already moving along:

CASSANDRA-13229 (Ready to commit, assigned Alex Petrov, Paulo Motta
reviewing, last updated April 2017): dtest failure in
topology_test.TestTopology.size_estimates_multidc_test

CASSANDRA-13113 (Ready to commit, assigned Alex Petrov, Sam T
reviewing, last updated March 2017): test failure in
auth_test.TestAuth.system_auth_ks_is_alterable_test

I think the tests we should tackle first are on this sheet, in priority
order: https://goo.gl/S3khv1

Suite: bootstrap_test  
Test: TestBootstrap.simultaneous_bootstrap_test  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13506  
Last failure: 5/5/2017  
Counted failures: 45  

Suite: repair_test  
Test: incremental_repair_test.TestIncRepair.compaction_test  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13194  
Last failure: 5/4/2017  
Counted failures: 44  

Suite: sstableutil_test  
Test: SSTableUtilTest.compaction_test  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13182  
Last failure: 5/4/2017  
Counted failures: 35  

Suite: paging_test  
Test: TestPagingWithDeletions.test_ttl_deletions  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13507  
Last failure: 4/25/2017  
Counted failures: 31  

Suite: repair_test  
Test: incremental_repair_test.TestIncRepair.multiple_repair_test  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13515  
Last failure: 5/4/2017
Counted failures: 18  

Suite: cqlsh_tests  
Test: cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_*  
JIRA:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22%2C%20%22Ready%20to%20Commit%22%2C%20%22Awaiting%20Feedback%22)%20AND%20text%20~%20%22CqlshCopyTest%22
Last failure: 5/8/2017
Counted failures: 23

Suite: paxos_tests  
Test: TestPaxos.contention_test_many_threads  
JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13517  
Last failure: 5/8/2017
Counted failures: 15  

Suite: repair_test  
Test: TestRepair  
JIRA:
https://issues.apache.org/jira/issues/?jql=status%20%3D%20Open%20AND%20text%20~%20%22dtest%20failure%20repair_test%22
Last failure: 5/4/2017
Comment: No single test fails a lot, but the number of failing tests is
substantial

Suite: cqlsh_tests  
Test: cqlsh_tests.CqlshSmokeTest.[test_insert | test_truncate |  
test_use_keyspace | test_create_keyspace]  
JIRA: No JIRA yet  
Last failure: 4/22/2017
Counted failures: 6
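
For anyone who wants to see what the two JIRA search links above
actually query, the URL-encoded JQL can be decoded with Python's
standard library. This is only a sketch: the helper name jql_from_url
is made up for illustration and is not part of any JIRA or dtest
tooling.

```python
from urllib.parse import urlsplit, parse_qs


def jql_from_url(url):
    """Return the decoded JQL text from a JIRA issue-search URL.

    Hypothetical helper: it only reads the 'jql' query parameter and
    percent-decodes it; it does not contact JIRA.
    """
    query = urlsplit(url).query
    return parse_qs(query)["jql"][0]


# The repair_test search link from the list above, copied verbatim.
repair_url = (
    "https://issues.apache.org/jira/issues/"
    "?jql=status%20%3D%20Open%20AND%20text%20~%20"
    "%22dtest%20failure%20repair_test%22"
)

print(jql_from_url(repair_url))
# prints: status = Open AND text ~ "dtest failure repair_test"
```

The same call works on the longer CqlshCopyTest link; the decoded text
can be pasted into the JIRA issue search box.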

If you have spare cycles you can make a huge difference in test  
stability by picking off one of these.  

Regards,  
Ariel  

