[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183392#comment-17183392 ]

Berenguer Blasi commented on CASSANDRA-11928:

#justfyi #collaborating CASSANDRA-16073 is my attempt at fixing the tracing test failures. Apologies, I missed this ticket. I didn't go down the route of investigating mismatched CLs, but I found a few timeouts that I could reproduce locally, and I have a PR that fixes them. If we agree that the fix is good, we could close this one.

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: Craig Kodman
> Priority: Normal
> Labels: dtest, flaky
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961764#comment-16961764 ]

Michael Semb Wever commented on CASSANDRA-11928:

I've been able to reproduce the flakiness of the dtest across all of trunk since {{cassandra-3.11}} was branched. This corresponds to the date this ticket was reported.

Oddly though, we don't see the flakiness in ASF Jenkins anywhere except against trunk. This could be because Cassandra 4.0 uses more resources per node than previous versions (getting a three-node ccm cluster working reliably in tests in CircleCI/Travis/Jenkins is considerably harder than it was in Cassandra 3.11 and before).

The bisecting I did used the following {{bisect.sh}} script and recipe.

The {{bisect.sh}} script:
{noformat}
#!/bin/bash
set -e

# rebuild Cassandra at the commit under test
cd /home/mick/src/apache/cassandra
ant realclean
ant artifacts

cd /home/mick/src/apache/cassandra-dtest/
source ~/dtest/bin/activate

# iterate 20 times, making sure this flakey test works on this sha
for i in {0..19} ; do
  echo " ITERATION $i "
  python -m pytest --cassandra-dir=/home/mick/src/apache/cassandra cql_tracing_test.py
done

cd /home/mick/src/apache/cassandra
git stash
{noformat}

The execution:
{code}
git bisect start
git bisect bad
git bisect good `git log cassandra-3.11..trunk --oneline | tail -1 | cut -d' ' -f1`
git bisect run ./bisect.sh
{code}
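For readers who want to adapt the repeat-until-failure idea from {{bisect.sh}} above, here is a minimal Python sketch of the same predicate. It is not part of the ticket: the function name {{run_repeatedly}} is hypothetical, and it relies only on the documented {{git bisect run}} convention that exit code 0 marks a commit good and a code between 1 and 127 (other than 125) marks it bad.

```python
import subprocess


def run_repeatedly(cmd, iterations=20):
    """Run `cmd` up to `iterations` times.

    Returns 0 if every run succeeds and 1 as soon as one run fails, so the
    result can be used directly as an exit code for `git bisect run`.
    `run_repeatedly` is a hypothetical helper, not part of cassandra-dtest.
    """
    for i in range(iterations):
        print(f"ITERATION {i}")
        # A flaky test only needs to fail once for the commit to count as bad.
        if subprocess.run(cmd).returncode != 0:
            return 1
    return 0
```

A wrapper script would pass the pytest command quoted above as {{cmd}} and hand the return value to {{sys.exit}}. Stopping at the first failure keeps bad commits cheap to classify; good commits still cost the full number of iterations.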
[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960886#comment-16960886 ]

Michael Semb Wever commented on CASSANDRA-11928:

This seems to be a regression in trunk of CASSANDRA-11465: tracing doesn't use the same consistency level as the request.

trunk: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/
3.11: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/

My understanding is that tracing data was intended to be eventually consistent, and that making it strictly consistent (via `-Dcassandra.wait_for_tracing_events_timeout_secs=xx`) was only for the purpose of testing.

If that's true, a simple fix is just to reduce the number of ccm nodes for that test, i.e. https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23

[~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting to find where the regression came from? Or removing the `wait_for_tracing_events_timeout_secs` flag?
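If tracing data is treated as eventually consistent, as the comment above suggests, one way a test could cope is to poll for the expected trace events instead of reading them once. The following is only a sketch under that assumption: {{wait_for}}, {{fetch}} and {{is_complete}} are hypothetical names, and nothing here is taken from cassandra-dtest or the Python driver.

```python
import time


def wait_for(fetch, is_complete, timeout_secs=10.0, interval_secs=0.1):
    """Call `fetch()` repeatedly until `is_complete(value)` is true.

    Returns the first complete value, or raises TimeoutError once
    `timeout_secs` has elapsed. All names are hypothetical; `fetch` stands
    in for whatever reads the trace events in a real test.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        value = fetch()
        if is_complete(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_secs}s")
        time.sleep(interval_secs)
```

The trade-off is that a genuinely missing trace event now shows up as a timeout rather than an immediate assertion failure, so the timeout has to be chosen with slow CI hosts in mind.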
[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342734#comment-15342734 ]

Tyler Hobbs commented on CASSANDRA-11928:

Hmm, I tried this out and blocking on tracing in the write path seems to result in some sort of deadlock. Since this test has only flapped once in a hundred runs, I'm not sure how much effort we want to invest in this right now.

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
> Issue Type: Bug
> Reporter: Craig Kodman
> Assignee: Tyler Hobbs
> Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?
[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319565#comment-15319565 ]

Tyler Hobbs commented on CASSANDRA-11928:

I think one step we could take to make tests that utilize tracing less flaky is to add a system flag to make trace writes synchronous and at {{CL.ALL}} instead of async and {{CL.ANY}}.
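To illustrate why synchronous trace writes would make such tests more deterministic, here is a toy model in Python. It is not Cassandra's tracing code: {{TraceStore}}, its {{synchronous}} flag and the simulated delay are all assumptions made for the sketch, and consistency levels are not modelled at all.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class TraceStore:
    """Toy stand-in for a trace table; hypothetical, not Cassandra's API."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.events = []
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _write(self, event):
        time.sleep(0.05)  # simulated delay before the write becomes visible
        self.events.append(event)

    def record(self, event):
        future = self._pool.submit(self._write, event)
        if self.synchronous:
            # Block until the write has been applied, so a read issued
            # right after record() returns is guaranteed to see the event.
            future.result()
        return future

    def close(self):
        self._pool.shutdown(wait=True)
```

With {{synchronous=False}}, a reader that checks {{events}} immediately after {{record()}} may or may not see the event, which is the kind of race a flaky tracing test would hit. With {{synchronous=True}} the race disappears, at the cost of blocking the caller.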