[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162074#comment-17162074 ] Gianluca Righetto commented on CASSANDRA-15792: --- Opened a clean PR to get rid of the merge noise of the first one (pushing then rebasing against two remotes messed that one up): https://github.com/grighetto/cassandra-dtest/pull/2 > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162066#comment-17162066 ] Berenguer Blasi commented on CASSANDRA-15792: - I moved to 'review in progress' as still sbdy that is a committer has to review it :-) > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162064#comment-17162064 ] Gianluca Righetto commented on CASSANDRA-15792: --- Thanks for the first pass, [~Bereng]. Yes, this has been waiting for a committer to do the final push, I'll check if there's anyone available. I just realized I never added a comment detailing the final solution I landed on, so here it goes: the original implementation didn't expect a {{speculative write}} to be performed to node2, but depending on timing and load of node3, that might actually happen. So, as long as we can guarantee the speculative write to node2 happened after the initial write attempt to node3, we're good. In order to achieve that, I implemented a byteman function that records the {{System.currentTimeMillis}} of each Message Verb, and with that I can reconstruct the order of the events in the test. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159037#comment-17159037 ] Berenguer Blasi commented on CASSANDRA-15792: - Anything else you need from me here [~gianluca]? > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153611#comment-17153611 ] Berenguer Blasi commented on CASSANDRA-15792: - LGTM +1. Ran multiple times locally as well and it succeeds. Only confused at the wild commits list, which I hope they are only merges and noop when committed. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130730#comment-17130730 ] Gianluca Righetto commented on CASSANDRA-15792: --- [~jmckenzie] I did add some comments trying to clarify the issue: [https://github.com/grighetto/cassandra-dtest/pull/1/files#diff-60812631a43b8e1f0c9fb53d9f7487ebR530] [https://github.com/grighetto/cassandra-dtest/pull/1/files#diff-60812631a43b8e1f0c9fb53d9f7487ebR816] But I had an idea to make this deterministic, which is using byteman to check if the coordinator node got back a response from node 3 within the specified timeout, if so, we can accept the speculative write. I'll move this back to in progress real quick to implement this solution today. This doesn't change the scope though, still for 4.0-beta. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17125305#comment-17125305 ] Josh McKenzie commented on CASSANDRA-15792: --- Thanks for the insight; agree on fixver. A little sad about the delay-based testing, though I assume converting to any kind of determinism would be significantly more invasive and artificial in the codebase so not worth pursuing. Perhaps commenting in the dtest the issue we ran into here so if we see it pop up subsequently (or annotating / elaborating in the failure state) pointing to this ticket could help in the future if this crops back up? Just a thought. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-beta > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17121116#comment-17121116 ] Gianluca Righetto commented on CASSANDRA-15792: --- [~jmckenzie] After investigating this for some time now, I determined this is mostly related to a low write timeout used in TestSpeculativeReadRepair. That may lead to a speculated write to a different node depending on how long the original node takes to apply the repair mutation, but the test assertion is expecting no speculated writes. In other words, this is mostly a problem with the test, not with C* runtime, which is doing the right thing. In order to fix this, I made it accept speculated writes in the original test, but I also replicated the test method in a different test class with a longer write timeout to reduce the likelihood of speculated writes. Of course, since this is all time based, the new test may still fail under a system with high CPU contention, but at least for now I can't easily reproduce the failure anymore (whereas it was failing consistently for me before). Here's the pull request in my cassandra-dtest fork: [https://github.com/grighetto/cassandra-dtest/pull/1] Regarding the fixver, I'm ok with moving this to beta, even though the fix is already available, it still needs to go through review, but since this is not a runtime problem, I wouldn't say this is a blocker for alpha. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-alpha > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15792) test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair
[ https://issues.apache.org/jira/browse/CASSANDRA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120296#comment-17120296 ] Josh McKenzie commented on CASSANDRA-15792: --- [~gianluca] - any signals from this failure that it's something that would block users from testing beta or require an API change to resolve? If not we should go ahead and punt to beta fixver as per dev ML thread. > test_speculative_data_request - read_repair_test.TestSpeculativeReadRepair > -- > > Key: CASSANDRA-15792 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15792 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Gianluca Righetto >Priority: Normal > Fix For: 4.0-alpha > > > Failing on the latest trunk here: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/127/workflows/dfba669d-4a5c-4553-b6a2-85647d0d8d2b/jobs/668/tests > Failing once in 30 times as per Jenkins: > https://jenkins-cm4.apache.org/job/Cassandra-trunk-dtest/69/testReport/dtest.read_repair_test/TestSpeculativeReadRepair/test_speculative_data_request/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org