[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831085#comment-17831085 ]

Ekaterina Dimitrova commented on CASSANDRA-19042:

largecolumn_test.TestLargeColumn::largecolumn_test.py::TestLargeColumn::test_cleanup - as far as I understand, this seems to be something in your environment. I can also confirm it is not failing in CircleCI.
upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_4_0_x_To_indev_5_0_x::upgrade_test - it does not seem related, but we may want to open a ticket for this one. It is probably not a release blocker, though, as those bootstrap tests have been known to flake from time to time for quite a while. Do you also see it on previous branches in your environment?
test_assassinate_valid_node - known from CASSANDRA-18753, just reported today
testLocalSerialLocalCommit - CASSANDRA-18851

> Repair fuzz tests fail with paxos_variant: v2
>
> Key: CASSANDRA-19042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19042
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair, Feature/Lightweight Transactions, Test/fuzz
> Reporter: Branimir Lambov
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: ci_summary-cassandra-5.0-99dbcac488d74480e84f92f6449cdd684a8e4fb0.html, ci_summary-trunk-04336bab2c74199ac4062cd8026ee7f8fa4634e3.html
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> Adding {{paxos_variant: v2}} to the test yaml causes all fuzz repair tests to fail with
> {code}
> java.lang.NullPointerException: null
> 	at org.apache.cassandra.gms.EndpointStateSerializer.serializedSize(EndpointState.java:337)
> 	at org.apache.cassandra.gms.EndpointStateSerializer.serializedSize(EndpointState.java:300)
> 	at org.apache.cassandra.service.paxos.cleanup.PaxosStartPrepareCleanup$RequestSerializer.serializedSize(PaxosStartPrepareCleanup.java:176)
> 	at org.apache.cassandra.service.paxos.cleanup.PaxosStartPrepareCleanup$RequestSerializer.serializedSize(PaxosStartPrepareCleanup.java:147)
> 	at org.apache.cassandra.net.Message$Serializer.payloadSize(Message.java:1067)
> 	at org.apache.cassandra.net.Message.payloadSize(Message.java:1114)
> 	at org.apache.cassandra.net.Message$Serializer.serializedSize(Message.java:750)
> 	at org.apache.cassandra.net.Message.serializedSize(Message.java:1094)
> 	...
> {code}
> This happens for all three options of {{paxos_state_purging}} and both with and without {{storage_compatibility_mode: NONE}}.
> Tests still fail if {{PaxosStartPrepareCleanup}} is changed to use {{EndpointState.nullableSerializer}}.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
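The stack trace in the issue description shows a NullPointerException raised inside a serializedSize call, and the description mentions a nullable serializer variant. As a rough illustration of that general pattern only, here is a minimal, self-contained sketch. Everything in it (the Serializer interface, STRICT, nullable, toBytes) is hypothetical and is not Cassandra's actual API. Note also that the description says the tests still fail with {{EndpointState.nullableSerializer}}, so this wrapper should not be read as the fix for this ticket.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class NullableSerializerSketch {

    /** Hypothetical minimal contract; NOT Cassandra's serializer interface. */
    public interface Serializer<T> {
        void serialize(T value, DataOutputStream out) throws IOException;
        long serializedSize(T value);
    }

    /** Stand-in payload serializer that dereferences its argument unconditionally. */
    public static final Serializer<String> STRICT = new Serializer<String>() {
        @Override
        public void serialize(String value, DataOutputStream out) throws IOException {
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8); // NPE when value is null
            out.writeInt(bytes.length);
            out.write(bytes);
        }

        @Override
        public long serializedSize(String value) {
            // 4-byte length prefix plus payload; NPE when value is null
            return 4L + value.getBytes(StandardCharsets.UTF_8).length;
        }
    };

    /** Wraps a serializer so that a one-byte presence flag precedes the payload. */
    public static <T> Serializer<T> nullable(Serializer<T> delegate) {
        return new Serializer<T>() {
            @Override
            public void serialize(T value, DataOutputStream out) throws IOException {
                out.writeBoolean(value != null); // presence flag, always one byte
                if (value != null)
                    delegate.serialize(value, out);
            }

            @Override
            public long serializedSize(T value) {
                // Must stay in step with serialize(): flag byte, then optional payload
                return 1L + (value == null ? 0L : delegate.serializedSize(value));
            }
        };
    }

    /** Helper for checking that serializedSize matches the bytes actually written. */
    public static <T> byte[] toBytes(Serializer<T> serializer, T value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            serializer.serialize(value, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        Serializer<String> safe = nullable(STRICT);
        System.out.println("nullable size(null) = " + safe.serializedSize(null));
        System.out.println("nullable size(\"ab\") = " + safe.serializedSize("ab"));
        try {
            STRICT.serializedSize(null);
        } catch (NullPointerException e) {
            System.out.println("strict serializer threw NPE on null");
        }
    }
}
```

The point of the sketch is that a size computation run before a message is sent will fail as soon as any nested field is unexpectedly null, and that a wrapper at one level does not help if the null sits deeper inside the wrapped object.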
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831076#comment-17831076 ]

David Capwell commented on CASSANDRA-19042:

The failures I saw were either flaky (not there on rerun) or fail 100% of the time in our CI but not on CircleCI. If this does cause CI issues for anyone, please reach out to me; we can revert if we need to.
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830670#comment-17830670 ]

Ekaterina Dimitrova commented on CASSANDRA-19042:

Overall LGTM (though I admit my knowledge of that part of the codebase is limited). Some nits are left, which can be addressed on commit.

The CI run was limited, preliminary development testing (the skinny dev workflow Berenguer created). We will need a full pre-commit CI run. Also, the patch needs to be propagated to trunk and tested there, too.

Do we want to run all tests with paxos_v2 enabled, at least on one of the branches? WDYT?

Thanks
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829219#comment-17829219 ]

Ekaterina Dimitrova commented on CASSANDRA-19042:

CI ended up green for 5.0. Because the latest configuration did not switch paxos_variant to v2 as part of CASSANDRA-18753, I changed it locally and reran all the mentioned failing tests. They all finished successfully, too.

I will start reviewing it in the afternoon.
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829196#comment-17829196 ]

Ekaterina Dimitrova commented on CASSANDRA-19042:

This is one of our last 5.0 blockers. To help expedite the work, I rebased the patch [here|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-19042] and just started a minimal CI run - [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=CASSANDRA-19042]

If nothing shows up, I will do a first pass of review. [~bdeggleston], I hope you can still take a look at it, too.
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825949#comment-17825949 ]

Berenguer Blasi commented on CASSANDRA-19042:

5.0 looks like it is nearing completion :fireworks: so any help here from someone knowledgeable would be great! #collaborating
[jira] [Commented] (CASSANDRA-19042) Repair fuzz tests fail with paxos_variant: v2
[ https://issues.apache.org/jira/browse/CASSANDRA-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816707#comment-17816707 ]

Ekaterina Dimitrova commented on CASSANDRA-19042:

All these tests were introduced in CASSANDRA-18816. I checked, and they also fail with the commit from CASSANDRA-18816 if we change the default test config and add {{paxos_variant: v2}}, though they were failing with different seeds, etc. Logs below.

[~dcapwell], [~marcuse], [~maedhroz] - as the people involved in CASSANDRA-18816, any immediate thoughts or pointers you may have that can save me some time with this ticket will be highly appreciated. This is one of our last 5.0 blockers.

From ConcurrentIrWithPreviewFuzzTest:
{code:java}
org.apache.cassandra.transport.ProtocolException: Property error detected:
Seed = -7170984628001741947
Examples = 2
Pure = false
Error: Property error detected:
Seed = -7170984628001741947
Examples = 2
Pure = false
Error: [Unexpected state: CoordinatorState{id=5eff5f10-c9d7-11ee-9b5f-89c2cb879fc0, stateTimesNanos=[12, 13, 14, 2280377, 2280378, 0], status=REPAIR_START [5f1b2470-c9d7-11ee-9b5f-89c2cb879fc0 -> JOBS_START [34249ea9-fe80-3ba4-95f5-f8f9f7b24256 -> START, effc8637-9f49-30b6-8fae-92b328b1961a -> START, 03b2efbf-e8ec-305d-8cf6-cbbf34ff8e2d -> START, ac7eb281-b198-3224-8d14-23f6edbb8a1d -> START, 30380f07-8aef-3202-b36c-446e83e60b7e -> START, 9d8ec591-6095-37c3-8516-18f972869fec -> START, 54e6cdff-0492-3b76-b296-ed493bc7ce09 -> START, 8bb29732-d581-3532-804d-621767a4c143 -> START, ff01ec8c-ced4-37e7-acb0-7c64a98bd60e -> START, eadbfcf2-e8a5-38b6-9825-2681f79472fa -> START, 38a4c786-de6b-31f0-a9b6-59339adfa853 -> START, 7f6999a4-7919-3d5c-bcde-59b5687b4b0a -> START, 6721f872-bee9-31f1-b318-2d7560296e5a -> START, 5d8c7c61-bb2a-3abb-900e-c1149bc97919 -> START, 8f7e4046-fbe9-30bf-8096-9c7c3becf4ff -> START, d522d89b-f38c-3bfe-aea5-a79d0b2b3223 -> START, c32800e8-ab09-35bf-a172-a35c869b1edd -> START]], lastUpdatedAtNs=2280378} -> null; example 0] expected: but was:
Values:
	0 = accord.utils.DefaultRandom@4a480ae0
Values:
	at accord.utils.Property$Common.checkWithTimeout(Property.java:103)
	at accord.utils.Property$SingleBuilder.check(Property.java:223)
	at accord.utils.Property$ForBuilder.check(Property.java:124)
	at org.apache.cassandra.repair.ConcurrentIrWithPreviewFuzzTest.concurrentIrWithPreview(ConcurrentIrWithPreviewFuzzTest.java:46)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
{code}

From FailedAckTest:
{code:java}
accord.utils.Property$PropertyError: Property error detected:
Seed = -6503298441918509340
Examples = 10
Pure = false
Error: java.lang.NullPointerException
Values:
	0 =