[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609548#comment-17609548 ] Ekaterina Dimitrova commented on CASSANDRA-16061: - Seen on trunk: https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/1948/workflows/b7f46851-91bd-485d-aff3-c7d2150cb401/jobs/15425/tests#failed-test-0 > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482693#comment-17482693 ] Josh McKenzie commented on CASSANDRA-16061: --- Possibly related: [https://ci-cassandra.apache.org/job/Cassandra-4.0/308/testReport/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/] {code:java} Error Message ccmlib.node.TimeoutError: 12 Jan 2022 00:01:53 [node1] after 120.13/120 seconds Missing: ['127.0.0.4:7000.* is now UP'] not found in system.log: Head: INFO [Messaging-EventLoop-3-5] 2022-01-11 23:59:5 Tail: 0.0.1:7000(/127.0.0.1:50014)->/127.0.0.4:7000-URGENT_MESSAGES-36cbc506 successfully connected, version = 12, framing = CRC, encryption = unencrypted Stacktrace self = @flaky(max_runs=1) @pytest.mark.no_vnodes def test_move_backwards_and_cleanup(self): """Test moving a node backwards without moving past a neighbor token""" move_token = '5' expected_after_move = [gen_expected(range(0, 6), range(31, 40)), gen_expected(range(0, 21, 2)), gen_expected(range(1, 6, 2), range(6, 31)), gen_expected(range(7, 20, 2), range(21, 40))] expected_after_repair = [gen_expected(range(0, 6), range(31, 40)), gen_expected(range(0, 21)), gen_expected(range(6, 31)), gen_expected(range(21, 40))] > self.move_test(move_token, expected_after_move, expected_after_repair) transient_replication_ring_test.py:335: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ transient_replication_ring_test.py:237: in move_test node4.start(wait_for_binary_proto=NODE_WAIT_TIMEOUT_IN_SECS * 2) transient_replication_ring_test.py:52: in new_start return old_start(*args, **kwargs) ../venv/lib/python3.8/site-packages/ccmlib/node.py:895: in start node.watch_log_for_alive(self, from_mark=mark) ../venv/lib/python3.8/site-packages/ccmlib/node.py:664: in watch_log_for_alive self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, filename=filename) ../venv/lib/python3.8/site-packages/ccmlib/node.py:588: in watch_log_for TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ start = 1641945593.295578, timeout = 120 msg = "Missing: ['127.0.0.4:7000.* is now UP'] not found in system.log:\n Head: INFO [Messaging-EventLoop-3-5] 2022-01-11 2...27.0.0.4:7000-URGENT_MESSAGES-36cbc506 successfully connected, version = 12, framing = CRC, encryption = unencrypted\n" node = 'node1' @staticmethod def raise_if_passed(start, timeout, msg, node=None): if start + timeout < time.time(): > raise TimeoutError.create(start, timeout, msg, node) E ccmlib.node.TimeoutError: 12 Jan 2022 00:01:53 [node1] after 120.13/120 seconds Missing: ['127.0.0.4:7000.* is now UP'] not found in system.log: EHead: INFO [Messaging-EventLoop-3-5] 2022-01-11 23:59:5 ETail: 0.0.1:7000(/127.0.0.1:50014)->/127.0.0.4:7000-URGENT_MESSAGES-36cbc506 successfully connected, version = 12, framing = CRC, encryption = unencrypted ../venv/lib/python3.8/site-packages/ccmlib/node.py:56: TimeoutError {code} > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17471678#comment-17471678 ] Berenguer Blasi commented on CASSANDRA-16061: - ^Reopening as it seems a valid one as per [~adelapena]'s comments. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461605#comment-17461605 ] Andres de la Peña commented on CASSANDRA-16061: --- Here is [another failure|https://app.circleci.com/pipelines/github/adelapena/cassandra/1224/workflows/3b3018d4-7584-4869-bff3-4c183da56980/jobs/11307], this time on trunk. It seems definitively flaky, although it's quite hard to reproduce. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461415#comment-17461415 ] Andres de la Peña commented on CASSANDRA-16061: --- This one has been reproduced in [this run|https://app.circleci.com/pipelines/github/adelapena/cassandra/1217/workflows/df29aa61-5171-4ee7-a214-d36d507ce5af/jobs/11212] while testing CASSANDRA-17195. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281598#comment-17281598 ] Berenguer Blasi commented on CASSANDRA-16061: - Anybody has seen this recently? > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17270749#comment-17270749 ] Michael Semb Wever commented on CASSANDRA-16061: Here's the link(s) based off the latest build (rather than a build number that will expire to 404 soon) - [trunk dtest-novnode|https://ci-cassandra.apache.org/job/Cassandra-trunk/lastCompletedBuild/testReport/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/] - [trunk dtest|https://ci-cassandra.apache.org/job/Cassandra-trunk/lastCompletedBuild/testReport/dtest.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/] - [trunk dtest-offheap|https://ci-cassandra.apache.org/job/Cassandra-trunk/lastCompletedBuild/testReport/dtest-offheap.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/] > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267565#comment-17267565 ] Ekaterina Dimitrova commented on CASSANDRA-16061: - Agreed, I also don't see it as a blocker. Being the reporter of this issue I just changed its Fix version to 4.x. We should try to figure out what is going on and we will do it, just changing priority. Thank you > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267057#comment-17267057 ] Berenguer Blasi commented on CASSANDRA-16061: - I wouldn't make this a 4.0 blocker since 2 people have had a go at it already, so it must be an obscure thing. Unless [~e.dimitrova] does her magic and finds it I would keep this one open but for a later version, as it's still there. At most a 'can't repro' resolution. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223750#comment-17223750 ] Blake Eggleston commented on CASSANDRA-16061: - I'm afraid I don't have any insight [~e.dimitrova], I wasn't involved in transient replication topology changes > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218462#comment-17218462 ] Ekaterina Dimitrova commented on CASSANDRA-16061: - Hey [~ifesdjeen], [~bdeggleston], I see you added this test some time ago and it was initially marked as flakey. The commit description points to CASSANDRA-14404 but I believe the test was added as part of different ticket which I wasn't able to identify to read the reasoning behind the initial flaky mark. Any thoughts and advice on this and the failures we see? Thanks in advance! > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218095#comment-17218095 ] Berenguer Blasi commented on CASSANDRA-16061: - SGTM. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217665#comment-17217665 ] Adam Holmberg commented on CASSANDRA-16061: --- I see. Did the thing you were chasing manifest the same symptoms? I did try reducing VM resources but just found it was creating timeouts due to trying to run four nodes in an unreasonable amount of resources. I think I would still advocate for increasing this timeout based on what we're seeing in ci-cassandra. In the mean time I will set back to open in case someone else wants to take a shot at reproducing. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217260#comment-17217260 ] Berenguer Blasi commented on CASSANDRA-16061: - Increasing the timeout didn't fix it. I managed to repro locally and indeed tested that with no luck. Try setting a VM and reduce RAM and CPU until you hit it. If I come back to this guy I will post exact repro steps. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217083#comment-17217083 ] Adam Holmberg commented on CASSANDRA-16061: --- I let this run for quite some time in the background and didn't reproduce for myself. There's not much to go on with the error Berenguer was describing. All failures, in including several on [ci-cassandra|https://ci-cassandra.apache.org/job/Cassandra-trunk/77/testReport/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/] right now, are simple client timeouts. For now I think we should just increase the timeout on that request. I'm open to suggestion on whether we should keep this open for a time, or reopen if it continues. [patch|https://github.com/apache/cassandra-dtest/compare/master...aholmberg:CASSANDRA-16061?expand=1] [ci|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-16061] > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193150#comment-17193150 ] Caleb Rackliffe commented on CASSANDRA-16061: - I'm seeing a failure in {{test_remove}} on trunk: https://app.circleci.com/pipelines/github/dcapwell/cassandra/502/workflows/026c9b94-bbbe-4984-9321-14420d699384/jobs/2743 If it doesn't look like the same root cause, I can throw up a different Jira. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189074#comment-17189074 ] Berenguer Blasi commented on CASSANDRA-16061: - This is the failure for reference as ci-cass logs are gone now: {noformat} ===Flaky Test Report=== test_move_forwards_and_cleanup failed; it passed 0 out of the required 1 times. 02 Sep 2020 07:18:39 [node4] Missing: ['Starting listening for CQL clients']: INFO [main] 2020-09-02 09:17:30,390 YamlConfigura. See system.log for remainder [, , , , , ] ===End Flaky Test Report=== {noformat} It's hard to repro. There is some exotic race where [boostrap.get()|https://github.com/apache/cassandra/blob/23ba48aa935d3f81e66b65285fa8e7972f94dcfe/src/java/org/apache/cassandra/service/StorageService.java#L1584] will block as a default connection never completes blocking [here|https://github.com/apache/cassandra/blob/23ba48aa935d3f81e66b65285fa8e7972f94dcfe/src/java/org/apache/cassandra/streaming/DefaultConnectionFactory.java#L50]. That just times out the test. That shouldn't be as the default connection has a built in timeout. Even forcing a timeout myself when waiting on it won't do the trick. Somehow connecting to node1 is not possible. I have been debugging this as much as I can. The netty code needs some time to penetrate and I don't have a full grasp of it despite what I saw made sense. {{bootstrap.get()}} blocks on AbstractFuture [parking|https://github.com/google/guava/blob/v27.0/guava/src/com/google/common/util/concurrent/AbstractFuture.java#L523] the thread. If you google a bit you'll find many people getting blocked threads around this area given guava makes some assumptions apparently. I am taking a break here as I am not progressing so I need to go back to the drawing board on how to approach this. If sbdy wants to try the challenge feel free to assign it. > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186452#comment-17186452 ] Berenguer Blasi commented on CASSANDRA-16061: - Seems this test can fail in a couple different ways depending on whether it runs on ci-cass or circle: - [circle|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] - [ci-cass|https://ci-cassandra.apache.org/job/Cassandra-trunk/294/testReport/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/] I can't repro the circle one, only the ci-cass one on a vm with constrained resources. Might give this one a try next week unless sbd beats me to it. My 2cts... > transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup > > > Key: CASSANDRA-16061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16061 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > Failing here, also locally: > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org