[jira] [Updated] (CASSANDRASC-32) Sidecar health checks are failing since CassandraAdaptorDelegate is not started
[ https://issues.apache.org/jira/browse/CASSANDRASC-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRASC-32: Status: Ready to Commit (was: Review In Progress) > Sidecar health checks are failing since CassandraAdaptorDelegate is not > started > --- > > Key: CASSANDRASC-32 > URL: https://issues.apache.org/jira/browse/CASSANDRASC-32 > Project: Sidecar for Apache Cassandra > Issue Type: Bug > Components: Rest API >Reporter: Saranya Krishnakumar >Assignee: Saranya Krishnakumar >Priority: Normal > > CassandraAdaptorDelegate class in Sidecar periodically checks if it is able > to connect to Cassandra instance. Currently we are not starting this delegate > and hence Sidecar health checks are failing. We need to start the delegate > while starting the server > > https://github.com/apache/cassandra-sidecar/pull/24 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRASC-32) Sidecar health checks are failing since CassandraAdaptorDelegate is not started
[ https://issues.apache.org/jira/browse/CASSANDRASC-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437115#comment-17437115 ] Dinesh Joshi commented on CASSANDRASC-32: - +1
[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437077#comment-17437077 ]

David Capwell commented on CASSANDRA-17085:
---
OK, I think I figured out the issue: we have no non-daemon threads left running. I updated drain to log threads:
{code}
protected synchronized void drain(boolean isFinalShutdown) throws IOException, InterruptedException, ExecutionException
{
    logger.info("drain({})", isFinalShutdown);
    if (isFinalShutdown)
    {
        // Collect the names of all non-daemon threads still alive at final shutdown
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        List<String> nonDaemonThreads = new ArrayList<>();
        for (Entry<Thread, StackTraceElement[]> e : traces.entrySet())
        {
            if (e.getKey().isDaemon())
                continue;
            nonDaemonThreads.add(e.getKey().getName());
        }
        logger.info("Non-daemon threads: {}", nonDaemonThreads);
    }
{code}
This produces
{code}
INFO [StorageServiceShutdownHook] 2021-11-01 23:54:50,181 StorageService.java:5003 - Non-daemon threads: [DestroyJavaVM, StorageServiceShutdownHook]
{code}
> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 4.x
>
> Right now bootstrap tests are failing every time we run, this work is to
> debug and fix the underlying issue.
> Examples: > https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089 > {code} > > node3.nodetool('bootstrap resume') > bootstrap_test.py:1014: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool > return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', > '-p', str(self.jmx_port)] + shlex.split(cmd)) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > process = > cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...] > def handle_external_tool_process(process, cmd_args): > out, err = process.communicate() > if (out is not None) and isinstance(out, bytes): > out = out.decode() > if (err is not None) and isinstance(err, bytes): > err = err.decode() > rc = process.returncode > > if rc != 0: > > raise ToolError(cmd_args, rc, out, err) > E ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', > '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit > status: 1; > E stderr: nodetool: Failed to connect to 'localhost:7300' - > EOFException: 'null'. 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError > {code} > https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087 > {code} > > node1.start() > bootstrap_test.py:483: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start > node.watch_log_for_alive(self, from_mark=mark) > ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in > watch_log_for_alive > self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, > filename=filename) > ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for > head=reads[:50], tail="..."+reads[len(reads)-150:])) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > start = 1635453190.3118386, timeout = 120 > msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n > Head: \n Tail: ..." > node = 'node3' > @staticmethod > def raise_if_passed(start, timeout, msg, node=None): > if start + timeout < time.time(): > > raise TimeoutError.create(start, timeout, msg, node) > E ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after > 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in > system.log: > EHead: > ETail: ... > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437072#comment-17437072 ]

David Capwell commented on CASSANDRA-17085:
---
Added the following to CassandraDaemon to detect whether System.exit is getting called; it doesn't look to be.
{code}
System.setSecurityManager(new SecurityManager()
{
    public void checkExit(int status)
    {
        System.err.println("System.exit(" + status + ") called");
        new Throwable("System.exit(" + status + ") called").printStackTrace();
    }
});
{code}
So something else looks to be causing the JVM to halt.
[jira] [Comment Edited] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437062#comment-17437062 ]

David Capwell edited comment on CASSANDRA-17085 at 11/1/21, 11:21 PM:
--
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled is more complex, it seems: something is causing the JVM to do a graceful shutdown. I see the following in the node3 logs:
{code}
WARN [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:663 - Not starting client transports in write_survey mode as it's bootstrapping or auth is enabled
INFO [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:764 - Startup complete
INFO [StorageServiceShutdownHook] 2021-11-01 22:43:33,219 HintsService.java:233 - Paused hints dispatch
{code}
I enhanced the test to log its progress, so I can correlate test and server logs:
{code}
22:43:35,415 bootstrap_test INFO Attempting to resume bootstrap on node3
{code}
So we start a graceful shutdown at 22:43:33,219, and the test tries to resume bootstrap at 22:43:35,415, about 2 seconds AFTER shutdown started. I don't see the test doing a shutdown, and this fails 100% of the time for me, so something always seems to trigger a graceful shutdown...

was (Author: dcapwell):
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled is more complex, it seems: something is causing the JVM to do a graceful shutdown. I see the following in the node3 logs:
{code}
INFO [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:764 - Startup complete
INFO [StorageServiceShutdownHook] 2021-11-01 22:43:33,219 HintsService.java:233 - Paused hints dispatch
{code}
I enhanced the test to log its progress, so I can correlate test and server logs:
{code}
22:43:35,415 bootstrap_test INFO Attempting to resume bootstrap on node3
{code}
So we start a graceful shutdown at 22:43:33,219, and the test tries to resume bootstrap at 22:43:35,415, about 2 seconds AFTER shutdown started. I don't see the test doing a shutdown, and this fails 100% of the time for me, so something always seems to trigger a graceful shutdown...
[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437062#comment-17437062 ]

David Capwell commented on CASSANDRA-17085:
---
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled is more complex, it seems: something is causing the JVM to do a graceful shutdown. I see the following in the node3 logs:
{code}
INFO [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:764 - Startup complete
INFO [StorageServiceShutdownHook] 2021-11-01 22:43:33,219 HintsService.java:233 - Paused hints dispatch
{code}
I enhanced the test to log its progress, so I can correlate test and server logs:
{code}
22:43:35,415 bootstrap_test INFO Attempting to resume bootstrap on node3
{code}
So we start a graceful shutdown at 22:43:33,219, and the test tries to resume bootstrap at 22:43:35,415, about 2 seconds AFTER shutdown started. I don't see the test doing a shutdown, and this fails 100% of the time for me, so something always seems to trigger a graceful shutdown...
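The gap between the shutdown hook and the resume attempt quoted in the comment above can be checked mechanically. A minimal sketch, assuming both logs print the time of day as `HH:MM:SS,mmm` on the same day and from the same clock; `parse_log_time` and `seconds_between` are hypothetical helper names, not part of dtest or ccm.

```python
from datetime import datetime


def parse_log_time(stamp: str) -> datetime:
    """Parse a time of day such as '22:43:33,219' (milliseconds after the comma)."""
    return datetime.strptime(stamp, "%H:%M:%S,%f")


def seconds_between(earlier: str, later: str) -> float:
    """Elapsed seconds from `earlier` to `later`, assuming both fall on the same day."""
    return (parse_log_time(later) - parse_log_time(earlier)).total_seconds()


# Timestamps copied from the node3 log and the test log quoted in the comment.
gap = seconds_between("22:43:33,219", "22:43:35,415")
print(f"resume attempted {gap:.3f}s after the shutdown hook started")
```

A positive result means the test action happened after the server-side event, which is the ordering the comment relies on.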
[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437032#comment-17437032 ]

David Capwell commented on CASSANDRA-17085:
---
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled also fails with the following:
{code}
try:
    node3.nodetool('join')
    pytest.fail('nodetool should have errored and failed to join ring')
except ToolError as t:
>   assert "Cannot join the ring until bootstrap completes" in t.stdout
E   assert 'Cannot join the ring until bootstrap completes' in ''
E    +  where '' = ToolError("Subprocess ['nodetool', '-h', 'localhost', '-p', '7300', 'join'] exited with non-zero status; exit status: 1; \nstderr: nodetool: Failed to connect to 'localhost:7300' - SocketException: 'Connection reset'.\n",).stdout
{code}
In both cases it looks like an issue connecting to JMX. The node3 logs don't show the process going down during the test, so I'm not sure what's up yet. This is localhost networking, which shouldn't see random connection-close events, so I'm not sure what leads to this flaky behavior yet.
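Both failures reported so far share the same nodetool stderr prefix, so a test could tell "nodetool never reached JMX" apart from "nodetool ran and returned the expected error". A minimal sketch, assuming the "Failed to connect to" wording seen in the two stderr excerpts is stable; `is_jmx_connect_failure` is a hypothetical helper, not something dtest or ccm provides.

```python
def is_jmx_connect_failure(stderr: str) -> bool:
    """Return True when the stderr text says nodetool could not connect at all."""
    return "Failed to connect to" in stderr


# stderr lines copied from the two failures quoted in this thread.
samples = [
    "nodetool: Failed to connect to 'localhost:7300' - EOFException: 'null'.",
    "nodetool: Failed to connect to 'localhost:7300' - SocketException: 'Connection reset'.",
]
for line in samples:
    print(is_jmx_connect_failure(line), line)
```

Reporting the connection failure explicitly would make the assertion error point at the node going away rather than at an empty stdout.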
[jira] [Assigned] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell reassigned CASSANDRA-17085: - Assignee: David Capwell
[jira] [Commented] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
[ https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437020#comment-17437020 ]

David Capwell commented on CASSANDRA-17081:
---
Here is the section of code failing in ccm:
https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L890-L895
{code}
if common.is_int_not_bool(wait_other_notice):
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark, timeout=wait_other_notice)
elif wait_other_notice:
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark)
{code}
marks is defined as follows (the issue is here):
https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L772-L775
{code}
if wait_other_notice:
    marks = [(node, node.mark_log()) for node in list(self.cluster.nodes.values()) if node.is_live()]
else:
    marks = []
{code}
The node.is_live() check returns true in some cases for node3 (the node that failed to start up), which causes ccm to watch node3's logs for node1 to show up. Since node3 is actually down, its logs will never show node1, which leads to a timeout.
> Fix test:
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Josh McKenzie
> Assignee: David Capwell
> Priority: Normal
> Fix For: NA
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
> > 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c > Tail: ...b336de0e72/nb-1-big-Data.db > ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 > StorageService.java:483 - Stopping gossiper > [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483> > > > > ] > test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the > required 1 times. > > 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: > Tail: ... > [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483> > > > > ] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
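The effect of the is_live() filter described in the comment above can be reproduced in isolation. A minimal sketch, assuming nothing about ccm beyond the list comprehension quoted in the comment; `FakeNode` and `build_marks` are hypothetical stand-ins, not the real ccmlib classes.

```python
class FakeNode:
    """Hypothetical stand-in for a ccm node; only what the filter touches."""

    def __init__(self, name, live):
        self.name = name
        self.live = live

    def is_live(self):
        return self.live

    def mark_log(self):
        # The real call returns a log position; a constant is enough here.
        return 0


def build_marks(nodes, wait_other_notice=True):
    """Same shape as the quoted ccm snippet: mark every node that reports live."""
    if wait_other_notice:
        return [(node, node.mark_log()) for node in nodes if node.is_live()]
    return []


node2 = FakeNode("node2", live=True)
# node3 failed to start but, as in the failing runs, still reports itself live.
node3 = FakeNode("node3", live=True)

watched = [node.name for node, _ in build_marks([node2, node3])]
print(watched)
```

Because node3 ends up in the watch set, the caller would wait on a log that will never mention node1, which matches the timeout in the report.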
[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
[ https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-17081: -- Test and Documentation Plan: ran locally and in CI Status: Patch Available (was: In Progress)
[jira] [Commented] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
[ https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437015#comment-17437015 ]

David Capwell commented on CASSANDRA-17081:
---

Posted the cause here https://issues.apache.org/jira/browse/CASSANDRA-17085?focusedCommentId=17436995=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17436995

Here is what I am seeing in bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}
node1.start checks all the alive nodes (according to ccm) to see if node1 is seen as up in the logs. node3 is dead (or dying), so it should not be included in the watch set.

I was able to repro the issue when I limit the environment to 2 cores; trying a patch where we force shutdown node3 before starting node1 to avoid ccm checking node3's logs.

> Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Josh McKenzie
> Assignee: David Capwell
> Priority: Normal
> Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
>
> 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:
> Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
> Tail: ...b336de0e72/nb-1-big-Data.db
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 StorageService.java:483 - Stopping gossiper
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
>
>
>
> ]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the required 1 times.
>
> 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:
> Head:
> Tail: ...
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
>
>
>
> ]
> {code}
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
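The race described in the comment above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: FakeNode and watch_set are hypothetical stand-ins invented for illustration and are not the ccmlib API; the only thing taken from the comment is the idea that a node still considered alive ends up in the set of logs that are watched when node1 starts, and that forcing node3 down first removes it from that set.

```python
class FakeNode:
    """Hypothetical stand-in for a ccm node; NOT the real ccmlib API."""

    def __init__(self, name, running=False):
        self.name = name
        self.running = running

    def is_running(self):
        return self.running

    def stop(self):
        # Models the "force shutdown node3" step proposed in the comment.
        self.running = False


def watch_set(cluster, starting):
    """Names of nodes whose logs would be checked for '<starting> is now UP'."""
    return [n.name for n in cluster if n is not starting and n.is_running()]


node1 = FakeNode("node1")                 # about to be restarted
node2 = FakeNode("node2", running=True)   # healthy peer
node3 = FakeNode("node3", running=True)   # failed bootstrap, process still dying
cluster = [node1, node2, node3]

before = watch_set(cluster, node1)   # node3 is still counted as alive
node3.stop()                         # shut node3 down before starting node1
after = watch_set(cluster, node1)    # node3 is no longer watched

print("watched before stop:", before)
print("watched after stop: ", after)
```

In this model the timeout goes away because the dying node is never asked to report that node1 came up; whether the actual patch does exactly this is not shown in the thread.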
[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
[ https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-17081: -- Description: Seeing in circle and locally on trunk: Looks like it's timing out waiting for the bootstrap to complete. {code:java} test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2). 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log: Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c Tail: ...b336de0e72/nb-1-big-Data.db ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 StorageService.java:483 - Stopping gossiper [ ] test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the required 1 times. 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log: Head: Tail: ... [ ] {code} was: Seeing in circle and locally on trunk: Looks like it's timing out waiting for the bootstrap to complete. {code:java} test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2). 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log: Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c Tail: ...b336de0e72/nb-1-big-Data.db ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 StorageService.java:483 - Stopping gossiper [, , , , ] test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the required 1 times. 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log: Head: Tail: ... 
[, , , , ] {code} > Fix test: > bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state > - > > Key: CASSANDRA-17081 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17081 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Josh McKenzie >Assignee: David Capwell >Priority: Normal > Fix For: NA > > > Seeing in circle and locally on trunk: > Looks like it's timing out waiting for the bootstrap to complete. > {code:java} > test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2). > > 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c > Tail: ...b336de0e72/nb-1-big-Data.db > ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 > StorageService.java:483 - Stopping gossiper > [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483> > > > > ] > test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the > required 1 times. > > 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: > Tail: ... > [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483> > > > > ] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
[ https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-17081: -- Bug Category: Parent values: Correctness(12982)Level 1 values: Test Failure(12990) Complexity: Low Hanging Fruit Component/s: Test/dtest/python Discovered By: Unit Test Fix Version/s: NA Severity: Low Assignee: David Capwell Status: Open (was: Triage Needed) > Fix test: > bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state > - > > Key: CASSANDRA-17081 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17081 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Josh McKenzie >Assignee: David Capwell >Priority: Normal > Fix For: NA > > > Seeing in circle and locally on trunk: > Looks like it's timing out waiting for the bootstrap to complete. > {code:java} > test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2). > > 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c > Tail: ...b336de0e72/nb-1-big-Data.db > ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 > StorageService.java:483 - Stopping gossiper > [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>, /Users/jmckenzie/src/ccm/ccmlib/node.py:895>, /Users/jmckenzie/src/ccm/ccmlib/node.py:664>, /Users/jmckenzie/src/ccm/ccmlib/node.py:588>, /Users/jmckenzie/src/ccm/ccmlib/node.py:56>] > test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the > required 1 times. > > 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: > ['127.0.0.1:7000.* is now UP'] not found in system.log: > Head: > Tail: ... 
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>, /Users/jmckenzie/src/ccm/ccmlib/node.py:895>, /Users/jmckenzie/src/ccm/ccmlib/node.py:664>, /Users/jmckenzie/src/ccm/ccmlib/node.py:588>, /Users/jmckenzie/src/ccm/ccmlib/node.py:56>] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436995#comment-17436995 ]

David Capwell edited comment on CASSANDRA-17085 at 11/1/21, 7:35 PM:
-

Here is what I am seeing in bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}
node1.start checks all the alive nodes (according to ccm) to see if node1 is seen as up in the logs. node3 is dead (or dying), so it should not be included in the watch set.

I was able to repro the issue when I limit the environment to 2 cores; trying a patch where we force shutdown node3 before starting node1 to avoid ccm checking node3's logs.

was (Author: dcapwell):
Here is what I am seeing in bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}
node1.start checks all the alive nodes (according to ccm) to see if node1 is seen as up in the logs. node3 is dead (or dying), so it should not be included in the watch set.

> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Priority: Normal
> Fix For: 4.x
>
>
> Right now bootstrap tests are failing every time we run; this work is to
> debug and fix the underlying issue.
> Examples:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089
> {code}
> >       node3.nodetool('bootstrap resume')
> bootstrap_test.py:1014:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool
>     return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', '-p', str(self.jmx_port)] + shlex.split(cmd))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> process =
> cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...]
>     def handle_external_tool_process(process, cmd_args):
>         out, err = process.communicate()
>         if (out is not None) and isinstance(out, bytes):
>             out = out.decode()
>         if (err is not None) and isinstance(err, bytes):
>             err = err.decode()
>         rc = process.returncode
>
>         if rc != 0:
> >           raise ToolError(cmd_args, rc, out, err)
> E           ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit status: 1;
> E           stderr: nodetool: Failed to connect to 'localhost:7300' - EOFException: 'null'.
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError
> {code}
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087
> {code}
> >       node1.start()
> bootstrap_test.py:483:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start
>     node.watch_log_for_alive(self, from_mark=mark)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in watch_log_for_alive
>     self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, filename=filename)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for
>     head=reads[:50], tail="..."+reads[len(reads)-150:]))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> start = 1635453190.3118386, timeout = 120
> msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n Head: \n Tail: ..."
> node = 'node3'
>     @staticmethod
>     def raise_if_passed(start, timeout, msg, node=None):
>         if start + timeout < time.time():
> >           raise TimeoutError.create(start, timeout, msg, node)
> E           ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:
> E            Head:
> E            Tail: ...
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail:
[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436995#comment-17436995 ]

David Capwell commented on CASSANDRA-17085:
---

Here is what I am seeing in bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}
node1.start checks all the alive nodes (according to ccm) to see if node1 is seen as up in the logs. node3 is dead (or dying), so it should not be included in the watch set.

> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Priority: Normal
> Fix For: 4.x
>
>
> Right now bootstrap tests are failing every time we run; this work is to
> debug and fix the underlying issue.
> Examples:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089
> {code}
> >       node3.nodetool('bootstrap resume')
> bootstrap_test.py:1014:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool
>     return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', '-p', str(self.jmx_port)] + shlex.split(cmd))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> process =
> cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...]
>     def handle_external_tool_process(process, cmd_args):
>         out, err = process.communicate()
>         if (out is not None) and isinstance(out, bytes):
>             out = out.decode()
>         if (err is not None) and isinstance(err, bytes):
>             err = err.decode()
>         rc = process.returncode
>
>         if rc != 0:
> >           raise ToolError(cmd_args, rc, out, err)
> E           ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit status: 1;
> E           stderr: nodetool: Failed to connect to 'localhost:7300' - EOFException: 'null'.
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError
> {code}
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087
> {code}
> >       node1.start()
> bootstrap_test.py:483:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start
>     node.watch_log_for_alive(self, from_mark=mark)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in watch_log_for_alive
>     self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, filename=filename)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for
>     head=reads[:50], tail="..."+reads[len(reads)-150:]))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> start = 1635453190.3118386, timeout = 120
> msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n Head: \n Tail: ..."
> node = 'node3'
>     @staticmethod
>     def raise_if_passed(start, timeout, msg, node=None):
>         if start + timeout < time.time():
> >           raise TimeoutError.create(start, timeout, msg, node)
> E           ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:
> E            Head:
> E            Tail: ...
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
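The second traceback above ends in a deadline check of the form "start + timeout < time.time()". The following is a minimal sketch of that polling pattern, not a copy of ccmlib: LogTimeout, FakeClock, the poll parameter and the read_log callable are assumptions made up for illustration, and only the shape of raise_if_passed and the log pattern come from the quoted output.

```python
import re
import time


class LogTimeout(Exception):
    """Hypothetical exception: the pattern never showed up before the deadline."""


def raise_if_passed(start, timeout, msg, node=None, now=time.time):
    # Same shape as the check in the traceback: fail once start + timeout is in the past.
    if start + timeout < now():
        raise LogTimeout(f"[{node}] after {now() - start:.2f}/{timeout} seconds {msg}")


def watch_log_for(pattern, read_log, timeout, node=None, poll=0.1,
                  now=time.time, sleep=time.sleep):
    """Poll read_log() until pattern matches, or raise LogTimeout."""
    start = now()
    regex = re.compile(pattern)
    while True:
        if regex.search(read_log()):
            return True
        raise_if_passed(start, timeout,
                        f"Missing: [{pattern!r}] not found in system.log", node, now)
        sleep(poll)


class FakeClock:
    """Deterministic clock so the demo does not really wait 120 seconds."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


pattern = "127.0.0.1:7000.* is now UP"

# A live node eventually logs the message, so the watch returns.
clock = FakeClock()
found = watch_log_for(pattern, lambda: "INFO Node /127.0.0.1:7000 is now UP",
                      120, "node2", now=clock.now, sleep=clock.sleep)

# A dead node never logs it, so the watch can only end in a timeout.
clock = FakeClock()
try:
    watch_log_for(pattern, lambda: "", 120, "node3", poll=1.0,
                  now=clock.now, sleep=clock.sleep)
    timed_out = False
except LogTimeout as exc:
    timed_out = True
    print(exc)
```

The sketch makes the failure mode in the ticket easy to see: if a dead node is in the watch set, nothing it does can satisfy the pattern, so the full timeout always elapses.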
[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-12106: -- Reviewers: Aleksei Zotov, Sumanth Pasupuleti (was: Aleksei Zotov, Dinesh Joshi, Sumanth Pasupuleti) > Add ability to blocklist / denylist a CQL partition so all requests are > ignored > --- > > Key: CASSANDRA-12106 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12106 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Local Write-Read Paths, Local/Config >Reporter: Geoffrey Yu >Assignee: Josh McKenzie >Priority: Low > Fix For: 4.x > > Attachments: 12106-trunk.txt > > > Sometimes reads/writes to a given partition may cause problems due to the > data present. It would be useful to have a manual way to blocklist / denylist > such partitions so all read and write requests to them are rejected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-12106: -- Status: Ready to Commit (was: Review In Progress) > Add ability to blocklist / denylist a CQL partition so all requests are > ignored > --- > > Key: CASSANDRA-12106 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12106 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Local Write-Read Paths, Local/Config >Reporter: Geoffrey Yu >Assignee: Josh McKenzie >Priority: Low > Fix For: 4.x > > Attachments: 12106-trunk.txt > > > Sometimes reads/writes to a given partition may cause problems due to the > data present. It would be useful to have a manual way to blocklist / denylist > such partitions so all read and write requests to them are rejected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-12106: -- Fix Version/s: (was: 4.x) 4.1 Source Control Link: https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=ab920c30310a8c095ba76b363142b8e74cbf0a0a Resolution: Fixed Status: Resolved (was: Ready to Commit) > Add ability to blocklist / denylist a CQL partition so all requests are > ignored > --- > > Key: CASSANDRA-12106 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12106 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Local Write-Read Paths, Local/Config >Reporter: Geoffrey Yu >Assignee: Josh McKenzie >Priority: Low > Fix For: 4.1 > > Attachments: 12106-trunk.txt > > > Sometimes reads/writes to a given partition may cause problems due to the > data present. It would be useful to have a manual way to blocklist / denylist > such partitions so all read and write requests to them are rejected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-12106: -- Status: Review In Progress (was: Patch Available) > Add ability to blocklist / denylist a CQL partition so all requests are > ignored > --- > > Key: CASSANDRA-12106 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12106 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Local Write-Read Paths, Local/Config >Reporter: Geoffrey Yu >Assignee: Josh McKenzie >Priority: Low > Fix For: 4.x > > Attachments: 12106-trunk.txt > > > Sometimes reads/writes to a given partition may cause problems due to the > data present. It would be useful to have a manual way to blocklist / denylist > such partitions so all read and write requests to them are rejected. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Add a Denylist to block reads and writes on specific partition keys
This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new ab920c3  Add a Denylist to block reads and writes on specific partition keys
ab920c3 is described below

commit ab920c30310a8c095ba76b363142b8e74cbf0a0a
Author: Josh McKenzie
AuthorDate: Fri Sep 17 16:34:04 2021 -0400

    Add a Denylist to block reads and writes on specific partition keys

    Patch by Josh McKenzie, reviewed by Aleksei Zotov and Sumanth Pasupuleti for CASSANDRA-12106

    Co-authored by Josh McKenzie
    Co-authored by Sam Overton
---
 conf/cassandra.yaml                                 |  30 ++
 doc/source/operating/denylisting_partitions.rst     | 110 +
 doc/source/operating/index.rst                      |   1 +
 src/java/org/apache/cassandra/config/Config.java    |  32 ++
 .../cassandra/config/DatabaseDescriptor.java        |  98
 .../org/apache/cassandra/db/view/ViewBuilder.java   |   2 +-
 .../org/apache/cassandra/db/view/ViewManager.java   |   2 +-
 .../apache/cassandra/metrics/DenylistMetrics.java   |  58 +++
 .../org/apache/cassandra/repair/RepairJob.java      |   1 +
 .../apache/cassandra/repair/RepairRunnable.java     |  20 +-
 .../org/apache/cassandra/repair/RepairSession.java  |   1 +
 .../apache/cassandra/schema/PartitionDenylist.java  | 535 +
 .../apache/cassandra/schema/SchemaConstants.java    |   3 +
 .../SystemDistributedKeyspace.java                  |  33 +-
 .../org/apache/cassandra/service/StorageProxy.java  | 185 ++-
 .../cassandra/service/StorageProxyMBean.java        |  14 +
 .../apache/cassandra/service/StorageService.java    |   3 +
 .../service/reads/range/RangeCommands.java          |  26 +
 .../distributed/test/GossipSettlesTest.java         |   2 +-
 .../distributed/test/PartitionDenylistTest.java     | 155 ++
 .../distributed/test/metric/TableMetricTest.java    |   2 +-
 .../config/DatabaseDescriptorRefTest.java           |   3 +
 .../cassandra/config/DatabaseDescriptorTest.java    |  22 +
 .../cassandra/service/PartitionDenylistTest.java    | 495 +++
 24 files changed, 1800 insertions(+), 33 deletions(-)

diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 87df25a..65eb385 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -1026,6 +1026,36 @@ slow_query_log_timeout_in_ms: 500
 # bound (for example a few nodes with big files).
 # streaming_connections_per_host: 1

+# Allows denying configurable access (rw/rr) to operations on configured ks, table, and partitions, intended for use by
+# operators to manage cluster health vs application access. See CASSANDRA-12106 and CEP-13 for more details.
+# enable_partition_denylist = false;
+
+# enable_denylist_writes = true;
+# enable_denylist_reads = true;
+# enable_denylist_range_reads = true;
+
+# The interval at which keys in the cache for denylisting will "expire" and async refresh from the backing DB.
+# Note: this serves only as a fail-safe, as the usage pattern is expected to be "mutate state, refresh cache" on any
+# changes to the underlying denylist entries. See documentation for details.
+# denylist_refresh_seconds = 600;
+
+# In the event of errors on attempting to load the denylist cache, retry on this interval.
+# denylist_initial_load_retry_seconds = 5;
+
+# We cap the number of denylisted keys allowed per table to keep things from growing unbounded. Nodes will warn above
+# this limit while allowing new denylisted keys to be inserted. Denied keys are loaded in natural query / clustering
+# ordering by partition key in case of overflow.
+# denylist_max_keys_per_table = 1000;
+
+# We cap the total number of denylisted keys allowed in the cluster to keep things from growing unbounded.
+# Nodes will warn on initial cache load that there are too many keys and be direct the operator to trim down excess
+# entries to within the configured limits.
+# denylist_max_keys_total = 1;
+
+# Since the denylist in many ways serves to protect the health of the cluster from partitions operators have identified
+# as being in a bad state, we usually want more robustness than just CL.ONE on operations to/from these tables to
+# ensure that these safeguards are in place. That said, we allow users to configure this if they're so inclined.
+# denylist_consistency_level = ConsistencyLevel.QUORUM;

 # phi value that must be reached for a host to be marked down.
 # most users should never need to adjust this.
diff --git a/doc/source/operating/denylisting_partitions.rst b/doc/source/operating/denylisting_partitions.rst
new file mode 100644
index 000..3e70f2d
--- /dev/null
+++ b/doc/source/operating/denylisting_partitions.rst
@@ -0,0 +1,110 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements. See the NOTICE file
+.. distributed with this
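The per-table and total caps described in the cassandra.yaml comments of the diff above can be sketched as follows. This is an illustrative model only, not the PartitionDenylist implementation from the commit: load_denylist, its parameters and the warning strings are hypothetical, the sketch assumes overflow is handled by keeping the first keys in sorted order (a simplification of "natural query / clustering ordering"), and it assumes the total cap only produces a warning, since the quoted comments do not say what else happens.

```python
from collections import defaultdict


def load_denylist(rows, max_keys_per_table, max_keys_total):
    """Group (keyspace, table, key) rows and apply the two caps.

    Hypothetical helper for illustration; returns (denied, warnings).
    """
    by_table = defaultdict(list)
    for keyspace, table, key in rows:
        by_table[(keyspace, table)].append(key)

    denied, warnings = {}, []
    for table_id in sorted(by_table):
        keys = sorted(by_table[table_id])
        if len(keys) > max_keys_per_table:
            # Assumption: warn, then keep only the first keys in sorted order.
            warnings.append(f"{table_id}: {len(keys)} keys exceeds per-table cap "
                            f"{max_keys_per_table}")
            keys = keys[:max_keys_per_table]
        denied[table_id] = keys

    total = sum(len(keys) for keys in by_table.values())
    if total > max_keys_total:
        # Assumption: the total cap only warns so an operator can trim entries.
        warnings.append(f"{total} denylisted keys exceeds total cap {max_keys_total}")
    return denied, warnings


rows = [("ks", "t1", 3), ("ks", "t1", 1), ("ks", "t1", 2), ("ks", "t2", 9)]
denied, warnings = load_denylist(rows, max_keys_per_table=2, max_keys_total=3)
for line in warnings:
    print("WARN", line)
```

The caps are passed in explicitly rather than given defaults because the value of denylist_max_keys_total is not legible in the archived diff.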
[jira] [Commented] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows
[ https://issues.apache.org/jira/browse/CASSANDRA-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436872#comment-17436872 ]

Andres de la Peña commented on CASSANDRA-17113:
---

Here are some examples of the CI runs generated by different combinations of flags:
* [{{.circleci/generate.sh -p}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-p]
* [{{.circleci/generate.sh -pr}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-pr]
* [{{.circleci/generate.sh -s}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-s]
* [{{.circleci/generate.sh -sr}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-sr]
* [{{.circleci/generate.sh -sp}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-sp]

> Add flags to CircleCI generation script to setup the workflows
> --
>
> Key: CASSANDRA-17113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
> Project: Cassandra
> Issue Type: Task
> Components: CI
> Reporter: Andres de la Peña
> Assignee: Andres de la Peña
> Priority: Normal
>
> CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of
> j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches
> that are mostly ready for commit, and they have a single approval step on the
> Circle GUI to run the most relevant tests. The {{separate_tests}} workflows
> are meant for intermediate commits and special cases such as fixing flaky
> tests, and every test group requires manual approval on the Circle GUI. Both
> pairs of workflows are always created, so every commit/push creates the four
> workflows. None of these workflows runs anything unless manually approved on
> the GUI, not even the build.
> This ticket is a follow-up to those changes, and it aims to implement [this
> suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x]
> about having a script that enables the relevant workflows for each use case.
> I have modified the existing {{.circleci/generate.sh}} script to be able to
> generate different workflows.
> The new {{-p}} flag generates only the pre-commit workflows, whereas the
> {{-s}} flag generates only the workflows with separate approval steps for
> each test job. Both flags can be used together to generate the two pairs of
> workflows ({{-ps}}, {{-sp}}, etc.). The default option is generating all the
> workflows, so users can decide which workflows they are going to use in the
> CircleCI GUI after pushing their changes. We can easily change the workflows
> that are generated by default to use the single pair of workflows that we
> think is better.
> Additionally, there is a {{-r}} flag that disables the first approval step of
> the generated workflows. For the {{separate_tests}} workflows it means that
> the build is automatically run, but the individual steps still need to be
> manually approved in the GUI. For the {{pre-commit_tests}} workflows, the
> {{-r}} flag will automatically run the build and the most relevant tests.
> For example, users pushing a mostly ready patch and wanting to run the tests
> at maximum speed would probably want to generate their config file with
> {{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see
> CASSANDRA-16871).

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows
[ https://issues.apache.org/jira/browse/CASSANDRA-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andres de la Peña updated CASSANDRA-17113:
--
Change Category: Quality Assurance
Complexity: Normal
Status: Open (was: Triage Needed)

> Add flags to CircleCI generation script to setup the workflows
> --
>
> Key: CASSANDRA-17113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
> Project: Cassandra
> Issue Type: Task
> Components: CI
> Reporter: Andres de la Peña
> Assignee: Andres de la Peña
> Priority: Normal
>
> CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of
> j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches
> that are mostly ready for commit, and they have a single approval step on the
> Circle GUI to run the most relevant tests. The {{separate_tests}} workflows
> are meant for intermediate commits and special cases such as fixing flaky
> tests, and every test group requires manual approval on the Circle GUI. Both
> pairs of workflows are always created, so every commit/push creates the four
> workflows. None of these workflows runs anything unless manually approved on
> the GUI, not even the build.
> This ticket is a follow-up to those changes, and it aims to implement [this
> suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x]
> about having a script that enables the relevant workflows for each use case.
> I have modified the existing {{.circleci/generate.sh}} script to be able to
> generate different workflows.
> The new {{-p}} flag generates only the pre-commit workflows, whereas the
> {{-s}} flag generates only the workflows with separate approval steps for
> each test job. Both flags can be used together to generate the two pairs of
> workflows ({{-ps}}, {{-sp}}, etc.). The default option is generating all the
> workflows, so users can decide which workflows they are going to use in the
> CircleCI GUI after pushing their changes. We can easily change the workflows
> that are generated by default to use the single pair of workflows that we
> think is better.
> Additionally, there is a {{-r}} flag that disables the first approval step of
> the generated workflows. For the {{separate_tests}} workflows it means that
> the build is automatically run, but the individual steps still need to be
> manually approved in the GUI. For the {{pre-commit_tests}} workflows, the
> {{-r}} flag will automatically run the build and the most relevant tests.
> For example, users pushing a mostly ready patch and wanting to run the tests
> at maximum speed would probably want to generate their config file with
> {{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see
> CASSANDRA-16871).

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows
Andres de la Peña created CASSANDRA-17113:
---

             Summary: Add flags to CircleCI generation script to setup the workflows
                 Key: CASSANDRA-17113
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
             Project: Cassandra
          Issue Type: Task
          Components: CI
            Reporter: Andres de la Peña
            Assignee: Andres de la Peña

CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches that are mostly ready for commit, and they have a single approval step on the Circle GUI to run the most relevant tests. The {{separate_tests}} workflows are meant for intermediate commits and special cases such as fixing flaky tests, and every test group requires manual approval on the Circle GUI. Both pairs of workflows are always created, so every commit/push creates the four workflows. None of these workflows runs anything unless manually approved on the GUI, not even the build.

This ticket is a follow-up to those changes, and it aims to implement [this suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x] about having a script that enables the relevant workflows for each use case.

I have modified the existing {{.circleci/generate.sh}} script to be able to generate different workflows. The new {{-p}} flag generates only the pre-commit workflows, whereas the {{-s}} flag generates only the workflows with separate approval steps for each test job. Both flags can be used together to generate the two pairs of workflows ({{-ps}}, {{-sp}}, etc.). The default option is to generate all the workflows, so users can decide which workflows they are going to use in the CircleCI GUI after pushing their changes. We can easily change the workflows that are generated by default to use the single pair of workflows that we think is better.

Additionally, there is a {{-r}} flag that disables the first approval step of the generated workflows. For the {{separate_tests}} workflows it means that the build is automatically run, but the individual steps still need to be manually approved in the GUI. For the {{pre-commit_tests}} workflows, the {{-r}} flag will automatically run the build and the most relevant tests.

For example, users pushing a mostly ready patch and wanting to run the tests at maximum speed would probably want to generate their config file with {{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see CASSANDRA-16871).
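The flag semantics described above can be illustrated with a small bash sketch. It is not the actual {{.circleci/generate.sh}}: the function name {{select_workflows}}, its variables, and its output format are hypothetical, and it only models which workflows the {{-p}}, {{-s}}, {{-r}} and {{-h}} flags from this ticket would select, assuming the default (neither {{-p}} nor {{-s}}) generates both pairs.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the real .circleci/generate.sh: it only models
# how the flags described in the ticket select the generated workflows.
select_workflows() {
  local OPTIND=1 opt
  local pre_commit=false separate=false skip_first_approval=false highres=false
  while getopts "psrh" opt; do
    case "$opt" in
      p) pre_commit=true ;;           # only the pre-commit workflows
      s) separate=true ;;             # only the separate-approval workflows
      r) skip_first_approval=true ;;  # disable the first approval step
      h) highres=true ;;              # HIGHRES config (CASSANDRA-16871)
      *) return 1 ;;
    esac
  done
  # Default described in the ticket: with neither -p nor -s, generate everything.
  if [ "$pre_commit" = false ] && [ "$separate" = false ]; then
    pre_commit=true
    separate=true
  fi
  local workflows=""
  if [ "$pre_commit" = true ]; then workflows="pre-commit_tests"; fi
  if [ "$separate" = true ]; then workflows="${workflows:+$workflows,}separate_tests"; fi
  echo "workflows=$workflows skip_first_approval=$skip_first_approval highres=$highres"
}

select_workflows -hpr   # mostly ready patch, maximum speed
select_workflows -sp    # both pairs, flags combined in any order
select_workflows        # default: both pairs
```

Combined flags such as {{-ps}} and {{-sp}} behave identically here because {{getopts}} processes each letter independently, which matches the ticket's statement that both flags can be used together.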
[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1
[ https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marten Kenbeek updated CASSANDRA-14113:
---
    Attachment:     (was: 14133-3.0.txt)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
>                 Key: CASSANDRA-14113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>         Environment: Tables have been created in 2.2.11 using thrift and have supercolumns
>            Reporter: Guillaume Herail
>            Assignee: Marten Kenbeek
>            Priority: Normal
>              Labels: supercolumns
>         Attachments: 14113-3.0.txt, data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 3.11.1. The tables have been created using thrift and have supercolumns. When I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.(BufferCell.java:42)
>   at org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 instead and had a different error:
> {noformat}
> ERROR 11:00:40 Exception
[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1
[ https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marten Kenbeek updated CASSANDRA-14113:
---
    Attachment: 14113-3.0.txt

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
>                 Key: CASSANDRA-14113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>         Environment: Tables have been created in 2.2.11 using thrift and have supercolumns
>            Reporter: Guillaume Herail
>            Assignee: Marten Kenbeek
>            Priority: Normal
>              Labels: supercolumns
>         Attachments: 14113-3.0.txt, data.tar.gz
[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1
[ https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marten Kenbeek updated CASSANDRA-14113:
---
    Test and Documentation Plan: Unit tests included
                         Status: Patch Available  (was: Open)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
>                 Key: CASSANDRA-14113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>         Environment: Tables have been created in 2.2.11 using thrift and have supercolumns
>            Reporter: Guillaume Herail
>            Assignee: Marten Kenbeek
>            Priority: Normal
>              Labels: supercolumns
>         Attachments: 14133-3.0.txt, data.tar.gz
[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1
[ https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marten Kenbeek updated CASSANDRA-14113:
---
    Attachment: 14133-3.0.txt

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
>                 Key: CASSANDRA-14113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>         Environment: Tables have been created in 2.2.11 using thrift and have supercolumns
>            Reporter: Guillaume Herail
>            Assignee: Marten Kenbeek
>            Priority: Normal
>              Labels: supercolumns
>         Attachments: 14133-3.0.txt, data.tar.gz
[jira] [Assigned] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1
[ https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marten Kenbeek reassigned CASSANDRA-14113:
---
    Assignee: Marten Kenbeek  (was: Benjamin Lerer)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
>                 Key: CASSANDRA-14113
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>         Environment: Tables have been created in 2.2.11 using thrift and have supercolumns
>            Reporter: Guillaume Herail
>            Assignee: Marten Kenbeek
>            Priority: Normal
>              Labels: supercolumns
>         Attachments: data.tar.gz