[jira] [Updated] (CASSANDRASC-32) Sidecar health checks are failing since CassandraAdaptorDelegate is not started

2021-11-01 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRASC-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRASC-32:

Status: Ready to Commit  (was: Review In Progress)

> Sidecar health checks are failing since CassandraAdaptorDelegate is not 
> started
> ---
>
> Key: CASSANDRASC-32
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-32
> Project: Sidecar for Apache Cassandra
>  Issue Type: Bug
>  Components: Rest API
>Reporter: Saranya Krishnakumar
>Assignee: Saranya Krishnakumar
>Priority: Normal
>
> The CassandraAdaptorDelegate class in Sidecar periodically checks whether it
> can connect to the Cassandra instance. Currently we do not start this
> delegate, so Sidecar health checks are failing. We need to start the delegate
> while starting the server.
>  
> https://github.com/apache/cassandra-sidecar/pull/24



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-32) Sidecar health checks are failing since CassandraAdaptorDelegate is not started

2021-11-01 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437115#comment-17437115
 ] 

Dinesh Joshi commented on CASSANDRASC-32:
-

+1







[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437077#comment-17437077
 ] 

David Capwell commented on CASSANDRA-17085:
---

OK, I think I figured out the issue: we have no non-daemon threads running,
so the JVM begins a normal shutdown on its own.

I updated drain to log the non-daemon threads:

{code}
protected synchronized void drain(boolean isFinalShutdown) throws IOException, InterruptedException, ExecutionException
{
    logger.info("drain({})", isFinalShutdown);
    if (isFinalShutdown)
    {
        // Snapshot every live thread and keep only the non-daemon ones
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        List<String> nonDaemonThreads = new ArrayList<>();
        for (Entry<Thread, StackTraceElement[]> e : traces.entrySet())
        {
            if (e.getKey().isDaemon())
                continue;
            nonDaemonThreads.add(e.getKey().getName());
        }
        logger.info("Non-daemon threads: {}", nonDaemonThreads);
    }
    // ... (rest of drain() not shown)
{code}

This produces

{code}
INFO  [StorageServiceShutdownHook] 2021-11-01 23:54:50,181 StorageService.java:5003 - Non-daemon threads: [DestroyJavaVM, StorageServiceShutdownHook]
{code}
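
To make that reading of the log line concrete, here is a minimal Python sketch that pulls the thread names out of a "Non-daemon threads" entry and filters out the threads that belong to JVM shutdown itself. The helper names and the JVM_SHUTDOWN_THREADS set are assumptions made for illustration (they are not part of Cassandra or the dtests); the set is based only on the two names seen in the output above.

```python
import re

# Threads that are part of JVM shutdown itself. This set is an assumption
# based on the log line above, not an exhaustive list.
JVM_SHUTDOWN_THREADS = {"DestroyJavaVM", "StorageServiceShutdownHook"}


def parse_non_daemon_threads(log_text):
    """Return the thread names from a 'Non-daemon threads: [...]' log entry."""
    # Archived log lines may be wrapped, so collapse whitespace before matching.
    flat = " ".join(log_text.split())
    match = re.search(r"Non-daemon threads: \[([^\]]*)\]", flat)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def application_threads(names):
    """Names that are not part of the JVM shutdown machinery."""
    return [name for name in names if name not in JVM_SHUTDOWN_THREADS]


sample = """INFO  [StorageServiceShutdownHook] 2021-11-01 23:54:50,181
StorageService.java:5003 - Non-daemon threads: [DestroyJavaVM,
StorageServiceShutdownHook]"""

names = parse_non_daemon_threads(sample)
print(names)
print(application_threads(names))
```

An empty second list means no application-owned non-daemon thread was left to keep the JVM alive, which is consistent with the JVM starting its shutdown sequence without anyone calling System.exit.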

> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>
> Right now the bootstrap tests are failing every time we run them; this work
> is to debug and fix the underlying issue.
> Examples:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089
> {code}
> >   node3.nodetool('bootstrap resume')
> bootstrap_test.py:1014: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool
> return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', 
> '-p', str(self.jmx_port)] + shlex.split(cmd))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> process = 
> cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...]
> def handle_external_tool_process(process, cmd_args):
> out, err = process.communicate()
> if (out is not None) and isinstance(out, bytes):
> out = out.decode()
> if (err is not None) and isinstance(err, bytes):
> err = err.decode()
> rc = process.returncode
> 
> if rc != 0:
> >   raise ToolError(cmd_args, rc, out, err)
> E   ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', 
> '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit 
> status: 1; 
> E   stderr: nodetool: Failed to connect to 'localhost:7300' - 
> EOFException: 'null'.
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError
> {code}
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087
> {code}
> >   node1.start()
> bootstrap_test.py:483: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start
> node.watch_log_for_alive(self, from_mark=mark)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in 
> watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for
> head=reads[:50], tail="..."+reads[len(reads)-150:]))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1635453190.3118386, timeout = 120
> msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n 
> Head: \n Tail: ..."
> node = 'node3'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after 
> 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in 
> system.log:
> E   Head: 
> E   Tail: ...
> {code}






[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437072#comment-17437072
 ] 

David Capwell commented on CASSANDRA-17085:
---

I added the following to CassandraDaemon to detect whether System.exit is
being called; it does not appear to be.

{code}
System.setSecurityManager(new SecurityManager()
{
    @Override
    public void checkExit(int status)
    {
        // Report every System.exit attempt together with the caller's stack trace
        System.err.println("System.exit(" + status + ") called");
        new Throwable("System.exit(" + status + ") called").printStackTrace();
    }
});
{code}

So something else looks to be causing the JVM to halt.







[jira] [Comment Edited] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437062#comment-17437062
 ] 

David Capwell edited comment on CASSANDRA-17085 at 11/1/21, 11:21 PM:
--

bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled seems to be
more complex: something is causing the JVM to do a graceful shutdown. I
see the following in the node3 logs:

{code}
WARN  [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:663 - Not starting client transports in write_survey mode as it's bootstrapping or auth is enabled
INFO  [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:764 - Startup complete
INFO  [StorageServiceShutdownHook] 2021-11-01 22:43:33,219 HintsService.java:233 - Paused hints dispatch
{code}

I enhanced the test to log where it is at, so I can correlate the test and
server logs:

{code}
22:43:35,415 bootstrap_test INFO Attempting to resume bootstrap on node3
{code}


So the graceful shutdown starts at 22:43:33,219, and the test tries to resume
bootstrap at 22:43:35,415, about 2 seconds AFTER shutdown began. I don't see
the test doing a shutdown, and this fails 100% of the time for me, so we
always seem to do a graceful shutdown for some reason...
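
The gap between the two log entries can be checked with a few lines of Python. This is only a sketch: it assumes both timestamps are from the same day and that the test runner and node3 share a clock (they run on the same host here). The function name is made up for illustration.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%H:%M:%S,%f"


def seconds_between(earlier, later):
    """Seconds elapsed between two same-day 'HH:MM:SS,mmm' log timestamps."""
    start = datetime.strptime(earlier, LOG_TIME_FORMAT)
    end = datetime.strptime(later, LOG_TIME_FORMAT)
    return (end - start).total_seconds()


shutdown_started = "22:43:33,219"  # StorageServiceShutdownHook line on node3
resume_attempted = "22:43:35,415"  # "Attempting to resume bootstrap" in the test log

gap = seconds_between(shutdown_started, resume_attempted)
print(f"resume attempted {gap:.3f}s after shutdown began")
```

A positive gap means the node was already shutting down when the test issued the resume, so the nodetool failure is a consequence of the shutdown rather than its cause.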




[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437062#comment-17437062
 ] 

David Capwell commented on CASSANDRA-17085:
---

bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled seems to be
more complex: something is causing the JVM to do a graceful shutdown. I
see the following in the node3 logs:

{code}
INFO  [main] 2021-11-01 22:43:33,212 CassandraDaemon.java:764 - Startup complete
INFO  [StorageServiceShutdownHook] 2021-11-01 22:43:33,219 HintsService.java:233 - Paused hints dispatch
{code}

I enhanced the test to log where it is at, so I can correlate the test and
server logs:

{code}
22:43:35,415 bootstrap_test INFO Attempting to resume bootstrap on node3
{code}


So the graceful shutdown starts at 22:43:33,219, and the test tries to resume
bootstrap at 22:43:35,415, about 2 seconds AFTER shutdown began. I don't see
the test doing a shutdown, and this fails 100% of the time for me, so we
always seem to do a graceful shutdown for some reason...







[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437032#comment-17437032
 ] 

David Capwell commented on CASSANDRA-17085:
---

bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled also fails 
with the following

{code}
        try:
            node3.nodetool('join')
            pytest.fail('nodetool should have errored and failed to join ring')
        except ToolError as t:
>           assert "Cannot join the ring until bootstrap completes" in t.stdout
E           assert 'Cannot join the ring until bootstrap completes' in ''
E            +  where '' = ToolError("Subprocess ['nodetool', '-h', 'localhost', '-p', '7300', 'join'] exited with non-zero status; exit status: 1; \nstderr: nodetool: Failed to connect to 'localhost:7300' - SocketException: 'Connection reset'.\n",).stdout
{code}

In both cases it looks like an issue connecting to JMX. The logs for node3
don't show the process going down during the test, so I'm not sure what's up
yet. This is localhost networking, so there shouldn't be random connection
close events; I don't yet know what leads to this flaky behavior.
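
One cheap way to narrow this down is to probe the JMX port directly from the test before calling nodetool. The sketch below is a plain TCP check with a made-up helper name; it is not part of ccm or the dtests. It can only tell "nothing is listening" apart from "something accepted the connection", so a True result does not prove that JMX itself is healthy (the EOFException and "Connection reset" errors above happen after the connection is accepted).

```python
import socket


def port_accepts_connections(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 7300 is the JMX port that appears in the failing nodetool command.
    print(port_accepts_connections("localhost", 7300))
```

Logging this result next to the test's own progress messages would show whether the port was still open at the moment nodetool failed.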







[jira] [Assigned] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell reassigned CASSANDRA-17085:
-

Assignee: David Capwell







[jira] [Commented] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437020#comment-17437020
 ] 

David Capwell commented on CASSANDRA-17081:
---

Here is the section of code failing in ccm

https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L890-L895

{code}
if common.is_int_not_bool(wait_other_notice):
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark, timeout=wait_other_notice)
elif wait_other_notice:
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark)
{code}

marks is defined as follows (the issue is here)

https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L772-L775

{code}
if wait_other_notice:
    marks = [(node, node.mark_log()) for node in list(self.cluster.nodes.values()) if node.is_live()]
else:
    marks = []
{code}

The node.is_live() check sometimes returns true for node3 (the node that
failed to start up), which causes ccm to watch node3's logs for node1 to show
up. Since node3 is actually down, its logs never report node1 as UP, which
leads to the timeout.
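
As a rough illustration of the problem, the sketch below builds the same kind of marks list but additionally requires that the node's process actually exists. Everything here is hypothetical: FakeNode is a stand-in rather than ccmlib.node.Node, pid_is_running and build_marks are made-up helpers, and the pid check assumes a POSIX system. It would only filter out a node whose process has already exited, not one that is still in the middle of shutting down.

```python
import os


def pid_is_running(pid):
    """Best-effort POSIX check that a process with this pid exists."""
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


class FakeNode:
    """Hypothetical stand-in for a ccm node; not the real ccmlib.node.Node."""

    def __init__(self, name, pid, reported_live):
        self.name = name
        self.pid = pid
        self._reported_live = reported_live

    def is_live(self):
        return self._reported_live

    def mark_log(self):
        return 0  # a real node would return its current log offset


def build_marks(nodes):
    """Same shape as the marks list above, but also require a running process."""
    return [(node, node.mark_log())
            for node in nodes
            if node.is_live() and pid_is_running(node.pid)]


node2 = FakeNode("node2", os.getpid(), reported_live=True)
node3 = FakeNode("node3", None, reported_live=True)  # reported live, but no process

watched = [node.name for node, _ in build_marks([node2, node3])]
print(watched)
```

With the extra check, node3 drops out of the watch set even though its is_live() still reports true.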

> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: David Capwell
>Priority: Normal
> Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
> 
> 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
> 
> 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> {code}
>  






[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

2021-11-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17081:
--
Test and Documentation Plan: ran locally and in CI
 Status: Patch Available  (was: In Progress)







[jira] [Commented] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437015#comment-17437015
 ] 

David Capwell commented on CASSANDRA-17081:
---

Posted the cause here 
https://issues.apache.org/jira/browse/CASSANDRA-17085?focusedCommentId=17436995&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17436995

Here is what I am seeing in 
bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}

node1.start() checks every node that ccm still considers alive and looks in 
each of their logs for a message that node1 is now UP.  node3 is dead (or 
dying) at this point, so it should not be included in the watch set.

I was able to repro the issue by limiting the environment to 2 cores.  I am 
trying a patch that force-shuts-down node3 before starting node1, so that ccm 
does not check node3's logs.
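
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: FakeNode and start_and_watch are hypothetical names invented for illustration and are not ccmlib's classes or functions; the real harness waits on log files and times out rather than returning a list.

```python
class FakeNode:
    """Toy stand-in for a ccm node (hypothetical; not ccmlib's Node class)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # an unhealthy node never logs "is now UP"
        self.running = False
        self.log = []


def start_and_watch(node, cluster):
    """Start `node`, then return the names of watched peers that never report it UP.

    In the real test a non-empty result would correspond to the 120 s timeout.
    """
    node.running = True
    # Watch set: every other node the harness still believes is running.
    watch_set = [n for n in cluster if n is not node and n.running]
    missing = []
    for peer in watch_set:
        if peer.healthy:
            peer.log.append(f"{node.name} is now UP")
        else:
            missing.append(peer.name)
    return missing


node1, node2 = FakeNode("node1"), FakeNode("node2")
node3 = FakeNode("node3", healthy=False)  # failed bootstrap, process lingering
cluster = [node1, node2, node3]
node2.running = True
node3.running = True  # still counted as alive by the harness

stuck_without_fix = start_and_watch(node1, cluster)  # node3 is in the watch set

node1.running = False
node3.running = False  # the proposed fix: force node3 down first
stuck_with_fix = start_and_watch(node1, cluster)  # node3 is no longer watched
```

In this model the first start reports node3 as the peer that never logs the message, and the second start, made after node3 is marked as stopped, reports nothing.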

> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: David Capwell
>Priority: Normal
> Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
> 
> 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
> 
> 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> {code}
>  






[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

2021-11-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17081:
--
Description: 
Seeing in circle and locally on trunk:

Looks like it's timing out waiting for the bootstrap to complete.
{code:java}
test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).

28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
['127.0.0.1:7000.* is now UP'] not found in system.log:
 Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
 Tail: ...b336de0e72/nb-1-big-Data.db 
ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
StorageService.java:483 - Stopping gossiper

[



]
test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
required 1 times.

28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
['127.0.0.1:7000.* is now UP'] not found in system.log:
 Head: 
 Tail: ...
[



]
{code}
 

  was:
Seeing in circle and locally on trunk:

Looks like it's timing out waiting for the bootstrap to complete.
{code:java}
test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).

28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
['127.0.0.1:7000.* is now UP'] not found in system.log:
 Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
 Tail: ...b336de0e72/nb-1-big-Data.db 
ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
StorageService.java:483 - Stopping gossiper

[, , , , ]
test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
required 1 times.

28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
['127.0.0.1:7000.* is now UP'] not found in system.log:
 Head: 
 Tail: ...
[, , , , ]
{code}
 


> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: David Capwell
>Priority: Normal
> Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
> 
> 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
> 
> 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> 
> 
> 
> ]
> {code}
>  






[jira] [Updated] (CASSANDRA-17081) Fix test: bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

2021-11-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-17081:
--
 Bug Category: Parent values: Correctness(12982) Level 1 values: Test Failure(12990)
   Complexity: Low Hanging Fruit
  Component/s: Test/dtest/python
Discovered By: Unit Test
Fix Version/s: NA
     Severity: Low
     Assignee: David Capwell
       Status: Open  (was: Triage Needed)

> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -
>
> Key: CASSANDRA-17081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: David Capwell
>Priority: Normal
> Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
> 
> 28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:895>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:664>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:588>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
> 
> 28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
> [ /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:895>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:664>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:588>,  /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> {code}
>  






[jira] [Comment Edited] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436995#comment-17436995
 ] 

David Capwell edited comment on CASSANDRA-17085 at 11/1/21, 7:35 PM:
-

Here is what I am seeing in 
bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}

node1.start() checks every node that ccm still considers alive and looks in 
each of their logs for a message that node1 is now UP.  node3 is dead (or 
dying) at this point, so it should not be included in the watch set.

I was able to repro the issue by limiting the environment to 2 cores.  I am 
trying a patch that force-shuts-down node3 before starting node1, so that ccm 
does not check node3's logs.


was (Author: dcapwell):
Here is what I am seeing in 
bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}

node1.start() checks every node that ccm still considers alive and looks in 
each of their logs for a message that node1 is now UP.  node3 is dead (or 
dying) at this point, so it should not be included in the watch set.

> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>
> Right now bootstrap tests are failing every time we run; this work is to 
> debug and fix the underlying issue.
> Examples:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089
> {code}
> >   node3.nodetool('bootstrap resume')
> bootstrap_test.py:1014: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool
> return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', 
> '-p', str(self.jmx_port)] + shlex.split(cmd))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> process = 
> cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...]
> def handle_external_tool_process(process, cmd_args):
>     out, err = process.communicate()
>     if (out is not None) and isinstance(out, bytes):
>         out = out.decode()
>     if (err is not None) and isinstance(err, bytes):
>         err = err.decode()
>     rc = process.returncode
> 
>     if rc != 0:
> >       raise ToolError(cmd_args, rc, out, err)
> E   ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', 
> '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit 
> status: 1; 
> E   stderr: nodetool: Failed to connect to 'localhost:7300' - 
> EOFException: 'null'.
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError
> {code}
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087
> {code}
> >   node1.start()
> bootstrap_test.py:483: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start
> node.watch_log_for_alive(self, from_mark=mark)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in 
> watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for
> head=reads[:50], tail="..."+reads[len(reads)-150:]))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1635453190.3118386, timeout = 120
> msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n 
> Head: \n Tail: ..."
> node = 'node3'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
>     if start + timeout < time.time():
> >       raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after 
> 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in 
> system.log:
> E    Head: 
> E    Tail: ...
> {code}
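
The second traceback above ends in a deadline check of the form "start + timeout < time.time()". A minimal sketch of that poll-until-deadline pattern follows. It is an assumption-laden illustration, not ccmlib's watch_log_for: the function name wait_for_pattern, the read_lines callable and the poll_interval parameter are all hypothetical, and Python's built-in TimeoutError stands in for ccmlib's own exception type.

```python
import re
import time


def wait_for_pattern(read_lines, pattern, timeout, poll_interval=0.01):
    """Poll read_lines() until a line matches `pattern` or `timeout` seconds pass.

    read_lines is any zero-argument callable returning the log lines seen so far
    (hypothetical; the real harness reads system.log from a saved mark).
    """
    regex = re.compile(pattern)
    start = time.time()
    while True:
        for line in read_lines():
            if regex.search(line):
                return line
        # Same shape of check as raise_if_passed in the traceback above.
        if start + timeout < time.time():
            elapsed = time.time() - start
            raise TimeoutError(
                f"after {elapsed:.2f}/{timeout} seconds "
                f"Missing: [{pattern!r}] not found"
            )
        time.sleep(poll_interval)


# A log that contains the message returns immediately.
found = wait_for_pattern(
    lambda: ["INFO Node /127.0.0.1:7000 is now UP"],
    r"127\.0\.0\.1:7000.* is now UP",
    timeout=1,
)

# An empty log (as in the "Head: / Tail: ..." output above) runs into the deadline.
try:
    wait_for_pattern(lambda: [], r"127\.0\.0\.1:7000.* is now UP", timeout=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
```

With a node that is dead or dying, the log never gains the expected line, so a loop of this shape can only end by hitting the deadline, which matches the roughly 120-second failures reported in the ticket.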




[jira] [Commented] (CASSANDRA-17085) Fix python dtests bootstrap_test.py::TestBootstrap

2021-11-01 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436995#comment-17436995
 ] 

David Capwell commented on CASSANDRA-17085:
---

Here is what I am seeing in 
bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

{code}
node3 = new_node(cluster)
try:
    node3.start()
except NodeError:
    pass  # node doesn't start as expected
t.join()
node1.start()
{code}

node1.start() checks every node that ccm still considers alive and looks in 
each of their logs for a message that node1 is now UP.  node3 is dead (or 
dying) at this point, so it should not be included in the watch set.

> Fix python dtests bootstrap_test.py::TestBootstrap
> --
>
> Key: CASSANDRA-17085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>
> Right now bootstrap tests are failing every time we run; this work is to 
> debug and fix the underlying issue.
> Examples:
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7089
> {code}
> >   node3.nodetool('bootstrap resume')
> bootstrap_test.py:1014: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1005: in nodetool
> return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', 
> '-p', str(self.jmx_port)] + shlex.split(cmd))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> process = 
> cmd_args = ['nodetool', '-h', 'localhost', '-p', '7300', 'bootstrap', ...]
> def handle_external_tool_process(process, cmd_args):
>     out, err = process.communicate()
>     if (out is not None) and isinstance(out, bytes):
>         out = out.decode()
>     if (err is not None) and isinstance(err, bytes):
>         err = err.decode()
>     rc = process.returncode
> 
>     if rc != 0:
> >       raise ToolError(cmd_args, rc, out, err)
> E   ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', 
> '-p', '7300', 'bootstrap', 'resume'] exited with non-zero status; exit 
> status: 1; 
> E   stderr: nodetool: Failed to connect to 'localhost:7300' - 
> EOFException: 'null'.
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2305: ToolError
> {code}
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/1062/workflows/ba3e6395-ef22-4724-8424-0549e65d8cff/jobs/7087
> {code}
> >   node1.start()
> bootstrap_test.py:483: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:895: in start
> node.watch_log_for_alive(self, from_mark=mark)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:664: in 
> watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
> ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:592: in watch_log_for
> head=reads[:50], tail="..."+reads[len(reads)-150:]))
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1635453190.3118386, timeout = 120
> msg = "Missing: ['127.0.0.1:7000.* is now UP'] not found in system.log:\n 
> Head: \n Tail: ..."
> node = 'node3'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
>     if start + timeout < time.time():
> >       raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 28 Oct 2021 20:35:10 [node3] after 
> 120.12/120 seconds Missing: ['127.0.0.1:7000.* is now UP'] not found in 
> system.log:
> E    Head: 
> E    Tail: ...
> {code}






[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored

2021-11-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-12106:
--
Reviewers: Aleksei Zotov, Sumanth Pasupuleti  (was: Aleksei Zotov, Dinesh 
Joshi, Sumanth Pasupuleti)

> Add ability to blocklist / denylist a CQL partition so all requests are 
> ignored
> ---
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Local Write-Read Paths, Local/Config
>Reporter: Geoffrey Yu
>Assignee: Josh McKenzie
>Priority: Low
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blocklist / denylist
>  such partitions so all read and write requests to them are rejected.






[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored

2021-11-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-12106:
--
Status: Ready to Commit  (was: Review In Progress)

> Add ability to blocklist / denylist a CQL partition so all requests are 
> ignored
> ---
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Local Write-Read Paths, Local/Config
>Reporter: Geoffrey Yu
>Assignee: Josh McKenzie
>Priority: Low
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blocklist / denylist
>  such partitions so all read and write requests to them are rejected.






[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored

2021-11-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-12106:
--
      Fix Version/s: 4.1  (was: 4.x)
Source Control Link: https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=ab920c30310a8c095ba76b363142b8e74cbf0a0a
         Resolution: Fixed
             Status: Resolved  (was: Ready to Commit)

> Add ability to blocklist / denylist a CQL partition so all requests are 
> ignored
> ---
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Local Write-Read Paths, Local/Config
>Reporter: Geoffrey Yu
>Assignee: Josh McKenzie
>Priority: Low
> Fix For: 4.1
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blocklist / denylist
>  such partitions so all read and write requests to them are rejected.






[jira] [Updated] (CASSANDRA-12106) Add ability to blocklist / denylist a CQL partition so all requests are ignored

2021-11-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-12106:
--
Status: Review In Progress  (was: Patch Available)

> Add ability to blocklist / denylist a CQL partition so all requests are 
> ignored
> ---
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Local Write-Read Paths, Local/Config
>Reporter: Geoffrey Yu
>Assignee: Josh McKenzie
>Priority: Low
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blocklist / denylist
>  such partitions so all read and write requests to them are rejected.






[cassandra] branch trunk updated: Add a Denylist to block reads and writes on specific partition keys

2021-11-01 Thread jmckenzie
This is an automated email from the ASF dual-hosted git repository.

jmckenzie pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new ab920c3  Add a Denylist to block reads and writes on specific 
partition keys
ab920c3 is described below

commit ab920c30310a8c095ba76b363142b8e74cbf0a0a
Author: Josh McKenzie 
AuthorDate: Fri Sep 17 16:34:04 2021 -0400

Add a Denylist to block reads and writes on specific partition keys

Patch by Josh McKenzie, reviewed by Aleksei Zotov and Sumanth Pasupuleti 
for CASSANDRA-12106

Co-authored by Josh McKenzie 
Co-authored by Sam Overton
---
 conf/cassandra.yaml|  30 ++
 doc/source/operating/denylisting_partitions.rst| 110 +
 doc/source/operating/index.rst |   1 +
 src/java/org/apache/cassandra/config/Config.java   |  32 ++
 .../cassandra/config/DatabaseDescriptor.java   |  98 
 .../org/apache/cassandra/db/view/ViewBuilder.java  |   2 +-
 .../org/apache/cassandra/db/view/ViewManager.java  |   2 +-
 .../apache/cassandra/metrics/DenylistMetrics.java  |  58 +++
 .../org/apache/cassandra/repair/RepairJob.java |   1 +
 .../apache/cassandra/repair/RepairRunnable.java|  20 +-
 .../org/apache/cassandra/repair/RepairSession.java |   1 +
 .../apache/cassandra/schema/PartitionDenylist.java | 535 +
 .../apache/cassandra/schema/SchemaConstants.java   |   3 +
 .../SystemDistributedKeyspace.java |  33 +-
 .../org/apache/cassandra/service/StorageProxy.java | 185 ++-
 .../cassandra/service/StorageProxyMBean.java   |  14 +
 .../apache/cassandra/service/StorageService.java   |   3 +
 .../service/reads/range/RangeCommands.java |  26 +
 .../distributed/test/GossipSettlesTest.java|   2 +-
 .../distributed/test/PartitionDenylistTest.java| 155 ++
 .../distributed/test/metric/TableMetricTest.java   |   2 +-
 .../config/DatabaseDescriptorRefTest.java  |   3 +
 .../cassandra/config/DatabaseDescriptorTest.java   |  22 +
 .../cassandra/service/PartitionDenylistTest.java   | 495 +++
 24 files changed, 1800 insertions(+), 33 deletions(-)

diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 87df25a..65eb385 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -1026,6 +1026,36 @@ slow_query_log_timeout_in_ms: 500
 # bound (for example a few nodes with big files).
 # streaming_connections_per_host: 1
 
+# Allows denying configurable access (rw/rr) to operations on configured ks, table, and partitions, intended for use by
+# operators to manage cluster health vs application access. See CASSANDRA-12106 and CEP-13 for more details.
+# enable_partition_denylist = false;
+
+# enable_denylist_writes = true;
+# enable_denylist_reads = true;
+# enable_denylist_range_reads = true;
+
+# The interval at which keys in the cache for denylisting will "expire" and async refresh from the backing DB.
+# Note: this serves only as a fail-safe, as the usage pattern is expected to be "mutate state, refresh cache" on any
+# changes to the underlying denylist entries. See documentation for details.
+# denylist_refresh_seconds = 600;
+
+# In the event of errors on attempting to load the denylist cache, retry on this interval.
+# denylist_initial_load_retry_seconds = 5;
+
+# We cap the number of denylisted keys allowed per table to keep things from growing unbounded. Nodes will warn above
+# this limit while allowing new denylisted keys to be inserted. Denied keys are loaded in natural query / clustering
+# ordering by partition key in case of overflow.
+# denylist_max_keys_per_table = 1000;
+
+# We cap the total number of denylisted keys allowed in the cluster to keep things from growing unbounded.
+# Nodes will warn on initial cache load that there are too many keys and be direct the operator to trim down excess
+# entries to within the configured limits.
+# denylist_max_keys_total = 1;
+
+# Since the denylist in many ways serves to protect the health of the cluster from partitions operators have identified
+# as being in a bad state, we usually want more robustness than just CL.ONE on operations to/from these tables to
+# ensure that these safeguards are in place. That said, we allow users to configure this if they're so inclined.
+# denylist_consistency_level = ConsistencyLevel.QUORUM;
 
 # phi value that must be reached for a host to be marked down.
 # most users should never need to adjust this.
diff --git a/doc/source/operating/denylisting_partitions.rst b/doc/source/operating/denylisting_partitions.rst
new file mode 100644
index 000..3e70f2d
--- /dev/null
+++ b/doc/source/operating/denylisting_partitions.rst
@@ -0,0 +1,110 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this 
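
The cassandra.yaml comments in the diff above describe a per-table cap, a cluster-wide cap, and on/off switches for reads and writes. The toy model below sketches how those settings might interact. It is not the PartitionDenylist implementation from the commit: load_denylist and is_denied are hypothetical helpers, the input rows are assumed to arrive already ordered by partition key, and the caps are plain function arguments rather than values read from the yaml.

```python
from collections import defaultdict


def load_denylist(rows, max_keys_per_table, max_keys_total):
    """Build an in-memory cache from (keyspace, table, partition_key) rows.

    Keys beyond either cap are skipped and a warning is recorded, loosely
    mirroring the "warn above this limit" behaviour described in the yaml.
    """
    cache = defaultdict(set)
    warnings = []
    total = 0
    for ks, table, key in rows:
        if total >= max_keys_total:
            warnings.append("total cap reached; remaining keys not loaded")
            break
        keys = cache[(ks, table)]
        if len(keys) >= max_keys_per_table:
            msg = f"{ks}.{table}: per-table cap reached; extra keys not loaded"
            if msg not in warnings:
                warnings.append(msg)
            continue
        keys.add(key)
        total += 1
    return cache, warnings


def is_denied(cache, flags, ks, table, key, op):
    """op is 'read' or 'write'; flags mirrors the yaml toggles by name."""
    if not flags.get("enable_partition_denylist", False):
        return False
    if not flags.get(f"enable_denylist_{op}s", True):
        return False
    return key in cache.get((ks, table), ())


rows = [("ks", "t", 1), ("ks", "t", 2), ("ks", "t", 3)]
cache, warnings = load_denylist(rows, max_keys_per_table=2, max_keys_total=10)
enabled = {"enable_partition_denylist": True}
```

Here key 3 falls outside the per-table cap and is not enforced, which is one reason the yaml comments ask operators to trim excess entries rather than rely on overflow behaviour.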

[jira] [Commented] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows

2021-11-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436872#comment-17436872
 ] 

Andres de la Peña commented on CASSANDRA-17113:
---

Here are some examples of the CI runs generated by different combinations of 
flags:

* [{{./circleci/generate.sh 
-p}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-p]
* [{{./circleci/generate.sh 
-pr}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-pr]
* [{{./circleci/generate.sh 
-s}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-s]
* [{{./circleci/generate.sh 
-sr}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-sr]
* [{{./circleci/generate.sh 
-sp}}|https://app.circleci.com/pipelines/github/adelapena/cassandra?branch=17113-trunk-sp]

> Add flags to CircleCI generation script to setup the workflows
> --
>
> Key: CASSANDRA-17113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of 
> j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches 
> that are mostly ready for commit, and they have a single approval step on the 
> Circle GUI to run the most relevant tests. The {{separate_tests}} workflows 
> are meant for intermediate commits and special cases such as fixing flaky 
> tests, and every test group requires manual approval on the Circle GUI. Both 
> pairs of workflows are always created, so every commit/push creates the four 
> workflows. None of these workflows runs anything unless manually approved on 
> the GUI, not even the build.
> This ticket is a followup for those changes, and it aims to implement [this 
> suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x] 
> about having a script that enables the relevant workflows for each use case. 
> I have modified the existing {{.circleci/generate.sh}} script to be able to 
> generate different workflows.
> The new {{-p}} flag generates only the pre-commit workflows, whereas the 
> {{-s}} flag generates only the workflows with separate approval steps for 
> each test job. Both flags can be used together to generate the two pairs of 
> workflows ({{-ps}}, {{-sp}}, etc.). The default option generates all the 
> workflows, so users can decide which workflows to use in the 
> CircleCI GUI after pushing their changes. We can easily change the workflows 
> that are generated by default to use the single pair of workflows that we 
> think is better.
> Additionally, there is a {{-r}} flag that disables the first approval step of 
> the generated workflows. For the {{separate_tests}} workflows it means that 
> the build is automatically run, but the individual steps still need to be 
> manually approved in the GUI. For the {{pre-commit_tests}} workflows, the 
> {{-r}} flag will automatically run the build and the most relevant tests.
> For example, users pushing a mostly ready patch and wanting to run the tests 
> at maximum speed would probably want to generate their config file with 
> {{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see 
> CASSANDRA-16871).
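
The flag semantics described in the ticket can be summarised as a small decision table. The sketch below models only what the description states (-p, -s, their combination, the default, and -r); it is not the logic of .circleci/generate.sh itself, the function name generated_workflows is hypothetical, and the "j8_"/"j11_" workflow name prefixes are illustrative guesses rather than the names used in the generated config.

```python
def generated_workflows(flags):
    """Model which workflows a flag string such as "pr" (for -pr) would produce."""
    pre_commit = "p" in flags
    separate = "s" in flags
    if not pre_commit and not separate:
        # Default: neither -p nor -s given, so generate all the workflows.
        pre_commit = separate = True
    names = []
    for java in ("j8", "j11"):
        if pre_commit:
            names.append(f"{java}_pre-commit_tests")
        if separate:
            names.append(f"{java}_separate_tests")
    # -r disables the first approval step of the generated workflows.
    return {"workflows": names, "first_approval_required": "r" not in flags}


only_pre_commit = generated_workflows("p")
default_set = generated_workflows("")
highres_fast = generated_workflows("hpr")  # -h (HIGHRES) does not affect this model
```

Under this model, -hpr yields only the pre-commit pair with no initial approval step, which corresponds to the "mostly ready patch, maximum speed" use case mentioned at the end of the description.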






[jira] [Updated] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows

2021-11-01 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-17113:
--
Change Category: Quality Assurance
 Complexity: Normal
 Status: Open  (was: Triage Needed)

> Add flags to CircleCI generation script to setup the workflows
> --
>
> Key: CASSANDRA-17113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
>
> CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of 
> j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches 
> that are mostly ready for commit, and they have a single approval step on the 
> Circle GUI to run the most relevant tests. The {{separate_tests}} workflows 
> are meant for intermediate commits and special cases such as fixing flaky 
> tests, and every test group requires manual approval on the Circle GUI. Both 
> pairs of workflows are always created, so every commit/push creates the four 
> workflows. None of these workflows runs anything unless manually approved on 
> the GUI, not even the build.
> This ticket is a followup for those changes, and it aims to implement [this 
> suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x] 
> about having a script that enables the relevant workflows for each use case. 
> I have modified the existing {{.circleci/generate.sh}} script to be able to 
> generate different workflows.
> The new {{-p}} flag generates only the pre-commit workflows, whereas the 
> {{-s}} flag generates only the workflows with separate approval steps for 
> each test job. Both flags can be used together to generate the two pairs of 
> workflows ({{-ps}}, {{-sp}}, etc.). The default option is generating all the 
> workflows, so users can decide which workflow they are going to use in the 
> CircleCI GUI, after pushing their changes. We can easily change the workflows 
> that are generated by default to use the single pair of workflows that we 
> think is better.
> Additionally, there is a {{-r}} flag that disables the first approval step of 
> the generated workflows. For the {{separate_tests}} workflows it means that 
> the build is automatically run, but the individual steps still need to be 
> manually approved in the GUI. For the {{pre-commit_tests}} workflows, the 
> {{-r}} flag will automatically run the build and the most relevant tests.
> For example, users pushing a mostly ready patch and wanting to run the tests 
> at maximum speed would probably want to generate their config file with 
> {{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see 
> CASSANDRA-16871).






[jira] [Created] (CASSANDRA-17113) Add flags to CircleCI generation script to setup the workflows

2021-11-01 Thread Jira
Andres de la Peña created CASSANDRA-17113:
-

 Summary: Add flags to CircleCI generation script to setup the 
workflows
 Key: CASSANDRA-17113
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17113
 Project: Cassandra
  Issue Type: Task
  Components: CI
Reporter: Andres de la Peña
Assignee: Andres de la Peña


CASSANDRA-16882 modified the CircleCI config to contain two separate pairs of 
j8/j11 workflows. The {{pre-commit_tests}} workflows are meant for patches that 
are mostly ready for commit, and they have a single approval step on the Circle 
GUI to run the most relevant tests. The {{separate_tests}} workflows are meant 
for intermediate commits and special cases such as fixing flaky tests, and 
every test group requires manual approval on the Circle GUI. Both pairs of 
workflows are always created, so every commit/push creates the four workflows. 
None of these workflows runs anything unless manually approved on the GUI, not 
even the build.

This ticket is a followup for those changes, and it aims to implement [this 
suggestion|https://lists.apache.org/thread/8bghc7ng18s83vd4m16ccpj89dy6bm7x] 
about having a script that enables the relevant workflows for each use case. I 
have modified the existing {{.circleci/generate.sh}} script to be able to 
generate different workflows.

The new {{-p}} flag generates only the pre-commit workflows, whereas the {{-s}} 
flag generates only the workflows with separate approval steps for each test 
job. Both flags can be used together to generate the two pairs of workflows 
({{-ps}}, {{-sp}}, etc.). The default option is generating all the workflows, 
so users can decide which workflow they are going to use in the CircleCI GUI, 
after pushing their changes. We can easily change the workflows that are 
generated by default to use the single pair of workflows that we think is 
better.

Additionally, there is a {{-r}} flag that disables the first approval step of 
the generated workflows. For the {{separate_tests}} workflows it means that the 
build is automatically run, but the individual steps still need to be manually 
approved in the GUI. For the {{pre-commit_tests}} workflows, the {{-r}} flag 
will automatically run the build and the most relevant tests.

For example, users pushing a mostly ready patch and wanting to run the tests at 
maximum speed would probably want to generate their config file with 
{{.circleci/generate.sh -hpr}} ({{-h}} for HIGHRES config, see CASSANDRA-16871).
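
The flag semantics described above can be summarised with a small model. This is only an illustrative sketch in Python, not the code of {{.circleci/generate.sh}}: the function name {{plan_from_flags}}, the {{Plan}} type and the {{j8_}}/{{j11_}} workflow labels are assumptions made for the example, and any flags other than {{-h}}, {{-p}}, {{-r}} and {{-s}} are simply ignored here.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical labels for the two pairs of j8/j11 workflows named in the ticket.
PRE_COMMIT = ("j8_pre-commit_tests", "j11_pre-commit_tests")
SEPARATE = ("j8_separate_tests", "j11_separate_tests")


@dataclass(frozen=True)
class Plan:
    workflows: Tuple[str, ...]  # workflows that would be generated
    first_approval: bool        # True if the first manual approval step is kept
    highres: bool               # True if -h (HIGHRES config) was requested


def plan_from_flags(flags: str) -> Plan:
    """Model the behaviour described in the ticket (not the real script)."""
    letters = set(flags.lstrip("-"))
    pre, sep = "p" in letters, "s" in letters
    if pre == sep:
        # Neither flag (the default) or both flags: generate both pairs.
        workflows = PRE_COMMIT + SEPARATE
    elif pre:
        workflows = PRE_COMMIT
    else:
        workflows = SEPARATE
    # -r disables the first approval step of whatever workflows are generated.
    return Plan(workflows, first_approval="r" not in letters, highres="h" in letters)


if __name__ == "__main__":
    # The example from the ticket: HIGHRES, pre-commit only, no first approval.
    print(plan_from_flags("-hpr"))
```

In this model {{-ps}} and {{-sp}} are equivalent to passing no workflow flag at all, which matches the statement that the default is to generate all the workflows.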






[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1

2021-11-01 Thread Marten Kenbeek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marten Kenbeek updated CASSANDRA-14113:
---
Attachment: (was: 14133-3.0.txt)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
> Key: CASSANDRA-14113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: Tables have been created in 2.2.11 using thrift and have 
> supercolumns
>Reporter: Guillaume Herail
>Assignee: Marten Kenbeek
>Priority: Normal
>  Labels: supercolumns
> Attachments: 14113-3.0.txt, data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 
> 3.11.1. The tables have been created using thrift and have supercolumns. When 
> I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:42)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 instead and had a different error:
> {noformat}
> ERROR 11:00:40 Exception 

[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1

2021-11-01 Thread Marten Kenbeek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marten Kenbeek updated CASSANDRA-14113:
---
Attachment: 14113-3.0.txt

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
> Key: CASSANDRA-14113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: Tables have been created in 2.2.11 using thrift and have 
> supercolumns
>Reporter: Guillaume Herail
>Assignee: Marten Kenbeek
>Priority: Normal
>  Labels: supercolumns
> Attachments: 14113-3.0.txt, data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 
> 3.11.1. The tables have been created using thrift and have supercolumns. When 
> I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:42)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 instead and had a different error:
> {noformat}
> ERROR 11:00:40 Exception in thread 

[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1

2021-11-01 Thread Marten Kenbeek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marten Kenbeek updated CASSANDRA-14113:
---
Test and Documentation Plan: Unit tests included
 Status: Patch Available  (was: Open)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
> Key: CASSANDRA-14113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: Tables have been created in 2.2.11 using thrift and have 
> supercolumns
>Reporter: Guillaume Herail
>Assignee: Marten Kenbeek
>Priority: Normal
>  Labels: supercolumns
> Attachments: 14133-3.0.txt, data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 
> 3.11.1. The tables have been created using thrift and have supercolumns. When 
> I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:42)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 

[jira] [Updated] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1

2021-11-01 Thread Marten Kenbeek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marten Kenbeek updated CASSANDRA-14113:
---
Attachment: 14133-3.0.txt

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
> Key: CASSANDRA-14113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: Tables have been created in 2.2.11 using thrift and have 
> supercolumns
>Reporter: Guillaume Herail
>Assignee: Marten Kenbeek
>Priority: Normal
>  Labels: supercolumns
> Attachments: 14133-3.0.txt, data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 
> 3.11.1. The tables have been created using thrift and have supercolumns. When 
> I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:42)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 instead and had a different error:
> {noformat}
> ERROR 11:00:40 Exception in thread 

[jira] [Assigned] (CASSANDRA-14113) AssertionError while trying to upgrade 2.2.11 -> 3.11.1

2021-11-01 Thread Marten Kenbeek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marten Kenbeek reassigned CASSANDRA-14113:
--

Assignee: Marten Kenbeek  (was: Benjamin Lerer)

> AssertionError while trying to upgrade 2.2.11 -> 3.11.1
> ---
>
> Key: CASSANDRA-14113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14113
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: Tables have been created in 2.2.11 using thrift and have 
> supercolumns
>Reporter: Guillaume Herail
>Assignee: Marten Kenbeek
>Priority: Normal
>  Labels: supercolumns
> Attachments: data.tar.gz
>
>
> We're trying to upgrade a test cluster from Cassandra 2.2.11 to Cassandra 
> 3.11.1. The tables have been created using thrift and have supercolumns. When 
> I try to run {{nodetool upgradesstables}} I get the following:
> {noformat}error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:42)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:1242)
>   at 
> org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:1185)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.readRow(UnfilteredDeserializer.java:498)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer$UnfilteredIterator.hasNext(UnfilteredDeserializer.java:472)
>   at 
> org.apache.cassandra.db.UnfilteredDeserializer$OldFormatDeserializer.hasNext(UnfilteredDeserializer.java:306)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:188)
>   at 
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$OldFormatIterator.computeNext(SSTableSimpleIterator.java:140)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:484)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:499)
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:359)
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:74)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:75)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:26)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:233)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:196)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:428)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:315)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We also tried to upgrade to 3.0.15 instead and had a different error:
> {noformat}
> ERROR 11:00:40