[jira] [Created] (CASSANDRASC-122) yaml configuration defaults to a file that doesn't exist

2024-04-19 Thread Jon Haddad (Jira)
Jon Haddad created CASSANDRASC-122:
--

 Summary: yaml configuration defaults to a file that doesn't exist
 Key: CASSANDRASC-122
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-122
 Project: Sidecar for Apache Cassandra
  Issue Type: Bug
Reporter: Jon Haddad


In the CassandraSidecarDaemon, it loads a default yaml like this:

{noformat}
String yamlConfigurationPath = System.getProperty("sidecar.config", 
"file://./conf/config.yaml");
{noformat}

but that config file doesn't exist in the distribution.

{noformat}
$ ls conf
logback.xml  sidecar.yaml
{noformat}

That file doesn't exist in the repository.

We should pick a default name and have it be consistent.  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-19 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-19578:
--
Description: 
As part of CASSANDRA-17819 a check for empty schema changes was added to the 
updateSchema. This only looks at the _logical_ schema difference of the 
schemas, but the changes made to the system_schema keyspace are the ones that 
actually are involved in the digest.

If two nodes issue the same CREATE statement the difference from the 
keyspace.diff would be empty but the timestamps on the mutations would be 
different, leading to a pseudo schema disagreement which will never resolve 
until resetlocalschema or nodes being bounced.

Only impacts 4.1

test and fix : 
https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411

  was:
As part of CASSANDRA-17819 a check for empty schema changes was added to the 
updateSchema. This only looks at the _logical_ schema difference of the 
schemas, but the changes made to the system_schema keyspace are the ones that 
actually are involved in the digest.

If two nodes issue the same CREATE statement the difference from the 
keyspace.diff would be empty but the timestamps on the mutations would be 
different, leading to a pseudo schema disagreement which will never resolve 
until resetlocalschema or nodes being bounced.

test and fix : 
https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411


> Concurrent equivalent schema updates lead to unresolved disagreement
> 
>
> Key: CASSANDRA-19578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Chris Lohfink
>Priority: Normal
>
> As part of CASSANDRA-17819 a check for empty schema changes was added to the 
> updateSchema. This only looks at the _logical_ schema difference of the 
> schemas, but the changes made to the system_schema keyspace are the ones that 
> actually are involved in the digest.
> If two nodes issue the same CREATE statement the difference from the 
> keyspace.diff would be empty but the timestamps on the mutations would be 
> different, leading to a pseudo schema disagreement which will never resolve 
> until resetlocalschema or nodes being bounced.
> Only impacts 4.1
> test and fix : 
> https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-19 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-19578:
--
Description: 
As part of CASSANDRA-17819 a check for empty schema changes was added to the 
updateSchema. This only looks at the _logical_ schema difference of the 
schemas, but the changes made to the system_schema keyspace are the ones that 
actually are involved in the digest.

If two nodes issue the same CREATE statement the difference from the 
keyspace.diff would be empty but the timestamps on the mutations would be 
different, leading to a pseudo schema disagreement which will never resolve 
until resetlocalschema or nodes being bounced.

test and fix : 
https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411

  was:
As part of CASSANDRA-17819 a check for empty schema changes was added to the 
updateSchema. This only looks at the _logical_ schema difference of the 
schemas, but the changes made to the system_schema keyspace are the ones that 
actually are involved in the digest.

If two nodes issue the same CREATE statement the difference from the 
keyspace.diff would be empty but the timestamps on the mutations would be 
different, leading to a pseudo schema disagreement which will never resolve 
until resetlocalschema or nodes being bounced.


> Concurrent equivalent schema updates lead to unresolved disagreement
> 
>
> Key: CASSANDRA-19578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Chris Lohfink
>Priority: Normal
>
> As part of CASSANDRA-17819 a check for empty schema changes was added to the 
> updateSchema. This only looks at the _logical_ schema difference of the 
> schemas, but the changes made to the system_schema keyspace are the ones that 
> actually are involved in the digest.
> If two nodes issue the same CREATE statement the difference from the 
> keyspace.diff would be empty but the timestamps on the mutations would be 
> different, leading to a pseudo schema disagreement which will never resolve 
> until resetlocalschema or nodes being bounced.
> test and fix : 
> https://github.com/clohfink/cassandra/commit/ba915f839089006ac6d08494ef19dc010bcd6411



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-19 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-19578:
--
Since Version: 4.1-beta1

> Concurrent equivalent schema updates lead to unresolved disagreement
> 
>
> Key: CASSANDRA-19578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Chris Lohfink
>Priority: Normal
>
> As part of CASSANDRA-17819 a check for empty schema changes was added to the 
> updateSchema. This only looks at the _logical_ schema difference of the 
> schemas, but the changes made to the system_schema keyspace are the ones that 
> actually are involved in the digest.
> If two nodes issue the same CREATE statement the difference from the 
> keyspace.diff would be empty but the timestamps on the mutations would be 
> different, leading to a pseudo schema disagreement which will never resolve 
> until resetlocalschema or nodes being bounced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRASC-121) Sidecar should default to JMX on port 7199

2024-04-19 Thread Jon Haddad (Jira)
Jon Haddad created CASSANDRASC-121:
--

 Summary: Sidecar should default to JMX on port 7199
 Key: CASSANDRASC-121
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-121
 Project: Sidecar for Apache Cassandra
  Issue Type: Bug
Reporter: Jon Haddad


{{src/main/dist/conf/sidecar.yaml}} lists JMX port 7100, it should default to 
7199.  Requiring the user to change it back to the Cassandra default is an 
unnecessary hinderance.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-19 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-19578:
--
Impacts:   (was: None)

> Concurrent equivalent schema updates lead to unresolved disagreement
> 
>
> Key: CASSANDRA-19578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Chris Lohfink
>Priority: Normal
>
> As part of CASSANDRA-17819 a check for empty schema changes was added to the 
> updateSchema. This only looks at the _logical_ schema difference of the 
> schemas, but the changes made to the system_schema keyspace are the ones that 
> actually are involved in the digest.
> If two nodes issue the same CREATE statement the difference from the 
> keyspace.diff would be empty but the timestamps on the mutations would be 
> different, leading to a pseudo schema disagreement which will never resolve 
> until resetlocalschema or nodes being bounced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19578) Concurrent equivalent schema updates lead to unresolved disagreement

2024-04-19 Thread Chris Lohfink (Jira)
Chris Lohfink created CASSANDRA-19578:
-

 Summary: Concurrent equivalent schema updates lead to unresolved 
disagreement
 Key: CASSANDRA-19578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19578
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Schema
Reporter: Chris Lohfink


As part of CASSANDRA-17819 a check for empty schema changes was added to the 
updateSchema. This only looks at the _logical_ schema difference of the 
schemas, but the changes made to the system_schema keyspace are the ones that 
actually are involved in the digest.

If two nodes issue the same CREATE statement the difference from the 
keyspace.diff would be empty but the timestamps on the mutations would be 
different, leading to a pseudo schema disagreement which will never resolve 
until resetlocalschema or nodes being bounced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-120) Improve packaging by properly supporting application plugin

2024-04-19 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839169#comment-17839169
 ] 

Jon Haddad commented on CASSANDRASC-120:


I think simply adding this below the copyJolokia task will fix it:
{noformat}
build.dependsOn("copyJolokia"){noformat}

> Improve packaging by properly supporting application plugin
> ---
>
> Key: CASSANDRASC-120
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-120
> Project: Sidecar for Apache Cassandra
>  Issue Type: Bug
>Reporter: Jon Haddad
>Priority: Normal
>
> The Gradle application plugin is used in the project meaning we should be 
> able to do distZip, distTar, and installDist.  We can run them, but the 
> artifacts generated fail to start, throwing errors about jolokia:
> {noformat}
> build/install/cassandra-sidecar on trunk on 
> ❯ bin/cassandra-sidecar
> Error opening zip file or JAR manifest missing : 
> /Users/jhaddad/dev/cassandra-sidecar/build/install/cassandra-sidecar/agents/jolokia-jvm-1.6.0-agent.jar
> Error occurred during initialization of VM
> agent library failed to init: instrument{noformat}
> The jolokia jar is not present at all on the filesystem either:
> {noformat}
> ❯ find . -name '*jolokia*'{noformat}
> It's hard to tell what the intention is here, there's no documentation around 
> building the project or how to deploy it aside from using {{./gradlew run}} 
> which is not appropriate for production.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRASC-120) Improve packaging by properly supporting distZip

2024-04-19 Thread Jon Haddad (Jira)
Jon Haddad created CASSANDRASC-120:
--

 Summary: Improve packaging by properly supporting distZip
 Key: CASSANDRASC-120
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-120
 Project: Sidecar for Apache Cassandra
  Issue Type: Bug
Reporter: Jon Haddad


The Gradle application plugin is used in the project meaning we should be able 
to do distZip, distTar, and installDist.  We can run them, but the artifacts 
generated fail to start, throwing errors about jolokia:
{noformat}
build/install/cassandra-sidecar on trunk on 
❯ bin/cassandra-sidecar
Error opening zip file or JAR manifest missing : 
/Users/jhaddad/dev/cassandra-sidecar/build/install/cassandra-sidecar/agents/jolokia-jvm-1.6.0-agent.jar
Error occurred during initialization of VM
agent library failed to init: instrument{noformat}
The jolokia jar is not present at all on the filesystem either:
{noformat}
❯ find . -name '*jolokia*'{noformat}
It's hard to tell what the intention is here, there's no documentation around 
building the project or how to deploy it aside from using {{./gradlew run}} 
which is not appropriate for production.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRASC-120) Improve packaging by properly supporting application plugin

2024-04-19 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRASC-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRASC-120:
---
Summary: Improve packaging by properly supporting application plugin  (was: 
Improve packaging by properly supporting distZip)

> Improve packaging by properly supporting application plugin
> ---
>
> Key: CASSANDRASC-120
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-120
> Project: Sidecar for Apache Cassandra
>  Issue Type: Bug
>Reporter: Jon Haddad
>Priority: Normal
>
> The Gradle application plugin is used in the project meaning we should be 
> able to do distZip, distTar, and installDist.  We can run them, but the 
> artifacts generated fail to start, throwing errors about jolokia:
> {noformat}
> build/install/cassandra-sidecar on trunk on 
> ❯ bin/cassandra-sidecar
> Error opening zip file or JAR manifest missing : 
> /Users/jhaddad/dev/cassandra-sidecar/build/install/cassandra-sidecar/agents/jolokia-jvm-1.6.0-agent.jar
> Error occurred during initialization of VM
> agent library failed to init: instrument{noformat}
> The jolokia jar is not present at all on the filesystem either:
> {noformat}
> ❯ find . -name '*jolokia*'{noformat}
> It's hard to tell what the intention is here, there's no documentation around 
> building the project or how to deploy it aside from using {{./gradlew run}} 
> which is not appropriate for production.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18732) Baseline Diagnostic vtables for Accord

2024-04-19 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839106#comment-17839106
 ] 

Caleb Rackliffe commented on CASSANDRA-18732:
-

While looking at item 1 above, I discovered CASSANDRA-19577, so I'll be taking 
a detour to fix that...

> Baseline Diagnostic vtables for Accord
> --
>
> Key: CASSANDRA-18732
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18732
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Accord, Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> In addition to JMX-based metrics, there are bits of diagnostic information 
> for Accord that we should consider exposing through vtables:
> 1.) We should ensure that coordinator-level CQL transactions and the local 
> reads and writes they spawn are visible to the existing {{QueriesTable}} 
> vtable.
> The first may already just work. We may need to make some tweaks to 
> {{TxnNamedRead}} and {{TxnWrite}} for the local operations though. 
> ({{CommandStore}} tasks are out of scope here, as they would probably be more 
> confusing than useful in {{QueriesTable}}?)
> 2.) A new vtable for pending commands for a key.
> - Disable SELECT */require a partition key
> - Might require partial back-port of stringifying table/partition key from 
> Accord to be correct
> - ex. {{SELECT timestamps FROM myawesometable where ks=? and table=? and 
> partition_key=?}}
> - Clustering can be the Accord timestamp elements, no further normal columns.
> 3.) A new vtable for command store-specific cache stats
> - Gather via {{Store.execute()}} for correctness.
> - Store id should be partition key (see {{AccordCommandStore}})
> - hits, misses, total (maybe just throw out the keyspaces and coalesce 
> ranges?)
> 4.) (Requires [~aweisberg]'s outstanding work) A new vtable for live 
> migration state
> - {{TableMigrationState}} could be flattened into a row
> - Is this already persisted? If so, why a new vtable?
> 5.) A vtable to expose {{accord.local.Node#coordinating()}} as a map
> - ex. {{SELECT txn_id, type}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19577) Queries are not visible to the "system_views.queries" virtual table at the coordinator level

2024-04-19 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19577:

 Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Implementation(12988)
   Complexity: Normal
Discovered By: Adhoc Test
Fix Version/s: 4.1.x
   5.0.x
   5.1
 Severity: Low
   Status: Open  (was: Triage Needed)

> Queries are not visible to the "system_views.queries" virtual table at the 
> coordinator level
> 
>
> Key: CASSANDRA-19577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19577
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.1
>
>
> There appears to be a hole in the implementation of CASSANDRA-15241 where 
> {{DebuggableTasks}} at the coordinator are not preserved through the creation 
> of {{FutureTasks}} in {{TaskFactory}}. This means that {{QueriesTable}} can't 
> see them when is asks {{SharedExecutorPool}} for running tasks. It should be 
> possible to fix this in {{TaskFactory}} by making sure to propagate any 
> {{RunnableDebuggableTask}} we encounter. We already do this in 
> {{toExecute()}}, but it also needs to happen in the relevant {{toSubmit()}} 
> method(s).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19577) Queries are not visible to the "system_views.queries" virtual table at the coordinator level

2024-04-19 Thread Caleb Rackliffe (Jira)
Caleb Rackliffe created CASSANDRA-19577:
---

 Summary: Queries are not visible to the "system_views.queries" 
virtual table at the coordinator level
 Key: CASSANDRA-19577
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19577
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Virtual Tables
Reporter: Caleb Rackliffe
Assignee: Caleb Rackliffe


There appears to be a hole in the implementation of CASSANDRA-15241 where 
{{DebuggableTasks}} at the coordinator are not preserved through the creation 
of {{FutureTasks}} in {{TaskFactory}}. This means that {{QueriesTable}} can't 
see them when is asks {{SharedExecutorPool}} for running tasks. It should be 
possible to fix this in {{TaskFactory}} by making sure to propagate any 
{{RunnableDebuggableTask}} we encounter. We already do this in {{toExecute()}}, 
but it also needs to happen in the relevant {{toSubmit()}} method(s).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-04-19 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19365:

Reviewers: Caleb Rackliffe

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0, 5.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.
> This bug affects versions 4.0+
> On 3.11 there's locking in DEHR. I did not check earlier versions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19567) Minimize the heap consumption when registering metrics

2024-04-19 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19567:

Reviewers: Caleb Rackliffe

> Minimize the heap consumption when registering metrics
> --
>
> Key: CASSANDRA-19567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19567
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The problem is only reproducible on the x86 machine, the problem is not 
> reproducible on the arm64. A quick analysis showed a lot of MetricName 
> objects stored in the heap, although the real cause could be related to 
> something else, the MetricName object requires extra attention.
> To reproduce run the command run locally:
> {code}
> ant test-jvm-dtest-some 
> -Dtest.name=org.apache.cassandra.distributed.test.ReadRepairTest
> {code}
> The error:
> {code:java}
> [junit-timeout] Exception in thread "main" java.lang.OutOfMemoryError: Java 
> heap space
> [junit-timeout]     at 
> java.base/java.lang.StringLatin1.newString(StringLatin1.java:769)
> [junit-timeout]     at 
> java.base/java.lang.StringBuffer.toString(StringBuffer.java:716)
> [junit-timeout]     at 
> org.apache.cassandra.CassandraBriefJUnitResultFormatter.endTestSuite(CassandraBriefJUnitResultFormatter.java:191)
> [junit-timeout]     at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.fireEndTestSuite(JUnitTestRunner.java:854)
> [junit-timeout]     at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:578)
> [junit-timeout]     at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1197)
> [junit-timeout]     at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:1042)
> [junit-timeout] Testsuite: 
> org.apache.cassandra.distributed.test.ReadRepairTest-cassandra.testtag_IS_UNDEFINED
> [junit-timeout] Testsuite: 
> org.apache.cassandra.distributed.test.ReadRepairTest-cassandra.testtag_IS_UNDEFINED
>  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 sec
> [junit-timeout] 
> [junit-timeout] Testcase: 
> org.apache.cassandra.distributed.test.ReadRepairTest:readRepairRTRangeMovementTest-cassandra.testtag_IS_UNDEFINED:
>     Caused an ERROR
> [junit-timeout] Forked Java VM exited abnormally. Please note the time in the 
> report does not reflect the time until the VM exit.
> [junit-timeout] junit.framework.AssertionFailedError: Forked Java VM exited 
> abnormally. Please note the time in the report does not reflect the time 
> until the VM exit.
> [junit-timeout]     at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit-timeout]     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]     at java.base/java.util.Vector.forEach(Vector.java:1365)
> [junit-timeout]     at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit-timeout]     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]     at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit-timeout]     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]     at java.base/java.util.Vector.forEach(Vector.java:1365)
> [junit-timeout]     at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit-timeout]     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]     at 
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit-timeout]     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout] 
> [junit-timeout] 
> [junit-timeout] Test org.apache.cassandra.distributed.test.ReadRepairTest 
> FAILED (crashed)BUILD FAILED
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19576) blocking on memtable allocations blocks compactions from completing

2024-04-19 Thread Jon Haddad (Jira)
Jon Haddad created CASSANDRA-19576:
--

 Summary: blocking on memtable allocations blocks compactions from 
completing
 Key: CASSANDRA-19576
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19576
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Compaction
Reporter: Jon Haddad


{{updateCompactionHistory}} writes to compaction_history in a blocking manner.  
If we've also run out of memtable memory, this call will block and compaction 
falls behind.  

It might make sense to have this be non blocking.  We'd risk having a missing 
entry in nodetool getcompactionhistory, but I think that's preferable to having 
a dead node.  

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19564) MemtablePostFlush deadlock leads to stuck nodes and crashes

2024-04-19 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-19564:
---
Attachment: screenshot-1.png

> MemtablePostFlush deadlock leads to stuck nodes and crashes
> ---
>
> Key: CASSANDRA-19564
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19564
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Local/Memtable
>Reporter: Jon Haddad
>Priority: Urgent
> Fix For: 4.1.x
>
> Attachments: image-2024-04-16-11-55-54-750.png, 
> image-2024-04-16-12-29-15-386.png, image-2024-04-16-13-43-11-064.png, 
> image-2024-04-16-13-53-24-455.png, image-2024-04-17-18-46-29-474.png, 
> image-2024-04-17-19-13-06-769.png, image-2024-04-17-19-14-34-344.png, 
> screenshot-1.png
>
>
> I've run into an issue on a 4.1.4 cluster where an entire node has locked up 
> due to what I believe is a deadlock in memtable flushing. Here's what I know 
> so far.  I've stitched together what happened based on conversations, logs, 
> and some flame graphs.
> *Log reports memtable flushing*
> The last successful flush happens at 12:19. 
> {noformat}
> INFO  [NativePoolCleaner] 2024-04-16 12:19:53,634 
> AbstractAllocatorMemtable.java:286 - Flushing largest CFS(Keyspace='ks', 
> ColumnFamily='version') to free up room. Used total: 0.24/0.33, live: 
> 0.16/0.20, flushing: 0.09/0.13, this: 0.13/0.15
> INFO  [NativePoolCleaner] 2024-04-16 12:19:53,634 ColumnFamilyStore.java:1012 
> - Enqueuing flush of ks.version, Reason: MEMTABLE_LIMIT, Usage: 660.521MiB 
> (13%) on-heap, 790.606MiB (15%) off-heap
> {noformat}
> *MemtablePostFlush appears to be blocked*
> At this point, MemtablePostFlush completed tasks stops incrementing, active 
> stays at 1 and pending starts to rise.
> {noformat}
> MemtablePostFlush   1    1   3446   0   0
> {noformat}
>  
> The flame graph reveals that PostFlush.call is stuck.  I don't have the line 
> number, but I know we're stuck in 
> {{org.apache.cassandra.db.ColumnFamilyStore.PostFlush#call}} given the visual 
> below:
> *!image-2024-04-16-13-43-11-064.png!*
> *Memtable flushing is now blocked.*
> All MemtableFlushWriter threads are Parked waiting on 
> {{{}OpOrder.Barrier.await{}}}. A wall clock profile of 30s reveals all time 
> is spent here.  Presumably we're waiting on the single threaded Post Flush.
> !image-2024-04-16-12-29-15-386.png!
> *Memtable allocations start to block*
> Eventually it looks like the NativeAllocator stops successfully allocating 
> memory. I assume it's waiting on memory to be freed, but since memtable 
> flushes are blocked, we wait indefinitely.
> Looking at a wall clock flame graph, all writer threads have reached the 
> allocation failure path of {{MemtableAllocator.allocate()}}.  I believe we're 
> waiting on {{signal.awaitThrowUncheckedOnInterrupt()}}
> {noformat}
>  MutationStage    48    828425      980253369      0    0{noformat}
> !image-2024-04-16-11-55-54-750.png!
>  
> *Compaction Stops*
> Since we write to the compaction history table, and that requires memtables, 
> compactions are now blocked as well.
>  
> !image-2024-04-16-13-53-24-455.png!
>  
> The node is now doing basically nothing and must be restarted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19568) Jennkins pipeline's default Java version for Jabba has changed

2024-04-19 Thread Bret McGuire (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839037#comment-17839037
 ] 

Bret McGuire commented on CASSANDRA-19568:
--

To be clear: this ticket covers a Jenkins update on DataStax infrastructure and 
not anything related to the ASF CI

> Jennkins pipeline's default Java version for Jabba has changed
> --
>
> Key: CASSANDRA-19568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19568
> Project: Cassandra
>  Issue Type: Bug
>  Components: Client/java-driver
>Reporter: Jane He
>Assignee: Henry Hughes
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We need Java 8 to build the driver. In the past, `jabba use default` will use 
> Java 8. After a jenkins runner update, it uses java 11 now. Therefore, we 
> have to specify `jabba use 1.8` instead before we build the driver.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (1f33a036 -> 66cf8ef3)

2024-04-19 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 1f33a036 generate docs for 1d4732f3
 new 66cf8ef3 generate docs for 1d4732f3

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (1f33a036)
\
 N -- N -- N   refs/heads/asf-staging (66cf8ef3)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/search-index.js |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4883646 -> 4883646 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19132) Update use of transition plan in PrepareReplace

2024-04-19 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19132:

Attachment: ci_summary.html

> Update use of transition plan in PrepareReplace
> ---
>
> Key: CASSANDRA-19132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19132
> Project: Cassandra
>  Issue Type: Task
>  Components: Cluster/Membership
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When PlacementTransitionPlan was reworked to make its use more consistent 
> across join and leave operations, PrepareReplace was not updated. This could 
> now be simplified in line with the other operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19132) Update use of transition plan in PrepareReplace

2024-04-19 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19132:

Status: Ready to Commit  (was: Review In Progress)

+1 LGTM. I rebased and added a second commit with a slight tweak to 
{{PlacementTransitionPlan}}. CI looks reasonable, 2 previously known failures + 
1 {{Port already in use}} which I believe is an infra problem

> Update use of transition plan in PrepareReplace
> ---
>
> Key: CASSANDRA-19132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19132
> Project: Cassandra
>  Issue Type: Task
>  Components: Cluster/Membership
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
> Attachments: ci_summary.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When PlacementTransitionPlan was reworked to make its use more consistent 
> across join and leave operations, PrepareReplace was not updated. This could 
> now be simplified in line with the other operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19575) StartupCheck error for read_ahead_kb

2024-04-19 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839022#comment-17839022
 ] 

Brandon Williams edited comment on CASSANDRA-19575 at 4/19/24 2:46 PM:
---

I think instead of trying to decompose the specified device name to a block 
device via regex, it would be better to list all the block devices and match 
against device names.


was (Author: brandon.williams):
I think instead of trying to decompose the specified device name to a block 
device, it would be better to list all the block devices and compare against 
device names.

> StartupCheck error for read_ahead_kb
> 
>
> Key: CASSANDRA-19575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19575
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Startup and Shutdown
>Reporter: Morten Joenby
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
>
> We believe the StartupChecks.java has a minor bug here:
> [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/service/StartupChecks.java#L737]
> {code:java}
> String deviceName = blockDirComponents[2].replaceAll("[0-9]*$", "");{code}
> We are using a RAID setup with two disks, so removing the "[0-9]" makes the 
> check fail:
> cat /sys/block/md/queue/read_ahead_kb
> cat: /sys/block/md/queue/read_ahead_kb: No such file or directory
> It should be "md0" in our case, so removing the '0' won't work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19575) StartupCheck error for read_ahead_kb

2024-04-19 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839022#comment-17839022
 ] 

Brandon Williams commented on CASSANDRA-19575:
--

I think instead of trying to decompose the specified device name to a block 
device, it would be better to list all the block devices and compare against 
device names.

> StartupCheck error for read_ahead_kb
> 
>
> Key: CASSANDRA-19575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19575
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Startup and Shutdown
>Reporter: Morten Joenby
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
>
> We believe the StartupChecks.java has a minor bug here:
> [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/service/StartupChecks.java#L737]
> {code:java}
> String deviceName = blockDirComponents[2].replaceAll("[0-9]*$", "");{code}
> We are using a RAID setup with two disks, so removing the "[0-9]" makes the 
> check fail:
> cat /sys/block/md/queue/read_ahead_kb
> cat: /sys/block/md/queue/read_ahead_kb: No such file or directory
> It should be "md0" in our case, so removing the '0' won't work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839012#comment-17839012
 ] 

Michael Semb Wever commented on CASSANDRA-19566:


bq. Maybe I misunderstood something, isn’t Jenkins dev back to business after 
the standalone Jenkins file was committed?

Unfortunately there's still some (more) things that need fixing, particular to 
ci-cassandra.a.o and especially to post-commit runs.  This is being ironed out 
in CASSANDRA-19558, with the remaining issue around docker pull rate limits.  
The test results in the cassandra-5.0 and trunk post-commit builds are legit 
though (despite the red status of those builds), otherwise these jobs are 
disabled until 19558 lands.

The jenkinsfile _can_ now be used in other jenkins successfully. 

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19575) StartupCheck error for read_ahead_kb

2024-04-19 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19575:
-
 Bug Category: Parent values: Degradation(12984)Level 1 values: Resource 
Management(12995)
   Complexity: Normal
Discovered By: User Report
Fix Version/s: 4.1.x
   5.0.x
   5.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> StartupCheck error for read_ahead_kb
> 
>
> Key: CASSANDRA-19575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19575
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Startup and Shutdown
>Reporter: Morten Joenby
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
>
> We believe the StartupChecks.java has a minor bug here:
> [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/service/StartupChecks.java#L737]
> {code:java}
> String deviceName = blockDirComponents[2].replaceAll("[0-9]*$", "");{code}
> We are using a RAID setup with two disks, so removing the "[0-9]" makes the 
> check fail:
> cat /sys/block/md/queue/read_ahead_kb
> cat: /sys/block/md/queue/read_ahead_kb: No such file or directory
> It should be "md0" in our case, so removing the '0' won't work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19575) StartupCheck error for read_ahead_kb

2024-04-19 Thread Morten Joenby (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838992#comment-17838992
 ] 

Morten Joenby commented on CASSANDRA-19575:
---

According to the debug.log it's not an error as such, but just an incorrect 
check:


{code:java}
INFO  [main] 2024-04-18 11:40:20,139 SigarLibrary.java:178 - Checked OS 
settings and found them configured for optimal performance.
DEBUG [main] 2024-04-18 11:40:20,141 StartupChecks.java:424 - No 
'read_ahead_kb' setting found for device /dev/md0 of data directory 
/var/lib/cassandra/data.
DEBUG [main] 2024-04-18 11:40:20,146 StartupChecks.java:512 - Checking 
directory /var/lib/cassandra/data
DEBUG [main] 2024-04-18 11:40:20,149 StartupChecks.java:512 - Checking 
directory /commitlogs
DEBUG [main] 2024-04-18 11:40:20,149 StartupChecks.java:512 - Checking 
directory /var/lib/cassandra/data/saved_caches
DEBUG [main] 2024-04-18 11:40:20,150 StartupChecks.java:512 - Checking 
directory /var/lib/cassandra/data/hints{code}

> StartupCheck error for read_ahead_kb
> 
>
> Key: CASSANDRA-19575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19575
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Startup and Shutdown
>Reporter: Morten Joenby
>Priority: Normal
>
> We believe the StartupChecks.java has a minor bug here:
> [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/service/StartupChecks.java#L737]
> {code:java}
> String deviceName = blockDirComponents[2].replaceAll("[0-9]*$", "");{code}
> We are using a RAID setup with two disks, so removing the "[0-9]" makes the 
> check fail:
> cat /sys/block/md/queue/read_ahead_kb
> cat: /sys/block/md/queue/read_ahead_kb: No such file or directory
> It should be "md0" in our case, so removing the '0' won't work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19575) StartupCheck error for read_ahead_kb

2024-04-19 Thread Morten Joenby (Jira)
Morten Joenby created CASSANDRA-19575:
-

 Summary: StartupCheck error for read_ahead_kb
 Key: CASSANDRA-19575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19575
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Startup and Shutdown
Reporter: Morten Joenby


We believe the StartupChecks.java has a minor bug here:
[https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/service/StartupChecks.java#L737]
{code:java}
String deviceName = blockDirComponents[2].replaceAll("[0-9]*$", "");{code}
We are using a RAID setup with two disks, so removing the "[0-9]" makes the 
check fail:


cat /sys/block/md/queue/read_ahead_kb
cat: /sys/block/md/queue/read_ahead_kb: No such file or directory

It should be "md0" in our case, so removing the '0' won't work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838985#comment-17838985
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19566:
-

Maybe I misunderstood something, isn’t Jenkins dev back to business after the 
standalone Jenkins file was committed?

I see some runs here 
[https://ci-cassandra.apache.org/job/Cassandra-5-devbranch/116/]

[~mck] ?

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838982#comment-17838982
 ] 

Berenguer Blasi commented on CASSANDRA-19566:
-

I just fired upgrade tests in one of our internal envs for 5.0. If it fails 
I'll try on circle. If it passes you can do the rest of PRs :shrug:

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19565) SIGSEGV on Cassandra v4.1.4

2024-04-19 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19565:
-
Test and Documentation Plan: run packaging CI
 Status: Patch Available  (was: In Progress)

> SIGSEGV on Cassandra v4.1.4
> ---
>
> Key: CASSANDRA-19565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19565
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Thomas De Keulenaer
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
> Attachments: cassandra_57_debian_jdk11_amd64_attempt1.log.xz, 
> cassandra_57_redhat_jdk11_amd64_attempt1.log.xz, hs_err_pid1116450.log
>
>
> Hello,
> Since upgrading to v4.1. we cannat run CAssandra any more. Each start 
> immediately crashes:
> {{Apr 17 08:58:34 SVALD108 cassandra[1116450]: # A fatal error has been 
> detected by the Java Runtime Environment:
> Apr 17 08:58:34 SVALD108 cassandra[1116450]: #  SIGSEGV (0xb) at 
> pc=0x7fccaab4d152, pid=1116450, tid=1116451}}
> I have added the log from the coe dump.
> This issue is perhaps related to 
> https://davecturner.github.io/2021/08/30/seven-year-old-segfault.html ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19565) SIGSEGV on Cassandra v4.1.4

2024-04-19 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838977#comment-17838977
 ] 

Brandon Williams commented on CASSANDRA-19565:
--

bq.  Now I just need to figure out CI for packaging on 4.1.

This turned out to be problematic, but I merged into 5.0 
[here|https://github.com/driftx/cassandra/tree/CASSANDRA-19565-5.0] and ran the 
packaging build in private CI, then attached the logs from the package building 
here.  The package changes will be the same in all branches so this should be 
equivalent.

> SIGSEGV on Cassandra v4.1.4
> ---
>
> Key: CASSANDRA-19565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19565
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Thomas De Keulenaer
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
> Attachments: cassandra_57_debian_jdk11_amd64_attempt1.log.xz, 
> cassandra_57_redhat_jdk11_amd64_attempt1.log.xz, hs_err_pid1116450.log
>
>
> Hello,
> Since upgrading to v4.1. we cannat run CAssandra any more. Each start 
> immediately crashes:
> {{Apr 17 08:58:34 SVALD108 cassandra[1116450]: # A fatal error has been 
> detected by the Java Runtime Environment:
> Apr 17 08:58:34 SVALD108 cassandra[1116450]: #  SIGSEGV (0xb) at 
> pc=0x7fccaab4d152, pid=1116450, tid=1116451}}
> I have added the log from the coe dump.
> This issue is perhaps related to 
> https://davecturner.github.io/2021/08/30/seven-year-old-segfault.html ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19565) SIGSEGV on Cassandra v4.1.4

2024-04-19 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19565:
-
Attachment: cassandra_57_redhat_jdk11_amd64_attempt1.log.xz
cassandra_57_debian_jdk11_amd64_attempt1.log.xz

> SIGSEGV on Cassandra v4.1.4
> ---
>
> Key: CASSANDRA-19565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19565
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Thomas De Keulenaer
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
> Attachments: cassandra_57_debian_jdk11_amd64_attempt1.log.xz, 
> cassandra_57_redhat_jdk11_amd64_attempt1.log.xz, hs_err_pid1116450.log
>
>
> Hello,
> Since upgrading to v4.1. we cannat run CAssandra any more. Each start 
> immediately crashes:
> {{Apr 17 08:58:34 SVALD108 cassandra[1116450]: # A fatal error has been 
> detected by the Java Runtime Environment:
> Apr 17 08:58:34 SVALD108 cassandra[1116450]: #  SIGSEGV (0xb) at 
> pc=0x7fccaab4d152, pid=1116450, tid=1116451}}
> I have added the log from the coe dump.
> This issue is perhaps related to 
> https://davecturner.github.io/2021/08/30/seven-year-old-segfault.html ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19566:
-
Fix Version/s: 4.0.x
   4.1.x

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838964#comment-17838964
 ] 

Brandon Williams commented on CASSANDRA-19566:
--

Jenkins won't do < 5.0 well right now, but I can facilitate upgrade tests if 
you make the branches.

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19534) unbounded queues in native transport requests lead to node instability

2024-04-19 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838960#comment-17838960
 ] 

Alex Petrov commented on CASSANDRA-19534:
-

I guess this can explain it. We have 32 read, 32 write threads, and 128 native 
threads, so 2:1 relation. Read queue is slightly deeper (about 80) requests, 
which is clear since latency there is probably higher (however depends on the 
request), and write queue is almost empty. We easily can have all 128 requests 
blocked in this case, so they can not really overload the downstream stages. 
Besides, there's no hints, so at least a part of the issue we may have in a 
distributed environment is not applicable. 

> unbounded queues in native transport requests lead to node instability
> --
>
> Key: CASSANDRA-19534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19534
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Jon Haddad
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> When a node is under pressure, hundreds of thousands of requests can show up 
> in the native transport queue, and it looks like it can take way longer to 
> timeout than is configured.  We should be shedding load much more 
> aggressively and use a bounded queue for incoming work.  This is extremely 
> evident when we combine a resource consuming workload with a smaller one:
> Running 5.0 HEAD on a single node as of today:
> {noformat}
> # populate only
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --maxrlat 100 --populate 
> 10m --rate 50k -n 1
> # workload 1 - larger reads
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --rate 200 -d 1d
> # second workload - small reads
> easy-cass-stress run KeyValue -p 1m --rate 20k -r .5 -d 24h{noformat}
> It appears our results don't time out at the requested server time either:
>  
> {noformat}
>                  Writes                                  Reads                
>                   Deletes                       Errors
>   Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) | 
>   Count  Latency (p99)  1min (req/s) |   Count  1min (errors/s)
>  950286       70403.93        634.77 |  789524       70442.07        426.02 | 
>       0              0             0 | 9580484         18980.45
>  952304       70567.62         640.1 |  791072       70634.34        428.36 | 
>       0              0             0 | 9636658         18969.54
>  953146       70767.34         640.1 |  791400       70767.76        428.36 | 
>       0              0             0 | 9695272         18969.54
>  956833       71171.28        623.14 |  794009        71175.6        412.79 | 
>       0              0             0 | 9749377         19002.44
>  959627       71312.58        656.93 |  795703       71349.87        435.56 | 
>       0              0             0 | 9804907         18943.11{noformat}
>  
> After stopping the load test altogether, it took nearly a minute before the 
> requests were no longer queued.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19573) incorrect queries showing up in queries virtual table

2024-04-19 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19573:
-
 Bug Category: Parent values: Correctness(12982)Level 1 values: Transient 
Incorrect Response(12987)
   Complexity: Normal
Discovered By: User Report
Fix Version/s: 4.1.x
   5.0.x
   5.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> incorrect queries showing up in queries virtual table
> -
>
> Key: CASSANDRA-19573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Jon Haddad
>Priority: Normal
> Fix For: 4.1.x, 5.0.x, 5.x
>
>
> While running an easy-cass-stress workload I queried system_views.queries and 
> got this (edited for sanity):
>  
> {noformat}
> thread_id       | queued_micros | running_micros | task
> -+---++
>  MutationStage-2 |             0 |              0 | 
> Mutation(keyspace='easy_cass_stress', key='3030312e302e3937363535', 
> modifications=[\n  [easy_cass_stress.random_access] key=001.0.97655 
> partition_deletion=deletedAt=-9223372036854775808, localDeletion=2147483647 
> columns=[[] | [value]]\n    Row[info=[ts=1713501354789139] ]: row_id=291 | 
> [value=SXFNPWZDYFHTUSBWMUQCTTRAHQWXMGYOHASTGDFYLILWMOSFQWZGKUAIPUUGCLTADKFFXZRQGKIJJLXNOQKMAIOVSSVMVSFSFAVPABIIHGQSGRPACFWCKYMZMSNZZARSBFVDASTMCRHAVAYHKQDZWFCHRUPDWZJVTEVIWKPMKLAOZGBUDFJVOPSAHLAIWOGNXZHCBVK
>  ts=1713501354789139]\n])
>      ReadStage-4 |             0 |           6216 |  SELECT * FROM 
> easy_cass_stress.random_access WHERE partition_id = '001.0.18474' AND row_id 
> = 746 LIMIT 5000 ALLOW FILTERING
> {noformat}
> What's interesting is that I supply neither a LIMIT or ALLOW FILTERING when I 
> prepared the query.  I assume the limit is coming from the driver, and while 
> it's technically correct from the standpoint of what it does, it's not what I 
> prepared so it's a little weird to see it there.
> The ALLOW FILTERING, on the other hand, was definitely not prepared.
> {noformat}
>  session.prepare("SELECT * from random_access WHERE partition_id = ? and 
> row_id = ?")
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19534) unbounded queues in native transport requests lead to node instability

2024-04-19 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838957#comment-17838957
 ] 

Brandon Williams commented on CASSANDRA-19534:
--

FWIW, that was a single node.

> unbounded queues in native transport requests lead to node instability
> --
>
> Key: CASSANDRA-19534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19534
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Jon Haddad
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> When a node is under pressure, hundreds of thousands of requests can show up 
> in the native transport queue, and it looks like it can take way longer to 
> timeout than is configured.  We should be shedding load much more 
> aggressively and use a bounded queue for incoming work.  This is extremely 
> evident when we combine a resource consuming workload with a smaller one:
> Running 5.0 HEAD on a single node as of today:
> {noformat}
> # populate only
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --maxrlat 100 --populate 
> 10m --rate 50k -n 1
> # workload 1 - larger reads
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --rate 200 -d 1d
> # second workload - small reads
> easy-cass-stress run KeyValue -p 1m --rate 20k -r .5 -d 24h{noformat}
> It appears our results don't time out at the requested server time either:
>  
> {noformat}
>                  Writes                                  Reads                
>                   Deletes                       Errors
>   Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) | 
>   Count  Latency (p99)  1min (req/s) |   Count  1min (errors/s)
>  950286       70403.93        634.77 |  789524       70442.07        426.02 | 
>       0              0             0 | 9580484         18980.45
>  952304       70567.62         640.1 |  791072       70634.34        428.36 | 
>       0              0             0 | 9636658         18969.54
>  953146       70767.34         640.1 |  791400       70767.76        428.36 | 
>       0              0             0 | 9695272         18969.54
>  956833       71171.28        623.14 |  794009        71175.6        412.79 | 
>       0              0             0 | 9749377         19002.44
>  959627       71312.58        656.93 |  795703       71349.87        435.56 | 
>       0              0             0 | 9804907         18943.11{noformat}
>  
> After stopping the load test altogether, it took nearly a minute before the 
> requests were no longer queued.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838921#comment-17838921
 ] 

Stefan Miklosovic commented on CASSANDRA-19566:
---

I dont know how to run upgrades on my end. I dont have CircleCI plan which 
would not timeout for me and Jenkins still broken? 

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19344) Range movements involving transient replicas must safely enact changes to read and write replica sets

2024-04-19 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19344:

  Fix Version/s: 5.1
 (was: 5.x)
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra/commit/dabcb175527d3c2daef54c6ce029b3c3054b2a77
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed, thanks!

> Range movements involving transient replicas must safely enact changes to 
> read and write replica sets
> -
>
> Key: CASSANDRA-19344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19344
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.1
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> remove-n4-post-19344.txt, remove-n4-pre-19344.txt, result_details.tar.gz
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> (edit) This was originally opened due to a flaky test 
> {{org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode-_jdk17}}
> The test can fail in two different ways:
> {code:java}
> junit.framework.AssertionFailedError: NOT IN CURRENT: 31 -- [(00,20), 
> (31,50)] at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.assertAllContained(TransientRangeMovementTest.java:203)
>  at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode(TransientRangeMovementTest.java:183)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}
> as in here - 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2639/workflows/32b92ce7-5e9d-4efb-8362-d200d2414597/jobs/55139/tests#failed-test-0]
> and
> {code:java}
> junit.framework.AssertionFailedError: nodetool command [removenode, 
> 6d194555-f6eb-41d0-c000-0003, --force] was not successful stdout: 
> stderr: error: Node /127.0.0.4:7012 is alive and owns this ID. Use 
> decommission command to remove it from the ring -- StackTrace -- 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  at java.base/java.lang.Thread.run(Thread.java:833) Notifications: Error: 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at 

[jira] [Updated] (CASSANDRA-19344) Range movements involving transient replicas must safely enact changes to read and write replica sets

2024-04-19 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19344:

Status: Ready to Commit  (was: Review In Progress)

> Range movements involving transient replicas must safely enact changes to 
> read and write replica sets
> -
>
> Key: CASSANDRA-19344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19344
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.x
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> remove-n4-post-19344.txt, remove-n4-pre-19344.txt, result_details.tar.gz
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> (edit) This was originally opened due to a flaky test 
> {{org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode-_jdk17}}
> The test can fail in two different ways:
> {code:java}
> junit.framework.AssertionFailedError: NOT IN CURRENT: 31 -- [(00,20), 
> (31,50)] at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.assertAllContained(TransientRangeMovementTest.java:203)
>  at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode(TransientRangeMovementTest.java:183)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}
> as in here - 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2639/workflows/32b92ce7-5e9d-4efb-8362-d200d2414597/jobs/55139/tests#failed-test-0]
> and
> {code:java}
> junit.framework.AssertionFailedError: nodetool command [removenode, 
> 6d194555-f6eb-41d0-c000-0003, --force] was not successful stdout: 
> stderr: error: Node /127.0.0.4:7012 is alive and owns this ID. Use 
> decommission command to remove it from the ring -- StackTrace -- 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  at java.base/java.lang.Thread.run(Thread.java:833) Notifications: Error: 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> 

[jira] [Commented] (CASSANDRA-19344) Range movements involving transient replicas must safely enact changes to read and write replica sets

2024-04-19 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838909#comment-17838909
 ] 

Marcus Eriksson commented on CASSANDRA-19344:
-

+1

> Range movements involving transient replicas must safely enact changes to 
> read and write replica sets
> -
>
> Key: CASSANDRA-19344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19344
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.x
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> remove-n4-post-19344.txt, remove-n4-pre-19344.txt, result_details.tar.gz
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> (edit) This was originally opened due to a flaky test 
> {{org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode-_jdk17}}
> The test can fail in two different ways:
> {code:java}
> junit.framework.AssertionFailedError: NOT IN CURRENT: 31 -- [(00,20), 
> (31,50)] at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.assertAllContained(TransientRangeMovementTest.java:203)
>  at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode(TransientRangeMovementTest.java:183)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}
> as in here - 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2639/workflows/32b92ce7-5e9d-4efb-8362-d200d2414597/jobs/55139/tests#failed-test-0]
> and
> {code:java}
> junit.framework.AssertionFailedError: nodetool command [removenode, 
> 6d194555-f6eb-41d0-c000-0003, --force] was not successful stdout: 
> stderr: error: Node /127.0.0.4:7012 is alive and owns this ID. Use 
> decommission command to remove it from the ring -- StackTrace -- 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  at java.base/java.lang.Thread.run(Thread.java:833) Notifications: Error: 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> 

[jira] [Commented] (CASSANDRA-19344) Range movements involving transient replicas must safely enact changes to read and write replica sets

2024-04-19 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838910#comment-17838910
 ] 

Alex Petrov commented on CASSANDRA-19344:
-

+1

> Range movements involving transient replicas must safely enact changes to 
> read and write replica sets
> -
>
> Key: CASSANDRA-19344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19344
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.x
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> remove-n4-post-19344.txt, remove-n4-pre-19344.txt, result_details.tar.gz
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> (edit) This was originally opened due to a flaky test 
> {{org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode-_jdk17}}
> The test can fail in two different ways:
> {code:java}
> junit.framework.AssertionFailedError: NOT IN CURRENT: 31 -- [(00,20), 
> (31,50)] at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.assertAllContained(TransientRangeMovementTest.java:203)
>  at 
> org.apache.cassandra.distributed.test.TransientRangeMovementTest.testRemoveNode(TransientRangeMovementTest.java:183)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}
> as in here - 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2639/workflows/32b92ce7-5e9d-4efb-8362-d200d2414597/jobs/55139/tests#failed-test-0]
> and
> {code:java}
> junit.framework.AssertionFailedError: nodetool command [removenode, 
> 6d194555-f6eb-41d0-c000-0003, --force] was not successful stdout: 
> stderr: error: Node /127.0.0.4:7012 is alive and owns this ID. Use 
> decommission command to remove it from the ring -- StackTrace -- 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  at java.base/java.lang.Thread.run(Thread.java:833) Notifications: Error: 
> java.lang.UnsupportedOperationException: Node /127.0.0.4:7012 is alive and 
> owns this ID. Use decommission command to remove it from the ring at 
> org.apache.cassandra.tcm.sequences.SingleNodeSequences.removeNode(SingleNodeSequences.java:110)
>  at 
> org.apache.cassandra.service.StorageService.removeNode(StorageService.java:3682)
>  at org.apache.cassandra.tools.NodeProbe.removeNode(NodeProbe.java:1020) at 
> org.apache.cassandra.tools.nodetool.RemoveNode.execute(RemoveNode.java:51) at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.runInternal(NodeTool.java:388)
>  at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:373) at 
> org.apache.cassandra.tools.NodeTool.execute(NodeTool.java:272) at 
> org.apache.cassandra.distributed.impl.Instance$DTestNodeTool.execute(Instance.java:1129)
>  at 
> org.apache.cassandra.distributed.impl.Instance.lambda$nodetoolResult$51(Instance.java:1038)
>  at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61) at 
> org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> 

[jira] [Commented] (CASSANDRA-19566) JSON encoded timestamp value does not always match non-JSON encoded value

2024-04-19 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838901#comment-17838901
 ] 

Berenguer Blasi commented on CASSANDRA-19566:
-

You basically move everything to {{Instant}} which sgtm.

I think we should do this 4.0->trunk . Also CI should be j11 iiuc and I would 
run upgrade tests as touching serializers and toString sometimes has surprising 
side effects. Wdyt?

> JSON encoded timestamp value does not always match non-JSON encoded value
> -
>
> Key: CASSANDRA-19566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19566
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Legacy/CQL
>Reporter: Bowen Song
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description:
> "SELECT JSON ..." and "toJson(...)" on Cassandra 4.1.4 produces different 
> date than "SELECT ..."  for some timestamp type values.
>  
> Steps to reproduce:
> {code:java}
> $ sudo docker pull cassandra:4.1.4
> $ sudo docker create --name cass cassandra:4.1.4
> $ sudo docker start cass
> $ # wait for the Cassandra instance becomes ready
> $ sudo docker exec -ti cass cqlsh
> Connected to Test Cluster at 127.0.0.1:9042
> [cqlsh 6.1.0 | Cassandra 4.1.4 | CQL spec 3.4.6 | Native protocol v5]
> Use HELP for help.
> cqlsh> create keyspace test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table tbl (id int, ts timestamp, primary key (id));
> cqlsh:test> insert into tbl (id, ts) values (1, -1376701920);
> cqlsh:test> select tounixtimestamp(ts), ts, tojson(ts) from tbl where id=1;
>  system.tounixtimestamp(ts) | ts                              | 
> system.tojson(ts)
> +-+
>             -1376701920 | 1533-09-28 12:00:00.00+ | "1533-09-18 
> 12:00:00.000Z"
> (1 rows)
> cqlsh:test> select json * from tbl where id=1;
>  [json]
> -
>  {"id": 1, "ts": "1533-09-18 12:00:00.000Z"}
> (1 rows)
> {code}
>  
> Expected behaviour:
> The "select ts", "select tojson(ts)" and "select json *" should all produce 
> the same date.
>  
> Actual behaviour:
> The "select ts" produced the "1533-09-28" date but the "select tojson(ts)" 
> and "select json *" produced the "1533-09-18" date.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19132) Update use of transition plan in PrepareReplace

2024-04-19 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-19132:

Status: Review In Progress  (was: Patch Available)

> Update use of transition plan in PrepareReplace
> ---
>
> Key: CASSANDRA-19132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19132
> Project: Cassandra
>  Issue Type: Task
>  Components: Cluster/Membership
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 5.1-alpha1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When PlacementTransitionPlan was reworked to make its use more consistent 
> across join and leave operations, PrepareReplace was not updated. This could 
> now be simplified in line with the other operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (25cd5019 -> 1f33a036)

2024-04-19 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


omit 25cd5019 generate docs for 1d4732f3
 new 1f33a036 generate docs for 1d4732f3

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (25cd5019)
\
 N -- N -- N   refs/heads/asf-staging (1f33a036)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 .../managing/configuration/cass_yaml_file.html |  75 +
 content/search-index.js|   2 +-
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 6 files changed, 301 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19534) unbounded queues in native transport requests lead to node instability

2024-04-19 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1783#comment-1783
 ] 

Alex Petrov commented on CASSANDRA-19534:
-

Talked to [~brandon.williams] and checked the remains of the cluster in the bad 
state to observe that at least the symptoms match my own observations and the 
issue I have seen. 180+K of tasks in the Native queue. I am a bit surprised 
that read and write queues are almost empty (under 100 items in both), but 
depending on which node was coordinating this can be ok.

> unbounded queues in native transport requests lead to node instability
> --
>
> Key: CASSANDRA-19534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19534
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Jon Haddad
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> When a node is under pressure, hundreds of thousands of requests can show up 
> in the native transport queue, and it looks like it can take way longer to 
> timeout than is configured.  We should be shedding load much more 
> aggressively and use a bounded queue for incoming work.  This is extremely 
> evident when we combine a resource consuming workload with a smaller one:
> Running 5.0 HEAD on a single node as of today:
> {noformat}
> # populate only
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --maxrlat 100 --populate 
> 10m --rate 50k -n 1
> # workload 1 - larger reads
> easy-cass-stress run RandomPartitionAccess -p 100  -r 1 
> --workload.rows=10 --workload.select=partition --rate 200 -d 1d
> # second workload - small reads
> easy-cass-stress run KeyValue -p 1m --rate 20k -r .5 -d 24h{noformat}
> It appears our results don't time out at the requested server time either:
>  
> {noformat}
>                  Writes                                  Reads                
>                   Deletes                       Errors
>   Count  Latency (p99)  1min (req/s) |   Count  Latency (p99)  1min (req/s) | 
>   Count  Latency (p99)  1min (req/s) |   Count  1min (errors/s)
>  950286       70403.93        634.77 |  789524       70442.07        426.02 | 
>       0              0             0 | 9580484         18980.45
>  952304       70567.62         640.1 |  791072       70634.34        428.36 | 
>       0              0             0 | 9636658         18969.54
>  953146       70767.34         640.1 |  791400       70767.76        428.36 | 
>       0              0             0 | 9695272         18969.54
>  956833       71171.28        623.14 |  794009        71175.6        412.79 | 
>       0              0             0 | 9749377         19002.44
>  959627       71312.58        656.93 |  795703       71349.87        435.56 | 
>       0              0             0 | 9804907         18943.11{noformat}
>  
> After stopping the load test altogether, it took nearly a minute before the 
> requests were no longer queued.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19574) LIKE_MATCHES should be an equality

2024-04-19 Thread Berenguer Blasi (Jira)
Berenguer Blasi created CASSANDRA-19574:
---

 Summary: LIKE_MATCHES should be an equality
 Key: CASSANDRA-19574
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19574
 Project: Cassandra
  Issue Type: Bug
Reporter: Berenguer Blasi
Assignee: Berenguer Blasi


The operator LIKE_MATCHES is 
[currently|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/Operator.java#L245]
 coded as a 'contains' instead of an 'equality'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19574) LIKE_MATCHES should be an equality

2024-04-19 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-19574:

 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
  Component/s: CQL/Semantics
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> LIKE_MATCHES should be an equality
> --
>
> Key: CASSANDRA-19574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19574
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> The operator LIKE_MATCHES is 
> [currently|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/Operator.java#L245]
>  coded as a 'contains' instead of an 'equality'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19574) LIKE_MATCHES should be an equality

2024-04-19 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-19574:

Fix Version/s: 4.0.x
   4.1.x
   5.0.x
   5.x

> LIKE_MATCHES should be an equality
> --
>
> Key: CASSANDRA-19574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19574
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> The operator LIKE_MATCHES is 
> [currently|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/Operator.java#L245]
>  coded as a 'contains' instead of an 'equality'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org