[jira] [Commented] (CASSANDRA-18635) Test failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest

2024-02-14 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817593#comment-17817593
 ] 

Berenguer Blasi commented on CASSANDRA-18635:
-

^ Ahh, so if I understand correctly this might be something different, since 
Andres bisected this ticket's failure to CASSANDRA-17851. Thanks!

> Test failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest
> ---
>
> Key: CASSANDRA-18635
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18635
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Brandon Williams
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> Seen here: 
> https://app.circleci.com/pipelines/github/driftx/cassandra/1095/workflows/6114e2e3-8dcc-4bb0-b664-ae7d82c3349f/jobs/33405/tests
> {noformat}
> junit.framework.AssertionFailedError: expected:<0> but was:<2>
>   at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.upgradeSSTablesInterruptsOngoingCompaction(UpgradeSSTablesTest.java:86)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18762) Repair triggers OOM with direct buffer memory

2024-02-14 Thread Manish Khandelwal (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817590#comment-17817590
 ] 

Manish Khandelwal commented on CASSANDRA-18762:
---

We are also seeing the same issue on a multi-DC setup. In a single DC, things 
run fine for 11 nodes, but once another DC is added repairs start to fail 
pretty quickly, with the same error as mentioned in this issue. Running repair 
table by table is successful most of the time, but a keyspace-level repair 
always fails for one of the keyspaces. That keyspace has three tables, all 
STCS, with one table having almost no data. We tried setting 
*-XX:MaxDirectMemorySize* but the results are the same, i.e., out of memory. We 
are on Java 8 and Cassandra 4.0.10. I think this should be easy to reproduce 
with a multi-DC setup.
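
For reference, this is roughly how we set it (a minimal sketch assuming the 
stock conf/jvm8-server.options; the 8G value is only an example):

{noformat}
# conf/jvm8-server.options -- cap direct (off-heap) buffer allocations
-XX:MaxDirectMemorySize=8G
{noformat}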

> Repair triggers OOM with direct buffer memory
> -
>
> Key: CASSANDRA-18762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18762
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Brad Schoening
>Priority: Normal
>  Labels: OutOfMemoryError
> Attachments: Cluster-dm-metrics-1.PNG, 
> image-2023-12-06-15-28-05-459.png, image-2023-12-06-15-29-31-491.png, 
> image-2023-12-06-15-58-55-007.png
>
>
> We are seeing repeated failures of nodes with 16GB of heap on a VM with 32GB 
> of physical RAM due to direct memory.  This seems to be related to 
> CASSANDRA-15202 which moved Merkel trees off-heap in 4.0.   Using Cassandra 
> 4.0.6 with Java 11.
> {noformat}
> 2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from 
> /169.102.200.241:7000
> 2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.93.192.29:7000
> 2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from 
> /169.104.171.134:7000
> 2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.79.232.67:7000
> 2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 
> ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. 
> Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; 
> G1 Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; 
> Metaspace: 80411136 -> 80176528
> 2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error 
> letting the JVM handle the error:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
> at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
> at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
> at 
> org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
> at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
> at 
> org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
> at 
> org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
> at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> 

[jira] [Commented] (CASSANDRA-19120) local consistencies may get timeout if blocking read repair is sending the read repair mutation to other DC

2024-02-14 Thread Runtian Liu (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817582#comment-17817582
 ] 

Runtian Liu commented on CASSANDRA-19120:
-

Updated the four PRs:

4.0: https://github.com/apache/cassandra/pull/2981

4.1: [https://github.com/apache/cassandra/pull/3019]

5.0: [https://github.com/apache/cassandra/pull/3020]

trunk: [https://github.com/apache/cassandra/pull/3021]

> local consistencies may get timeout if blocking read repair is sending the 
> read repair mutation to other DC 
> 
>
> Key: CASSANDRA-19120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19120
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Runtian Liu
>Assignee: Runtian Liu
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: image-2023-11-29-15-26-08-056.png, signature.asc
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For a cluster with two DCs: when a new node is being added to DC1, a blocking 
> read repair triggered by a LOCAL_QUORUM read in DC1 needs to send the read 
> repair mutation to an extra node(1)(2). The selector for read repair may pick 
> *ANY* node that has not been contacted before(3) instead of preferring DC1 
> nodes. If a node from DC2 is selected, this causes a 100% timeout because of 
> the bug described below:
> When we initialize the latch(4) for blocking read repair, the shouldBlockOn 
> function only returns true for local nodes(5), and the blockFor value is 
> reduced if a local node doesn't require repair(6). blockFor therefore equals 
> the number of read repair mutations sent out. But when the coordinator 
> receives responses from the target nodes, the latch only counts down for 
> nodes in the same DC(7). If a mutation went to a DC2 node, the latch waits 
> until it expires and the read request times out.
> This can be reproduced by running a constant read-after-write LOCAL_QUORUM 
> load (e.g. from the stress tool, enough to trigger blocking read repair) 
> against a 3 + 3 cluster while a node is being added to the local DC. You will 
> see read timeouts from time to time because of the bug described above.
>  
> I think that when read repair selects the extra node to repair, it should 
> prefer local nodes over nodes from the other DC. We also need to fix the 
> latch accounting so that even if we send a mutation to nodes in another DC, 
> we don't get a timeout. (A toy sketch of the latch mismatch follows the 
> reference links below.)
> (1)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/locator/ReplicaPlans.java#L455]
> (2)[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L183]
> (3)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/locator/ReplicaPlans.java#L458]
> (4)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L96]
> (5)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L71]
> (6)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L88]
> (7)[https://github.com/apache/cassandra/blob/cassandra-4.0.11/src/java/org/apache/cassandra/service/reads/repair/BlockingPartitionRepair.java#L113]
>  
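
To make the mismatch concrete, here is a toy model of it (a minimal sketch, not 
the actual BlockingPartitionRepair code): blockFor counts every repair mutation 
sent, but only acks from local-DC nodes count the latch down, so a single 
remote-DC target leaves the latch stuck until the read times out.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchMismatchDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        // one repair mutation goes to a DC1 (local) node, one to a DC2 node
        boolean[] targetIsLocalDc = { true, false };

        // blockFor counts every mutation sent, local or not
        CountDownLatch latch = new CountDownLatch(targetIsLocalDc.length);

        // acks arrive from both targets, but only local ones pass shouldBlockOn()
        for (boolean local : targetIsLocalDc)
            if (local)
                latch.countDown(); // the DC2 ack is silently dropped

        // the latch is stuck at 1, so the coordinator waits out the full timeout
        System.out.println("completed = " + latch.await(1, TimeUnit.SECONDS)); // false
    }
}
{code}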



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19400) IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to maximize availability

2024-02-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19400:

 Bug Category: Parent values: Availability(12983)Level 1 values: 
Unavailable(12994)
Discovered By: Fuzz Test
 Severity: Low

> IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to 
> maximize availability
> ---
>
> Key: CASSANDRA-19400
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19400
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Caleb Rackliffe
>Priority: Low
> Fix For: 5.0.x, 5.x
>
>
> {{IndexStatusManager}} is responsible for knowing what SAI indexes are 
> queryable across the ring, endpoint by endpoint. There are two statuses that 
> SAI treats as queryable, but it should not treat them equally. 
> {{BUILD_SUCCEEDED}} means the index is definitely available and should be 
> able to serve queries without issue. {{UNKNOWN}} indicates that the status of 
> the index hasn’t propagated yet to this coordinator. It may be just fine, or 
> it may not be. If it isn’t, a query will not return incorrect results, but it 
> will fail. If there are enough {{BUILD_SUCCEEDED}} replicas, we should ignore 
> {{UNKNOWN}} replicas and maximize availability. If an {{UNKNOWN}} replica is 
> going to become {{BUILD_SUCCEEDED}} shortly, it will happily start taking 
> requests at that point and spread the load. If not, we’ll avoid futile 
> attempts to query it too early.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19400) IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to maximize availability

2024-02-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19400:

  Workflow: Copy of Cassandra Bug Workflow  (was: Copy of Cassandra Default 
Workflow)
Issue Type: Bug  (was: Improvement)

> IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to 
> maximize availability
> ---
>
> Key: CASSANDRA-19400
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19400
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> {{IndexStatusManager}} is responsible for knowing what SAI indexes are 
> queryable across the ring, endpoint by endpoint. There are two statuses that 
> SAI treats as queryable, but it should not treat them equally. 
> {{BUILD_SUCCEEDED}} means the index is definitely available and should be 
> able to serve queries without issue. {{UNKNOWN}} indicates that the status of 
> the index hasn’t propagated yet to this coordinator. It may be just fine, or 
> it may not be. If it isn’t, a query will not return incorrect results, but it 
> will fail. If there are enough {{BUILD_SUCCEEDED}} replicas, we should ignore 
> {{UNKNOWN}} replicas and maximize availability. If an {{UNKNOWN}} replica is 
> going to become {{BUILD_SUCCEEDED}} shortly, it will happily start taking 
> requests at that point and spread the load. If not, we’ll avoid futile 
> attempts to query it too early.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19400) IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to maximize availability

2024-02-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19400:

Change Category: Operability
 Complexity: Normal
  Fix Version/s: 5.0.x
 5.x
 Status: Open  (was: Triage Needed)

> IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to 
> maximize availability
> ---
>
> Key: CASSANDRA-19400
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19400
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI
>Reporter: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> {{IndexStatusManager}} is responsible for knowing what SAI indexes are 
> queryable across the ring, endpoint by endpoint. There are two statuses that 
> SAI treats as queryable, but it should not treat them equally. 
> {{BUILD_SUCCEEDED}} means the index is definitely available and should be 
> able to serve queries without issue. {{UNKNOWN}} indicates that the status of 
> the index hasn’t propagated yet to this coordinator. It may be just fine, or 
> it may not be. If it isn’t, a query will not return incorrect results, but it 
> will fail. If there are enough {{BUILD_SUCCEEDED}} replicas, we should ignore 
> {{UNKNOWN}} replicas and maximize availability. If an {{UNKNOWN}} replica is 
> going to become {{BUILD_SUCCEEDED}} shortly, it will happily start taking 
> requests at that point and spread the load. If not, we’ll avoid futile 
> attempts to query it too early.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19400) IndexStatusManager needs to prioritize SUCCESS over UNKNOWN states to maximize availability

2024-02-14 Thread Caleb Rackliffe (Jira)
Caleb Rackliffe created CASSANDRA-19400:
---

 Summary: IndexStatusManager needs to prioritize SUCCESS over 
UNKNOWN states to maximize availability
 Key: CASSANDRA-19400
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19400
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/SAI
Reporter: Caleb Rackliffe


{{IndexStatusManager}} is responsible for knowing what SAI indexes are 
queryable across the ring, endpoint by endpoint. There are two statuses that 
SAI treats as queryable, but it should not treat them equally. 
{{BUILD_SUCCEEDED}} means the index is definitely available and should be able 
to serve queries without issue. {{UNKNOWN}} indicates that the status of the 
index hasn’t propagated yet to this coordinator. It may be just fine, or it may 
not be. If it isn’t, a query will not return incorrect results, but it will 
fail. If there are enough {{BUILD_SUCCEEDED}} replicas, we should ignore 
{{UNKNOWN}} replicas and maximize availability. If an {{UNKNOWN}} replica is going 
to become {{BUILD_SUCCEEDED}} shortly, it will happily start taking requests at 
that point and spread the load. If not, we’ll avoid futile attempts to query it 
too early.
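
A minimal sketch of the replica ordering this suggests (hypothetical names and 
shape, not the actual IndexStatusManager API):

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

enum IndexStatus { BUILD_SUCCEEDED, UNKNOWN, BUILD_FAILED }

final class ReplicaOrdering
{
    /** Returns queryable endpoints with confirmed-ready replicas first. */
    static List<String> orderForQuery(Map<String, IndexStatus> statusByEndpoint)
    {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, IndexStatus> e : statusByEndpoint.entrySet())
            if (e.getValue() != IndexStatus.BUILD_FAILED) // UNKNOWN stays queryable
                candidates.add(e.getKey());

        // BUILD_SUCCEEDED first, UNKNOWN last
        candidates.sort(Comparator.comparingInt(
            ep -> statusByEndpoint.get(ep) == IndexStatus.BUILD_SUCCEEDED ? 0 : 1));
        return candidates;
    }
}
{code}

With that ordering, {{UNKNOWN}} endpoints are only contacted when there are too 
few {{BUILD_SUCCEEDED}} replicas to satisfy the consistency level.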



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18667) Add multi-threaded SAI read and write fuzz test

2024-02-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-18667:

Epic Link: CASSANDRA-19224  (was: CASSANDRA-18473)

> Add multi-threaded SAI read and write fuzz test
> ---
>
> Key: CASSANDRA-18667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18667
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI
>Reporter: Mike Adamson
>Priority: Normal
>
> We currently don't have a basic unit test that does multi-threaded reads and 
> writes to the index. We should add one to avoid potential basic concurrency 
> errors.
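
The shape of such a test might look like the following (a rough sketch using 
the Java driver against a running cluster; the real unit test would use the 
in-tree SAI test harness, and the keyspace/table names here are placeholders):

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentSaiSmokeTest
{
    public static void main(String[] args) throws Exception
    {
        try (CqlSession session = CqlSession.builder().build())
        {
            session.execute("CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
                            + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
            session.execute("CREATE TABLE IF NOT EXISTS ks.t (k int PRIMARY KEY, a int)");
            session.execute("CREATE INDEX IF NOT EXISTS ON ks.t(a) USING 'sai'");

            ExecutorService pool = Executors.newFixedThreadPool(8);
            for (int t = 0; t < 8; t++)
            {
                final int seed = t;
                pool.submit(() -> {
                    // interleave writes and index reads to shake out concurrency bugs
                    for (int n = 0; n < 1000; n++)
                    {
                        session.execute("INSERT INTO ks.t (k, a) VALUES (?, ?)", seed * 1000 + n, n);
                        session.execute("SELECT * FROM ks.t WHERE a = ?", n);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.MINUTES);
        }
    }
}
{code}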



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18940) SAI post-filtering reads don't update local table latency metrics

2024-02-14 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-18940:

Epic Link: CASSANDRA-19224  (was: CASSANDRA-18473)

> SAI post-filtering reads don't update local table latency metrics
> -
>
> Key: CASSANDRA-18940
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18940
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index, Feature/SAI, Observability/Metrics
>Reporter: Caleb Rackliffe
>Assignee: Mike Adamson
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: 
> draft_fix_for_SAI_post-filtering_reads_not_updating_local_table_metrics.patch
>
>
> Once an SAI index finds matches (primary keys), it reads the associated rows 
> and post-filters them to incorporate partial writes, tombstones, etc. 
> However, those reads are not currently updating the local table latency 
> metrics. It should be simple enough to attach a metrics recording 
> transformation to the iterator produced by querying local storage. (I've 
> attached a patch that should apply cleanly to trunk, but there may be a 
> better way...)
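
One generic way to express that idea (a sketch with hypothetical names; the 
attached draft patch works against Cassandra's internal iterators, which are 
not reproduced here):

{code:java}
import java.util.Iterator;
import java.util.function.LongConsumer;

/** Wraps a local read's iterator and records elapsed time when it is closed. */
final class LatencyRecordingIterator<T> implements Iterator<T>, AutoCloseable
{
    private final Iterator<T> delegate;
    private final LongConsumer latencyNanosRecorder; // e.g. the table's read latency metric
    private final long startNanos = System.nanoTime();

    LatencyRecordingIterator(Iterator<T> delegate, LongConsumer latencyNanosRecorder)
    {
        this.delegate = delegate;
        this.latencyNanosRecorder = latencyNanosRecorder;
    }

    public boolean hasNext() { return delegate.hasNext(); }
    public T next() { return delegate.next(); }

    @Override
    public void close() // record once the post-filtering read is fully consumed
    {
        latencyNanosRecorder.accept(System.nanoTime() - startNanos);
    }
}
{code}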



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRASC-107) Improve logging for slice restore task

2024-02-14 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRASC-107:
-

 Summary: Improve logging for slice restore task
 Key: CASSANDRASC-107
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-107
 Project: Sidecar for Apache Cassandra
  Issue Type: Improvement
Reporter: Yifan Cai


I want to propose some logging improvements.
Add more logs to the individual steps during the restore task, i.e. in 
RestoreSliceTask and StorageClient.
In other places, like retrying the poll for object existence, the stack trace 
can be omitted, as it provides no additional knowledge beyond "object not 
found".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRASC-106) Add restore task watcher to report long running tasks

2024-02-14 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRASC-106:
-

 Summary: Add restore task watcher to report long running tasks
 Key: CASSANDRASC-106
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-106
 Project: Sidecar for Apache Cassandra
  Issue Type: Improvement
Reporter: Yifan Cai


Having a watcher that reports long-running restore slice tasks would provide 
better insight.
The watcher can live inside the RestoreProcessor and periodically examine the 
futures of the running tasks.
Ideally, it signals the task to log its current stack trace, as sketched below.
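
A rough sketch of what such a watcher could look like (hypothetical names; the 
real one would live inside RestoreProcessor and use its task bookkeeping and 
logging):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class LongRunningTaskWatcher
{
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    void watch(Map<String, Future<?>> runningTasks, long thresholdMillis)
    {
        Map<String, Long> startTimes = new ConcurrentHashMap<>();
        runningTasks.keySet().forEach(id -> startTimes.put(id, System.currentTimeMillis()));

        scheduler.scheduleAtFixedRate(() -> {
            long now = System.currentTimeMillis();
            runningTasks.forEach((id, future) -> {
                long elapsed = now - startTimes.get(id);
                if (!future.isDone() && elapsed > thresholdMillis)
                    // ideally this would signal the task to log its stack trace
                    System.err.println("Restore task " + id + " running for " + elapsed + " ms");
            });
        }, 1, 1, TimeUnit.MINUTES);
    }
}
{code}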



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRASC-105) RestoreSliceTask could be stuck due to missing exception handling

2024-02-14 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRASC-105:
-

 Summary: RestoreSliceTask could be stuck due to missing exception 
handling
 Key: CASSANDRASC-105
 URL: https://issues.apache.org/jira/browse/CASSANDRASC-105
 Project: Sidecar for Apache Cassandra
  Issue Type: Bug
  Components: Rest API
Reporter: Yifan Cai


In RestoreSliceTask, there are a few places that can throw exceptions but lack 
exception handling at the call sites. As a result, the RestoreSliceTask never 
fulfills its promise, i.e. the task is stuck.
For example, downloadObjectIfAbsent can throw instead of returning a failed 
future, in which case the task never fails or completes.
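
The shape of the fix might look like this (a condensed sketch using Vert.x 
types, which Sidecar builds on; names are simplified):

{code:java}
import io.vertx.core.Future;
import io.vertx.core.Promise;

final class RestoreSliceTaskSketch
{
    void process(Promise<Void> promise)
    {
        try
        {
            download() // may throw synchronously instead of returning a failed future
                .onSuccess(v -> promise.tryComplete())
                .onFailure(promise::tryFail);
        }
        catch (Throwable t)
        {
            // without this catch the promise is never fulfilled and the task is stuck
            promise.tryFail(t);
        }
    }

    // stand-in for downloadObjectIfAbsent
    Future<Void> download() { return Future.succeededFuture(); }
}
{code}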



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) 01/01: Merge branch 'cassandra-5.0' into trunk

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 48607b83952ab923c401fd7886d76a5a5a5b3c78
Merge: 8bdf2615bc a04dc83cfc
Author: Stefan Miklosovic 
AuthorDate: Wed Feb 14 21:19:10 2024 +0100

Merge branch 'cassandra-5.0' into trunk



-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cassandra-5.0 updated (8b037a6c84 -> a04dc83cfc)

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 8b037a6c84 Deprecate native_transport_port_ssl
 add a9a7dd0caf increment version to 4.1.5
 add a04dc83cfc Merge branch 'cassandra-4.1' into cassandra-5.0

No new revisions were added by this update.

Summary of changes:


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated (8bdf2615bc -> 48607b8395)

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 8bdf2615bc Merge branch 'cassandra-5.0' into trunk
 add a9a7dd0caf increment version to 4.1.5
 add a04dc83cfc Merge branch 'cassandra-4.1' into cassandra-5.0
 new 48607b8395 Merge branch 'cassandra-5.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cassandra-4.1 updated (89a8155916 -> a9a7dd0caf)

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-4.1
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 89a8155916 Merge branch 'cassandra-4.0' into cassandra-4.1
 add a9a7dd0caf increment version to 4.1.5

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt  | 6 ++
 build.xml| 2 +-
 debian/changelog | 6 ++
 3 files changed, 13 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19018) An SAI-specific mechanism to ensure consistency isn't violated for multi-column (i.e. AND) queries at CL > ONE

2024-02-14 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817515#comment-17817515
 ] 

Caleb Rackliffe commented on CASSANDRA-19018:
-

I've finally narrowed in on a concrete repro for the range tombstone problems...

{noformat}
@Test
public void testPartialUpdatesWithDeleteBetween()
{
    CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.partial_updates (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(a) USING 'sai'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(b) USING 'sai'"));
    SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);

    // insert a split row w/ a range tombstone sandwiched in the middle temporally
    CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c, a) VALUES (0, 1, 1) USING TIMESTAMP 1"));
    CLUSTER.get(2).executeInternal(withKeyspace("DELETE FROM %s.partial_updates USING TIMESTAMP 2 WHERE k = 0 AND c > 0"));
    CLUSTER.get(2).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c, b) VALUES (0, 1, 2) USING TIMESTAMP 3"));

    String select = withKeyspace("SELECT * FROM %s.partial_updates WHERE a = 1 AND b = 2");
    Object[][] initialRows = CLUSTER.coordinator(1).execute(select, ConsistencyLevel.ALL);
    assertRows(initialRows); // <-- This returns a row when it shouldn't!
}
{noformat}

tl;dr Because we can degrade intersections to unions inside SAI on unrepaired 
data, replica filtering protection (RFP) no longer implicitly covers all delete 
cases without sending range tombstones to the coordinator or identifying silent 
replicas at a finer granularity than the row level. In the case above, RFP 
could be made to work if it identified "silent" columns rather than entire 
rows. (i.e. It would notice that "a" from node 1 has no corresponding value 
from node 2, so the response from node 2 needs to be protected. Assuming data 
isn't always horrifically out of date, this is likely better than trying to 
send mostly unnecessary RTs.)

> An SAI-specific mechanism to ensure consistency isn't violated for 
> multi-column (i.e. AND) queries at CL > ONE
> --
>
> Key: CASSANDRA-19018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Feature/SAI
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: ci_summary-1.html, ci_summary.html, 
> result_details.tar-1.gz, result_details.tar.gz
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around 
> filtering/index queries that use intersection/AND over partially updated 
> non-key columns. (ex. Restricting one clustering column and one normal column 
> does not cause a consistency problem, as primary keys cannot be partially 
> updated.) This issue exists to attempt to fix this specifically for SAI in 
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial 
> matches from the coordinator that would combine there to form full matches. 
> Simple non-index filtering queries also suffer from this problem, but they 
> hide the partial matches in a different way. I'll outline a possible solution 
> for this in the comments that takes advantage of replica filtering protection 
> and 

[jira] [Comment Edited] (CASSANDRA-19168) Test Failure: VectorUpdateDeleteTest fails with heap_buffers

2024-02-14 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817046#comment-17817046
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-19168 at 2/14/24 8:07 PM:
--

Thanks!

The patch was squashed and updated with your suggestion (new branch for 5.0 so 
we easily compare with the PR):

5.0 - [https://github.com/ekaterinadimitrova2/cassandra/tree/C-19168-5.0-final]

trunk - [https://github.com/ekaterinadimitrova2/cassandra/tree/C-19168-trunk]

Running CI at the moment:

5.0 - 
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=C-19168-5.0-final]

trunk - 
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=C-19168-trunk]

I also ran the updateTest with all possible options for 
memtable_allocation_type locally on both branches; it completed successfully. 

*5.0 CI failures:*
truncateWhileUpgrading-_jdk17 - -possibly related to CASSANDRA-18635, checking 
with Berenguer- Different one - CASSANDRA-19398

*trunk CI failures:*
 - test_consistent_range_movement_true_with_replica_down_should_fail - seems  
unrelated, I will check and open a ticket
 - 
testOptionalMtlsModeDoNotAllowNonSSLConnections-cassandra.testtag_IS_UNDEFINED 
- known from CASSANDRA-19239
 - test_move_single_node_localhost - known from CASSANDRA-19226
 - test_authorization_handle_unavailable - known from CASSANDRA-19217
 - org.apache.cassandra.simulator.test.HarrySimulatorTest - known from 
CASSANDRA-19279
 - test_stop_failure_policy - known from CASSANDRA-19100
 - optionalTlsConnectionAllowedToRegularPortTest-cassandra.testtag_IS_UNDEFINED 
and 
testOptionalMtlsModeDoNotAllowNonSSLConnections-cassandra.testtag_IS_UNDEFINED 
- known from CASSANDRA-19239


was (Author: e.dimitrova):
Thanks!

The patch was squashed and updated with your suggestion (new branch for 5.0 so 
we easily compare with the PR):

5.0 - [https://github.com/ekaterinadimitrova2/cassandra/tree/C-19168-5.0-final]

trunk - [https://github.com/ekaterinadimitrova2/cassandra/tree/C-19168-trunk]

Running CI at the moment:

5.0 - 
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=C-19168-5.0-final]

trunk - 
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=C-19168-trunk]

I also ran the updateTest with all possible options for 
memtable_allocation_type locally on both branches; it completed successfully. 

*5.0 CI failures:*
truncateWhileUpgrading-_jdk17 - possible related to CASSANDRA-18635, checking 
with Berenguer

*trunk CI failures:*
- test_consistent_range_movement_true_with_replica_down_should_fail - seems  
unrelated, I will check and open a ticket
- 
testOptionalMtlsModeDoNotAllowNonSSLConnections-cassandra.testtag_IS_UNDEFINED 
- known from CASSANDRA-19239
- test_move_single_node_localhost - known from CASSANDRA-19226
- test_authorization_handle_unavailable - known from CASSANDRA-19217
- org.apache.cassandra.simulator.test.HarrySimulatorTest - known from 
CASSANDRA-19279
- test_stop_failure_policy - known from CASSANDRA-19100
- optionalTlsConnectionAllowedToRegularPortTest-cassandra.testtag_IS_UNDEFINED 
and 
testOptionalMtlsModeDoNotAllowNonSSLConnections-cassandra.testtag_IS_UNDEFINED 
- known from CASSANDRA-19239

> Test Failure: VectorUpdateDeleteTest fails with heap_buffers
> 
>
> Key: CASSANDRA-19168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19168
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Vector Search
>Reporter: Branimir Lambov
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} 
> fails with
> {code}
> junit.framework.AssertionFailedError: Result set does not contain a row with 
> pk = 0
>   at 
> org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133)
>   at 
> org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18635) Test failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest

2024-02-14 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817513#comment-17817513
 ] 

Ekaterina Dimitrova commented on CASSANDRA-18635:
-

Thanks, I opened CASSANDRA-19398.

My testing shows that the test was not failing when it was introduced, but it 
fails on current 5.0. 

> Test failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest
> ---
>
> Key: CASSANDRA-18635
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18635
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/java
>Reporter: Brandon Williams
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> Seen here: 
> https://app.circleci.com/pipelines/github/driftx/cassandra/1095/workflows/6114e2e3-8dcc-4bb0-b664-ae7d82c3349f/jobs/33405/tests
> {noformat}
> junit.framework.AssertionFailedError: expected:<0> but was:<2>
>   at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.upgradeSSTablesInterruptsOngoingCompaction(UpgradeSSTablesTest.java:86)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading

2024-02-14 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19398:

Fix Version/s: 5.x

> Test Failure: 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
> --
>
> Key: CASSANDRA-19398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19398
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0]
> {code:java}
> junit.framework.AssertionFailedError at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-builds) branch trunk updated: Ninja fix the ninja fix

2024-02-14 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/trunk by this push:
 new d995eb4  Ninja fix the ninja fix
d995eb4 is described below

commit d995eb4d9440c9752c93ea4692ea8ac0d42d46b5
Author: Brandon Williams 
AuthorDate: Wed Feb 14 14:04:32 2024 -0600

Ninja fix the ninja fix
---
 cassandra-release/finish_release.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cassandra-release/finish_release.sh 
b/cassandra-release/finish_release.sh
index 143a7ea..7a777d2 100755
--- a/cassandra-release/finish_release.sh
+++ b/cassandra-release/finish_release.sh
@@ -278,6 +278,6 @@ echo ' 7) update #cassandra topic on slack'
 echo ' 8) tweet from @cassandra'
 echo ' 9) release version in JIRA'
 echo ' 10) remove old version (eg: `svn rm 
https://dist.apache.org/repos/dist/release/cassandra/`)'
-echo ' 11) increment build.xml (base.version), CHANGES.txt, and  
ubuntu2004_test.docker (ccm's installed) for the next release'
+echo ' 11) increment build.xml (base.version), CHANGES.txt, and  
ubuntu2004_test.docker (ccm installed) for the next release'
 echo ' 12) Add release in 
https://reporter.apache.org/addrelease.html?cassandra (same as instructions in 
email you will receive from the \"Apache Reporter Service\")'
 echo ' 13) update current_ version in 
cassandra-dtest/upgrade_tests/upgrade_manifest.py'


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading

2024-02-14 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817510#comment-17817510
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19398:
-

Not reproduced on the commit that introduced the test:

 
{code:java}
.circleci/generate.sh -h \
  -e REPEATED_UTEST_TARGET=test-jvm-dtest-some \
  -e 
REPEATED_UTEST_CLASS=org.apache.cassandra.distributed.test.UpgradeSSTablesTest \
  -e REPEATED_UTEST_METHODS=truncateWhileUpgrading \
  -e REPEATED_UTEST_COUNT=2000
{code}
[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2649/workflows/b9524d18-a142-4394-b6ee-f41fef6da93d]

Reproduced on current 5.0:
{code:java}
.circleci/generate.sh -ps \
  -e REPEATED_JVM_DTESTS=org.apache.cassandra.distributed.test.UpgradeSSTablesTest#truncateWhileUpgrading \
  -e REPEATED_JVM_DTESTS_COUNT=2000
{code}

https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra?branch=19398-5.0

 

> Test Failure: 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
> --
>
> Key: CASSANDRA-19398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19398
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0.x
>
>
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0]
> {code:java}
> junit.framework.AssertionFailedError at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading

2024-02-14 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19398:

 Bug Category: Parent values: Correctness(12982)Level 1 values: Test 
Failure(12990)
   Complexity: Normal
  Component/s: CI
Discovered By: User Report
Fix Version/s: 5.0-rc
   (was: 5.0.x)
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Test Failure: 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
> --
>
> Key: CASSANDRA-19398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19398
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc
>
>
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0]
> {code:java}
> junit.framework.AssertionFailedError at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19180) Support reloading certificate stores in cassandra-java-driver

2024-02-14 Thread Bret McGuire (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817502#comment-17817502
 ] 

Bret McGuire commented on CASSANDRA-19180:
--

Thanks [~brandon.williams]! With your +1 we have two approvals from 
committers, so we're all set!

> Support reloading certificate stores in cassandra-java-driver
> -
>
> Key: CASSANDRA-19180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19180
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Client/java-driver
>Reporter: Abe Ratnofsky
>Assignee: Abe Ratnofsky
>Priority: Normal
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, apache/cassandra-java-driver does not reload SSLContext when the 
> underlying certificate store files change. When the DefaultSslEngineFactory 
> (and the other factories) are set up, they build a fixed instance of 
> javax.net.ssl.SSLContext that doesn't change: 
> https://github.com/apache/cassandra-java-driver/blob/12e3e3ea027c51c5807e5e46ba542f894edfa4e7/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java#L74
> This fixed SSLContext is used to negotiate SSL with the cluster, and if a 
> keystore is reloaded on disk it isn't picked up by the driver, and future 
> reconnections will fail if the keystore certificates have expired by the time 
> they're used to handshake a new connection.
> We should reload client certificates so that applications that provide them 
> can use short-lived certificates and not require a bounce to pick up new 
> certificates. This is especially relevant in a world with CASSANDRA-18554 and 
> broad use of mTLS.
> I have a patch for this that is nearly ready. Now that the project has moved 
> under apache/ - who can I work with to understand how CI works now?
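
For context, the driver configuration shape this adds (adapted from the example 
in the commits below; the 30-minute interval is just an illustration):

{noformat}
datastax-java-driver {
  advanced.ssl-engine-factory {
    class = DefaultSslEngineFactory
    keystore-path = /path/to/client.keystore
    keystore-password = password123
    keystore-reload-interval = 30 minutes
  }
}
{noformat}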



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-java-driver) 01/03: CASSANDRA-19180: Support reloading keystore in cassandra-java-driver

2024-02-14 Thread absurdfarce
This is an automated email from the ASF dual-hosted git repository.

absurdfarce pushed a commit to branch 4.x
in repository https://gitbox.apache.org/repos/asf/cassandra-java-driver.git

commit 8e73232102d6275b4f13de9d089d3a9b224c9727
Author: Abe Ratnofsky 
AuthorDate: Thu Jan 18 14:20:44 2024 -0500

CASSANDRA-19180: Support reloading keystore in cassandra-java-driver
---
 .../api/core/config/DefaultDriverOption.java   |   6 +
 .../driver/api/core/config/TypedDriverOption.java  |   6 +
 .../internal/core/ssl/DefaultSslEngineFactory.java |  35 +--
 .../core/ssl/ReloadingKeyManagerFactory.java   | 257 +++
 core/src/main/resources/reference.conf |   7 +
 .../core/ssl/ReloadingKeyManagerFactoryTest.java   | 272 +
 .../ReloadingKeyManagerFactoryTest/README.md   |  39 +++
 .../certs/client-alternate.keystore| Bin 0 -> 2467 bytes
 .../certs/client-original.keystore | Bin 0 -> 2457 bytes
 .../certs/client.truststore| Bin 0 -> 1002 bytes
 .../certs/server.keystore  | Bin 0 -> 2407 bytes
 .../certs/server.truststore| Bin 0 -> 1890 bytes
 manual/core/ssl/README.md  |  10 +-
 upgrade_guide/README.md|  11 +
 14 files changed, 627 insertions(+), 16 deletions(-)

diff --git 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
index 4c0668570..c10a8237c 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
@@ -255,6 +255,12 @@ public enum DefaultDriverOption implements DriverOption {
* Value-type: {@link String}
*/
   SSL_KEYSTORE_PASSWORD("advanced.ssl-engine-factory.keystore-password"),
+  /**
+   * The duration between attempts to reload the keystore.
+   *
+   * Value-type: {@link java.time.Duration}
+   */
+  
SSL_KEYSTORE_RELOAD_INTERVAL("advanced.ssl-engine-factory.keystore-reload-interval"),
   /**
* The location of the truststore file.
*
diff --git 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/TypedDriverOption.java
 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/TypedDriverOption.java
index ec3607973..88c012fa3 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/TypedDriverOption.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/TypedDriverOption.java
@@ -235,6 +235,12 @@ public class TypedDriverOption {
   /** The keystore password. */
   public static final TypedDriverOption<String> SSL_KEYSTORE_PASSWORD =
   new TypedDriverOption<>(DefaultDriverOption.SSL_KEYSTORE_PASSWORD, GenericType.STRING);
+
+  /** The duration between attempts to reload the keystore. */
+  public static final TypedDriverOption<Duration> SSL_KEYSTORE_RELOAD_INTERVAL =
+  new TypedDriverOption<>(
+  DefaultDriverOption.SSL_KEYSTORE_RELOAD_INTERVAL, GenericType.DURATION);
+
   /** The location of the truststore file. */
   public static final TypedDriverOption<String> SSL_TRUSTSTORE_PATH =
   new TypedDriverOption<>(DefaultDriverOption.SSL_TRUSTSTORE_PATH, GenericType.STRING);
diff --git 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
index 085b36dc5..55a6e9c7d 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
@@ -27,11 +27,12 @@ import java.io.InputStream;
 import java.net.InetSocketAddress;
 import java.net.SocketAddress;
 import java.nio.file.Files;
+import java.nio.file.Path;
 import java.nio.file.Paths;
 import java.security.KeyStore;
 import java.security.SecureRandom;
+import java.time.Duration;
 import java.util.List;
-import javax.net.ssl.KeyManagerFactory;
 import javax.net.ssl.SSLContext;
 import javax.net.ssl.SSLEngine;
 import javax.net.ssl.SSLParameters;
@@ -54,6 +55,7 @@ import net.jcip.annotations.ThreadSafe;
  * truststore-password = password123
  * keystore-path = /path/to/client.keystore
  * keystore-password = password123
+ * keystore-reload-interval = 30 minutes
  *   }
  * }
  * 
@@ -66,6 +68,7 @@ public class DefaultSslEngineFactory implements 
SslEngineFactory {
   private final SSLContext sslContext;
   private final String[] cipherSuites;
   private final boolean requireHostnameValidation;
+  private ReloadingKeyManagerFactory kmf;
 
   /** Builds a new instance from the driver configuration. */
   public DefaultSslEngineFactory(DriverContext driverContext) {
@@ -132,20 +135,8 @@ public class DefaultSslEngineFactory implements 
SslEngineFactory {
   }
 
  

(cassandra-java-driver) 03/03: Address PR feedback: reload-interval to use Optional internally and null in config, rather than using sentinel Duration.ZERO

2024-02-14 Thread absurdfarce
This is an automated email from the ASF dual-hosted git repository.

absurdfarce pushed a commit to branch 4.x
in repository https://gitbox.apache.org/repos/asf/cassandra-java-driver.git

commit ea2e475185b5863ef6eed347f57286d6a3bfd8a9
Author: Abe Ratnofsky 
AuthorDate: Fri Feb 2 14:56:22 2024 -0500

Address PR feedback: reload-interval to use Optional internally and null in 
config, rather than using sentinel Duration.ZERO
---
 .../internal/core/ssl/DefaultSslEngineFactory.java | 14 +--
 .../core/ssl/ReloadingKeyManagerFactory.java   | 29 +++---
 .../core/ssl/ReloadingKeyManagerFactoryTest.java   |  4 +--
 3 files changed, 27 insertions(+), 20 deletions(-)

diff --git 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
index adf23f8e8..bb95dc738 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
@@ -33,6 +33,7 @@ import java.security.KeyStore;
 import java.security.SecureRandom;
 import java.time.Duration;
 import java.util.List;
+import java.util.Optional;
 import javax.net.ssl.SSLContext;
 import javax.net.ssl.SSLEngine;
 import javax.net.ssl.SSLParameters;
@@ -153,14 +154,11 @@ public class DefaultSslEngineFactory implements 
SslEngineFactory {
   private ReloadingKeyManagerFactory 
buildReloadingKeyManagerFactory(DriverExecutionProfile config)
   throws Exception {
 Path keystorePath = 
Paths.get(config.getString(DefaultDriverOption.SSL_KEYSTORE_PATH));
-String password =
-config.isDefined(DefaultDriverOption.SSL_KEYSTORE_PASSWORD)
-? config.getString(DefaultDriverOption.SSL_KEYSTORE_PASSWORD)
-: null;
-Duration reloadInterval =
-config.isDefined(DefaultDriverOption.SSL_KEYSTORE_RELOAD_INTERVAL)
-? 
config.getDuration(DefaultDriverOption.SSL_KEYSTORE_RELOAD_INTERVAL)
-: Duration.ZERO;
+String password = 
config.getString(DefaultDriverOption.SSL_KEYSTORE_PASSWORD, null);
+Optional<Duration> reloadInterval =
+Optional.ofNullable(
+config.getDuration(DefaultDriverOption.SSL_KEYSTORE_RELOAD_INTERVAL, null));
+
 return ReloadingKeyManagerFactory.create(keystorePath, password, 
reloadInterval);
   }
 
diff --git 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
index 540ddfd79..8a9e11bb2 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
@@ -36,6 +36,7 @@ import java.security.cert.CertificateException;
 import java.security.cert.X509Certificate;
 import java.time.Duration;
 import java.util.Arrays;
+import java.util.Optional;
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.TimeUnit;
@@ -68,12 +69,12 @@ public class ReloadingKeyManagerFactory extends 
KeyManagerFactory implements Aut
*
* @param keystorePath the keystore file to reload
* @param keystorePassword the keystore password
-   * @param reloadInterval the duration between reload attempts. Set to {@link
-   * java.time.Duration#ZERO} to disable scheduled reloading.
+   * @param reloadInterval the duration between reload attempts. Set to {@link 
Optional#empty()} to
+   * disable scheduled reloading.
* @return
*/
-  public static ReloadingKeyManagerFactory create(
-  Path keystorePath, String keystorePassword, Duration reloadInterval)
+  static ReloadingKeyManagerFactory create(
+  Path keystorePath, String keystorePassword, Optional<Duration> reloadInterval)
   throws UnrecoverableKeyException, KeyStoreException, 
NoSuchAlgorithmException,
   CertificateException, IOException {
 KeyManagerFactory kmf = 
KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
@@ -103,14 +104,24 @@ public class ReloadingKeyManagerFactory extends 
KeyManagerFactory implements Aut
 this.spi = spi;
   }
 
-  private void start(Path keystorePath, String keystorePassword, Duration 
reloadInterval) {
+  private void start(
+  Path keystorePath, String keystorePassword, Optional<Duration> reloadInterval) {
 this.keystorePath = keystorePath;
 this.keystorePassword = keystorePassword;
 
 // Ensure that reload is called once synchronously, to make sure the file 
exists etc.
 reload();
 
-if (!reloadInterval.isZero()) {
+if (!reloadInterval.isPresent() || reloadInterval.get().isZero()) {
+  final String msg =
+  "KeyStore reloading is disabled. If your Cassandra cluster requires 
client certificates, "
+ 

(cassandra-java-driver) 02/03: PR feedback: avoid extra exception wrapping, provide thread naming, improve error messages, etc.

2024-02-14 Thread absurdfarce
This is an automated email from the ASF dual-hosted git repository.

absurdfarce pushed a commit to branch 4.x
in repository https://gitbox.apache.org/repos/asf/cassandra-java-driver.git

commit c7719aed14705b735571ecbfbda23d3b8506eb11
Author: Abe Ratnofsky 
AuthorDate: Tue Jan 23 16:09:35 2024 -0500

PR feedback: avoid extra exception wrapping, provide thread naming, improve 
error messages, etc.
---
 .../api/core/config/DefaultDriverOption.java   | 12 +++---
 .../internal/core/ssl/DefaultSslEngineFactory.java |  4 +-
 .../core/ssl/ReloadingKeyManagerFactory.java   | 44 ++
 3 files changed, 28 insertions(+), 32 deletions(-)

diff --git 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
index c10a8237c..afe16e968 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/api/core/config/DefaultDriverOption.java
@@ -255,12 +255,6 @@ public enum DefaultDriverOption implements DriverOption {
    * Value-type: {@link String}
    */
   SSL_KEYSTORE_PASSWORD("advanced.ssl-engine-factory.keystore-password"),
-  /**
-   * The duration between attempts to reload the keystore.
-   *
-   * Value-type: {@link java.time.Duration}
-   */
-  SSL_KEYSTORE_RELOAD_INTERVAL("advanced.ssl-engine-factory.keystore-reload-interval"),
   /**
    * The location of the truststore file.
    *
@@ -982,6 +976,12 @@ public enum DefaultDriverOption implements DriverOption {
    * Value-type: boolean
    */
   METRICS_GENERATE_AGGREGABLE_HISTOGRAMS("advanced.metrics.histograms.generate-aggregable"),
+  /**
+   * The duration between attempts to reload the keystore.
+   *
+   * Value-type: {@link java.time.Duration}
+   */
+  SSL_KEYSTORE_RELOAD_INTERVAL("advanced.ssl-engine-factory.keystore-reload-interval"),
   ;
 
   private final String path;
diff --git 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
index 55a6e9c7d..adf23f8e8 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/DefaultSslEngineFactory.java
@@ -150,8 +150,8 @@ public class DefaultSslEngineFactory implements SslEngineFactory {
     }
   }
 
-  private ReloadingKeyManagerFactory buildReloadingKeyManagerFactory(
-      DriverExecutionProfile config) {
+  private ReloadingKeyManagerFactory buildReloadingKeyManagerFactory(DriverExecutionProfile config)
+      throws Exception {
     Path keystorePath = Paths.get(config.getString(DefaultDriverOption.SSL_KEYSTORE_PATH));
     String password =
         config.isDefined(DefaultDriverOption.SSL_KEYSTORE_PASSWORD)
diff --git 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
index 9aaee7011..540ddfd79 100644
--- 
a/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
+++ 
b/core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
@@ -73,26 +73,17 @@ public class ReloadingKeyManagerFactory extends KeyManagerFactory implements AutoCloseable
    * @return
    */
   public static ReloadingKeyManagerFactory create(
-      Path keystorePath, String keystorePassword, Duration reloadInterval) {
-    KeyManagerFactory kmf;
-    try {
-      kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
-    } catch (NoSuchAlgorithmException e) {
-      throw new RuntimeException(e);
-    }
+      Path keystorePath, String keystorePassword, Duration reloadInterval)
+      throws UnrecoverableKeyException, KeyStoreException, NoSuchAlgorithmException,
+          CertificateException, IOException {
+    KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
 
     KeyStore ks;
     try (InputStream ksf = Files.newInputStream(keystorePath)) {
       ks = KeyStore.getInstance(KEYSTORE_TYPE);
       ks.load(ksf, keystorePassword.toCharArray());
-    } catch (IOException | CertificateException | KeyStoreException | NoSuchAlgorithmException e) {
-      throw new RuntimeException(e);
-    }
-    try {
-      kmf.init(ks, keystorePassword.toCharArray());
-    } catch (KeyStoreException | NoSuchAlgorithmException | UnrecoverableKeyException e) {
-      throw new RuntimeException(e);
     }
+    kmf.init(ks, keystorePassword.toCharArray());
 
     ReloadingKeyManagerFactory reloadingKeyManagerFactory = new ReloadingKeyManagerFactory(kmf);
     reloadingKeyManagerFactory.start(keystorePath, keystorePassword, reloadInterval);
@@ -115,24 +106,26 @@ public class 

(cassandra-java-driver) branch 4.x updated (8d5849cb3 -> ea2e47518)

2024-02-14 Thread absurdfarce
This is an automated email from the ASF dual-hosted git repository.

absurdfarce pushed a change to branch 4.x
in repository https://gitbox.apache.org/repos/asf/cassandra-java-driver.git


from 8d5849cb3 Remove ASL header from test resource files (that was 
breaking integration tests)
 new 8e7323210 CASSANDRA-19180: Support reloading keystore in 
cassandra-java-driver
 new c7719aed1 PR feedback: avoid extra exception wrapping, provide thread 
naming, improve error messages, etc.
 new ea2e47518 Address PR feedback: reload-interval to use Optional 
internally and null in config, rather than using sentinel Duration.ZERO

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../api/core/config/DefaultDriverOption.java   |   6 +
 .../driver/api/core/config/TypedDriverOption.java  |   6 +
 .../internal/core/ssl/DefaultSslEngineFactory.java |  33 +--
 .../core/ssl/ReloadingKeyManagerFactory.java   | 264 
 core/src/main/resources/reference.conf |   7 +
 .../core/ssl/ReloadingKeyManagerFactoryTest.java   | 270 +
 .../ReloadingKeyManagerFactoryTest/README.md   |  39 +++
 .../certs/client-alternate.keystore| Bin 0 -> 2467 bytes
 .../certs/client-original.keystore | Bin 0 -> 2457 bytes
 .../certs/client.truststore| Bin 0 -> 1002 bytes
 .../certs/server.keystore  | Bin 0 -> 2407 bytes
 .../certs/server.truststore| Bin 0 -> 1890 bytes
 manual/core/ssl/README.md  |  10 +-
 upgrade_guide/README.md|  11 +
 14 files changed, 630 insertions(+), 16 deletions(-)
 create mode 100644 
core/src/main/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactory.java
 create mode 100644 
core/src/test/java/com/datastax/oss/driver/internal/core/ssl/ReloadingKeyManagerFactoryTest.java
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/README.md
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/certs/client-alternate.keystore
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/certs/client-original.keystore
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/certs/client.truststore
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/certs/server.keystore
 create mode 100644 
core/src/test/resources/ReloadingKeyManagerFactoryTest/certs/server.truststore
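
For context, a minimal sketch of how an application could enable the new keystore reloading through the driver's programmatic configuration (option names come from the commits above; the keystore path, password, and 30-minute interval are illustrative assumptions, not project defaults):
{code:java}
import java.time.Duration;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;

public class KeystoreReloadSketch {
  public static void main(String[] args) {
    DriverConfigLoader loader =
        DriverConfigLoader.programmaticBuilder()
            .withString(DefaultDriverOption.SSL_ENGINE_FACTORY_CLASS, "DefaultSslEngineFactory")
            .withString(DefaultDriverOption.SSL_KEYSTORE_PATH, "/path/to/client.keystore")
            .withString(DefaultDriverOption.SSL_KEYSTORE_PASSWORD, "changeit")
            // Re-read the keystore periodically so rotated client certificates
            // are picked up without restarting the application; per the change
            // above, omitting this option leaves scheduled reloading disabled.
            .withDuration(DefaultDriverOption.SSL_KEYSTORE_RELOAD_INTERVAL, Duration.ofMinutes(30))
            .build();

    try (CqlSession session = CqlSession.builder().withConfigLoader(loader).build()) {
      System.out.println(session.execute("SELECT release_version FROM system.local").one());
    }
  }
}
{code}
The equivalent file-based setting is advanced.ssl-engine-factory.keystore-reload-interval in application.conf, per the reference.conf entry added above.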


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19180: Support reloading keystore in cassandra-java-driver [cassandra-java-driver]

2024-02-14 Thread via GitHub


absurdfarce merged PR #1907:
URL: https://github.com/apache/cassandra-java-driver/pull/1907


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19399) Zombie repair session blocks further incremental repairs due to SSTable lock

2024-02-14 Thread Sebastian Marsching (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817479#comment-17817479
 ] 

Sebastian Marsching commented on CASSANDRA-19399:
-

I considered that it might be the same as CASSANDRA-19182, but I ran 
{{sstablemetadata}} for all SSTables in the affected keyspace and none of them 
had the pending repair flag set, so it seems to be something else.

> Zombie repair session blocks further incremental repairs due to SSTable lock
> 
>
> Key: CASSANDRA-19399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19399
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Sebastian Marsching
>Priority: Normal
> Fix For: 4.1.x
>
> Attachments: system.log.txt
>
>
> We have experienced the following bug in C* 4.1.3 at least twice:
> Sometimes, a failed incremental repair session keeps future incremental repair 
> sessions from running. These future sessions fail with the following message 
> in the log file:
> {code:java}
> PendingAntiCompaction.java:210 - Prepare phase for incremental repair session 
> c8b65260-cb53-11ee-a219-3d5d7e5cdec7 has failed because it encountered 
> intersecting sstables belonging to another incremental repair session 
> (02d7c1a0-cb3a-11ee-aa89-a1b2ad548382). This is caused by starting an 
> incremental repair session before a previous one has completed. Check 
> nodetool repair_admin for hung sessions and fix them. {code}
> This happens, even though there are no active repair sessions on any node 
> ({{{}nodetool repair_admin list{}}} prints {{{}no sessions{}}}).
> When running {{{}nodetool repair_admin list --all{}}}, the offending session 
> is listed as failed:
> {code:java}
> id                                   | state     | last activity | coordinator           | participants | participants_wp
> 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 | FAILED    | 5454 (s)      | /192.168.108.235:7000 | 192.168.108.224,192.168.108.96,192.168.108.97,192.168.108.225,192.168.108.226,192.168.108.98,192.168.108.99,192.168.108.227,192.168.108.100,192.168.108.228,192.168.108.229,192.168.108.101,192.168.108.230,192.168.108.102,192.168.108.103,192.168.108.231,192.168.108.221,192.168.108.94,192.168.108.222,192.168.108.95,192.168.108.223,192.168.108.241,192.168.108.242,192.168.108.243,192.168.108.244,192.168.108.104,192.168.108.105,192.168.108.235 |
> {code}
> This still happens after canceling the repair session, regardless of whether 
> it is canceled on the coordinator node or on all nodes (using 
> {{{}--force{}}}).
> I attached all lines from the C* system log that refer to the offending 
> session. It seems like another repair session was started while this session 
> was still running (possibly due to a bug in Cassandra Reaper); the session 
> failed right after that but still seems to hold a lock on some of 
> the SSTables.
> The problem can be resolved by restarting the nodes affected by this (which 
> typically means doing a rolling restart of the whole cluster), but this is 
> obviously not ideal...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19399) Zombie repair session blocks further incremental repairs due to SSTable lock

2024-02-14 Thread Andy Tolbert (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817476#comment-17817476
 ] 

Andy Tolbert commented on CASSANDRA-19399:
--

Could this be the same as [CASSANDRA-19182]?

> Zombie repair session blocks further incremental repairs due to SSTable lock
> 
>
> Key: CASSANDRA-19399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19399
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Sebastian Marsching
>Priority: Normal
> Fix For: 4.1.x
>
> Attachments: system.log.txt
>
>
> We have experienced the following bug in C* 4.1.3 at least twice:
> Sometimes, a failed incremental repair session keeps future incremental repair 
> sessions from running. These future sessions fail with the following message 
> in the log file:
> {code:java}
> PendingAntiCompaction.java:210 - Prepare phase for incremental repair session 
> c8b65260-cb53-11ee-a219-3d5d7e5cdec7 has failed because it encountered 
> intersecting sstables belonging to another incremental repair session 
> (02d7c1a0-cb3a-11ee-aa89-a1b2ad548382). This is caused by starting an 
> incremental repair session before a previous one has completed. Check 
> nodetool repair_admin for hung sessions and fix them. {code}
> This happens, even though there are no active repair sessions on any node 
> ({{{}nodetool repair_admin list{}}} prints {{{}no sessions{}}}).
> When running {{{}nodetool repair_admin list --all{}}}, the offending session 
> is listed as failed:
> {code:java}
> id                                   | state     | last activity | coordinator           | participants | participants_wp
> 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 | FAILED    | 5454 (s)      | /192.168.108.235:7000 | 192.168.108.224,192.168.108.96,192.168.108.97,192.168.108.225,192.168.108.226,192.168.108.98,192.168.108.99,192.168.108.227,192.168.108.100,192.168.108.228,192.168.108.229,192.168.108.101,192.168.108.230,192.168.108.102,192.168.108.103,192.168.108.231,192.168.108.221,192.168.108.94,192.168.108.222,192.168.108.95,192.168.108.223,192.168.108.241,192.168.108.242,192.168.108.243,192.168.108.244,192.168.108.104,192.168.108.105,192.168.108.235 |
> {code}
> This still happens after canceling the repair session, regardless of whether 
> it is canceled on the coordinator node or on all nodes (using 
> {{{}--force{}}}).
> I attached all lines from the C* system log that refer to the offending 
> session. It seems like another repair session was started while this session 
> was still running (possibly due to a bug in Cassandra Reaper); the session 
> failed right after that but still seems to hold a lock on some of 
> the SSTables.
> The problem can be resolved by restarting the nodes affected by this (which 
> typically means doing a rolling restart of the whole cluster), but this is 
> obviously not ideal...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19399) Zombie repair session blocks further incremental repairs due to SSTable lock

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817468#comment-17817468
 ] 

Brandon Williams commented on CASSANDRA-19399:
--

/cc [~dcapwell]

> Zombie repair session blocks further incremental repairs due to SSTable lock
> 
>
> Key: CASSANDRA-19399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19399
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Sebastian Marsching
>Priority: Normal
> Fix For: 4.1.x
>
> Attachments: system.log.txt
>
>
> We have experienced the following bug in C* 4.1.3 at least twice:
> Sometimes, a failed incremental repair session keeps future incremental repair 
> sessions from running. These future sessions fail with the following message 
> in the log file:
> {code:java}
> PendingAntiCompaction.java:210 - Prepare phase for incremental repair session 
> c8b65260-cb53-11ee-a219-3d5d7e5cdec7 has failed because it encountered 
> intersecting sstables belonging to another incremental repair session 
> (02d7c1a0-cb3a-11ee-aa89-a1b2ad548382). This is caused by starting an 
> incremental repair session before a previous one has completed. Check 
> nodetool repair_admin for hung sessions and fix them. {code}
> This happens, even though there are no active repair sessions on any node 
> ({{{}nodetool repair_admin list{}}} prints {{{}no sessions{}}}).
> When running {{{}nodetool repair_admin list --all{}}}, the offending session 
> is listed as failed:
> {code:java}
> id                                   | state     | last activity | coordinator           | participants | participants_wp
> 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 | FAILED    | 5454 (s)      | /192.168.108.235:7000 | 192.168.108.224,192.168.108.96,192.168.108.97,192.168.108.225,192.168.108.226,192.168.108.98,192.168.108.99,192.168.108.227,192.168.108.100,192.168.108.228,192.168.108.229,192.168.108.101,192.168.108.230,192.168.108.102,192.168.108.103,192.168.108.231,192.168.108.221,192.168.108.94,192.168.108.222,192.168.108.95,192.168.108.223,192.168.108.241,192.168.108.242,192.168.108.243,192.168.108.244,192.168.108.104,192.168.108.105,192.168.108.235 |
> {code}
> This still happens after canceling the repair session, regardless of whether 
> it is canceled on the coordinator node or on all nodes (using 
> {{{}--force{}}}).
> I attached all lines from the C* system log that refer to the offending 
> session. It seems like another repair session was started while this session 
> was still running (possibly due to a bug in Cassandra Reaper); the session 
> failed right after that but still seems to hold a lock on some of 
> the SSTables.
> The problem can be resolved by restarting the nodes affected by this (which 
> typically means doing a rolling restart of the whole cluster), but this is 
> obviously not ideal...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19399) Zombie repair session blocks further incremental repairs due to SSTable lock

2024-02-14 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19399:
-
 Bug Category: Parent values: Degradation(12984)Level 1 values: Resource 
Management(12995)
   Complexity: Normal
Discovered By: User Report
Fix Version/s: 4.1.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Zombie repair session blocks further incremental repairs due to SSTable lock
> 
>
> Key: CASSANDRA-19399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19399
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Sebastian Marsching
>Priority: Normal
> Fix For: 4.1.x
>
> Attachments: system.log.txt
>
>
> We have experienced the following bug in C* 4.1.3 at least twice:
> Sometimes, a failed incremental repair session keeps future incremental repair 
> sessions from running. These future sessions fail with the following message 
> in the log file:
> {code:java}
> PendingAntiCompaction.java:210 - Prepare phase for incremental repair session 
> c8b65260-cb53-11ee-a219-3d5d7e5cdec7 has failed because it encountered 
> intersecting sstables belonging to another incremental repair session 
> (02d7c1a0-cb3a-11ee-aa89-a1b2ad548382). This is caused by starting an 
> incremental repair session before a previous one has completed. Check 
> nodetool repair_admin for hung sessions and fix them. {code}
> This happens, even though there are no active repair sessions on any node 
> ({{{}nodetool repair_admin list{}}} prints {{{}no sessions{}}}).
> When running {{{}nodetool repair_admin list --all{}}}, the offending session 
> is listed as failed:
> {code:java}
> id                                   | state     | last activity | coordinator           | participants | participants_wp
> 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 | FAILED    | 5454 (s)      | /192.168.108.235:7000 | 192.168.108.224,192.168.108.96,192.168.108.97,192.168.108.225,192.168.108.226,192.168.108.98,192.168.108.99,192.168.108.227,192.168.108.100,192.168.108.228,192.168.108.229,192.168.108.101,192.168.108.230,192.168.108.102,192.168.108.103,192.168.108.231,192.168.108.221,192.168.108.94,192.168.108.222,192.168.108.95,192.168.108.223,192.168.108.241,192.168.108.242,192.168.108.243,192.168.108.244,192.168.108.104,192.168.108.105,192.168.108.235 |
> {code}
> This still happens after canceling the repair session, regardless of whether 
> it is canceled on the coordinator node or on all nodes (using 
> {{{}--force{}}}).
> I attached all lines from the C* system log that refer to the offending 
> session. It seems like another repair session was started while this session 
> was still running (possibly due to a bug in Cassandra Reaper), but the 
> session was failed right after that but still seems to hold a lock on some of 
> the SSTables.
> The problem can be resolved by restarting the nodes affected by this (which 
> typically means doing a rolling restart of the whole cluster), but this is 
> obviously not ideal...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19399) Zombie repair session blocks further incremental repairs due to SSTable lock

2024-02-14 Thread Sebastian Marsching (Jira)
Sebastian Marsching created CASSANDRA-19399:
---

 Summary: Zombie repair session blocks further incremental repairs 
due to SSTable lock
 Key: CASSANDRA-19399
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19399
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Repair
Reporter: Sebastian Marsching
 Attachments: system.log.txt

We have experienced the following bug in C* 4.1.3 at least twice:

Sometimes, a failed incremental repair session keeps future incremental repair 
sessions from running. These future sessions fail with the following message in 
the log file:
{code:java}
PendingAntiCompaction.java:210 - Prepare phase for incremental repair session 
c8b65260-cb53-11ee-a219-3d5d7e5cdec7 has failed because it encountered 
intersecting sstables belonging to another incremental repair session 
(02d7c1a0-cb3a-11ee-aa89-a1b2ad548382). This is caused by starting an 
incremental repair session before a previous one has completed. Check nodetool 
repair_admin for hung sessions and fix them. {code}
This happens, even though there are no active repair sessions on any node 
({{{}nodetool repair_admin list{}}} prints {{{}no sessions{}}}).

When running {{{}nodetool repair_admin list --all{}}}, the offending session is 
listed as failed:
{code:java}
id                                   | state     | last activity | coordinator           | participants | participants_wp
02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 | FAILED    | 5454 (s)      | /192.168.108.235:7000 | 192.168.108.224,192.168.108.96,192.168.108.97,192.168.108.225,192.168.108.226,192.168.108.98,192.168.108.99,192.168.108.227,192.168.108.100,192.168.108.228,192.168.108.229,192.168.108.101,192.168.108.230,192.168.108.102,192.168.108.103,192.168.108.231,192.168.108.221,192.168.108.94,192.168.108.222,192.168.108.95,192.168.108.223,192.168.108.241,192.168.108.242,192.168.108.243,192.168.108.244,192.168.108.104,192.168.108.105,192.168.108.235 |
{code}
This still happens after canceling the repair session, regardless of whether it 
is canceled on the coordinator node or on all nodes (using {{{}--force{}}}).

I attached all lines from the C* system log that refer to the offending 
session. It seems like another repair session was started while this session 
was still running (possibly due to a bug in Cassandra Reaper); the session 
failed right after that but still seems to hold a lock on some of the 
SSTables.

The problem can be resolved by restarting the nodes affected by this (which 
typically means doing a rolling restart of the whole cluster), but this is 
obviously not ideal...
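
For operators hitting this, a sketch of the inspection and cancellation flow attempted above (flag spellings assume the stock 4.1 nodetool; verify with nodetool help repair_admin):
{code}
# List all incremental repair sessions, including finished and failed ones
nodetool repair_admin list --all

# Try to cancel the stuck session on the coordinator
nodetool repair_admin cancel --session 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382

# If SSTables remain locked, force-cancel on every node (per this report,
# that did not release the lock here, and a rolling restart was needed)
nodetool repair_admin cancel --session 02d7c1a0-cb3a-11ee-aa89-a1b2ad548382 --force
{code}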



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-site updated (2d778def9 -> 95d5d1e87)

2024-02-14 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 2d778def9 generate docs for aa8a03c7
 add c4b35db18 Minor release 4.1.4
 add 95d5d1e87 generate docs for c4b35db1

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (2d778def9)
\
 N -- N -- N   refs/heads/asf-site (95d5d1e87)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 content/_/download.html|   8 +-
 .../managing/configuration/cass_yaml_file.html |   3 +-
 .../managing/configuration/cass_yaml_file.html |   3 +-
 .../5.1/cassandra/managing/operating/metrics.html  | 180 -
 .../managing/tools/nodetool/clientstats.html   |   8 +-
 .../managing/tools/nodetool/reconfigurecms.html|  11 +-
 .../managing/configuration/cass_yaml_file.html |   3 +-
 .../managing/configuration/cass_yaml_file.html |   3 +-
 .../cassandra/managing/operating/metrics.html  | 180 -
 .../managing/tools/nodetool/clientstats.html   |   8 +-
 .../managing/tools/nodetool/reconfigurecms.html|  11 +-
 content/search-index.js|   2 +-
 .../source/modules/ROOT/pages/download.adoc|   8 +-
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 14 files changed, 387 insertions(+), 41 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19397) Remove all code around native_transport_port_ssl

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19397:
--
Test and Documentation Plan: CI
 Status: Patch Available  (was: In Progress)

> Remove all code around native_transport_port_ssl
> 
>
> Key: CASSANDRA-19397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19397
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We deprecated native_transport_port_ssl in CASSANDRA-19392 and said we would 
> remove it next. This ticket covers that removal. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19391) Flush metadata snapshot table on every write

2024-02-14 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817452#comment-17817452
 ] 

Marcus Eriksson edited comment on CASSANDRA-19391 at 2/14/24 5:09 PM:
--

flushing showed that we couldn't really read the metadata_snapshots sstables 
due to the reversed longtoken localpartitioner we added in CASSANDRA-19189, so 
here we add a reverse ordered partitioner (for long keys) which calculates 
tokens by Long.MAX_VALUE - key.

CI a bit shaky, but looks like unrelated failures, will rerun (includes both 
CASSANDRA-19390 and CASSANDRA-19391)

https://github.com/apache/cassandra/pull/3104


was (Author: krummas):
flushing showed that we couldn't really read the metadata_snapshots sstables 
due to the reversed longtoken localpartitioner we added in CASSANDRA-19189, so 
here we add a reverse ordered partitioner (for long keys) which calculates 
tokens by Long.MAX_VALUE - key.

CI a bit shaky, but looks like unrelated failures, will rerun (includes both 
CASSANDRA-19390 and CASSANDRA-19391)

> Flush metadata snapshot table on every write
> 
>
> Key: CASSANDRA-19391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> We depend on the latest snapshot when starting up, flushing avoids gaps 
> between latest snapshot and the most recent local log entry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19391) Flush metadata snapshot table on every write

2024-02-14 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817452#comment-17817452
 ] 

Marcus Eriksson edited comment on CASSANDRA-19391 at 2/14/24 5:06 PM:
--

flushing showed that we couldn't really read the metadata_snapshots sstables 
due to the reversed longtoken localpartitioner we added in CASSANDRA-19189, so 
here we add a reverse ordered partitioner (for long keys) which calculates 
tokens by Long.MAX_VALUE - key.

CI a bit shaky, but looks like unrelated failures, will rerun (includes both 
CASSANDRA-19390 and CASSANDRA-19391)


was (Author: krummas):
flushing showed that we couldn't really read the metadata_snapshots sstables 
due to the reversed longtoken partitioner we added in CASSANDRA-19189, so here 
we add a reverse ordered partitioner (for long keys) which calculates tokens by 
Long.MAX_VALUE - key.

CI a bit shaky, but looks like unrelated failures, will rerun (includes both 
CASSANDRA-19390 and CASSANDRA-19391)

> Flush metadata snapshot table on every write
> 
>
> Key: CASSANDRA-19391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> We depend on the latest snapshot when starting up, flushing avoids gaps 
> between latest snapshot and the most recent local log entry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19391) Flush metadata snapshot table on every write

2024-02-14 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817452#comment-17817452
 ] 

Marcus Eriksson commented on CASSANDRA-19391:
-

flushing showed that we couldn't really read the metadata_snapshots sstables 
due to the reversed longtoken partitioner we added in CASSANDRA-19189, so here 
we add a reverse ordered partitioner (for long keys) which calculates tokens by 
Long.MAX_VALUE - key.

CI a bit shaky, but looks like unrelated failures, will rerun (includes both 
CASSANDRA-19390 and CASSANDRA-19391)
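
A minimal sketch of the ordering trick described above (class and field names are illustrative, not the actual patch; assumes keys are non-negative longs such as epochs):
{code:java}
/** Illustrative only: token(k) = Long.MAX_VALUE - k reverses the natural key order. */
final class ReverseLongToken implements Comparable<ReverseLongToken> {
  final long token;

  ReverseLongToken(long key) {
    assert key >= 0 : "sketch assumes non-negative keys, e.g. epoch numbers";
    this.token = Long.MAX_VALUE - key;
  }

  @Override
  public int compareTo(ReverseLongToken other) {
    return Long.compare(token, other.token);
  }

  public static void main(String[] args) {
    // The higher key gets the lower token, so the newest snapshot sorts first.
    System.out.println(new ReverseLongToken(10).compareTo(new ReverseLongToken(3)) < 0); // true
  }
}
{code}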

> Flush metadata snapshot table on every write
> 
>
> Key: CASSANDRA-19391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> We depend on the latest snapshot when starting up, flushing avoids gaps 
> between latest snapshot and the most recent local log entry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19391) Flush metadata snapshot table on every write

2024-02-14 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19391:

Attachment: ci_summary.html
result_details.tar.gz

> Flush metadata snapshot table on every write
> 
>
> Key: CASSANDRA-19391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19391
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>
> We depend on the latest snapshot when starting up, flushing avoids gaps 
> between latest snapshot and the most recent local log entry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRASC-104) Relocate Sidecar common classes in vertx-client-shaded

2024-02-14 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRASC-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRASC-104:
---
  Fix Version/s: 1.0
Source Control Link: 
https://github.com/apache/cassandra-sidecar/commit/b5570109c19acaf91281fd7901041c0c2b1f3b6c
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Relocate Sidecar common classes in vertx-client-shaded
> --
>
> Key: CASSANDRASC-104
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-104
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 1.0
>
>
> It is desirable to relocate the common classes 
> {{org.apache.cassandra.sidecar.common.*}} in the {{vertx-client-shaded}} 
> subproject. The benefits are the following:
> - Better isolation of the shared classes when loading them in downstream 
> projects (i.e. Analytics)
> - Avoids having two classes loaded in the same classpath but with different 
> internal definitions (for example when annotations are relocated but the class 
> itself is not)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-sidecar) branch trunk updated: CASSANDRASC-104 Relocate Sidecar common classes in vertx-client-shaded

2024-02-14 Thread frankgh
This is an automated email from the ASF dual-hosted git repository.

frankgh pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git


The following commit(s) were added to refs/heads/trunk by this push:
 new b557010  CASSANDRASC-104 Relocate Sidecar common classes in 
vertx-client-shaded
b557010 is described below

commit b5570109c19acaf91281fd7901041c0c2b1f3b6c
Author: Francisco Guerrero 
AuthorDate: Mon Feb 12 21:13:23 2024 -0800

CASSANDRASC-104 Relocate Sidecar common classes in vertx-client-shaded

Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-104
---
 CHANGES.txt  | 1 +
 vertx-client-shaded/build.gradle | 1 +
 2 files changed, 2 insertions(+)

diff --git a/CHANGES.txt b/CHANGES.txt
index c12d3ca..e1ac034 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 1.0.0
 -
+ * Relocate Sidecar common classes in vertx-client-shaded (CASSANDRASC-104)
  * Automated yaml type binding for deserialization (CASSANDRASC-103)
  * Upgrade Vert.x version in Sidecar to 4.5 (CASSANDRASC-101)
  * Break restore job into stage and import phases and persist restore slice 
status on phase completion (CASSANDRASC-99)
diff --git a/vertx-client-shaded/build.gradle b/vertx-client-shaded/build.gradle
index 189a82a..24519e8 100644
--- a/vertx-client-shaded/build.gradle
+++ b/vertx-client-shaded/build.gradle
@@ -69,6 +69,7 @@ shadowJar {
 archiveClassifier.set('')
 // Our use of Jackson should be an implementation detail - shade 
everything so no matter what
 // version of Jackson is available in the classpath we don't break 
consumers of the client
+relocate 'org.apache.cassandra.sidecar.common', 
'o.a.c.sidecar.client.shaded.common'
 relocate 'com.fasterxml.jackson', 
'o.a.c.sidecar.client.shaded.com.fasterxml.jackson'
 relocate 'io.netty', 'o.a.c.sidecar.client.shaded.io.netty'
 relocate 'io.vertx', 'o.a.c.sidecar.client.shaded.io.vertx'
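
To make the effect of the new relocate rule concrete, a small sketch of the prefix rewrite the shadow plugin applies inside the shaded jar (SomeDto is a hypothetical class name used only for illustration):
{code:java}
public class RelocationSketch {
  public static void main(String[] args) {
    // The shadow plugin rewrites package prefixes in the shaded jar's classes;
    // the rule added above maps the sidecar common package like this:
    String original = "org.apache.cassandra.sidecar.common.data.SomeDto";
    String relocated = original.replaceFirst(
        "^org\\.apache\\.cassandra\\.sidecar\\.common",
        "o.a.c.sidecar.client.shaded.common");
    // Prints: o.a.c.sidecar.client.shaded.common.data.SomeDto
    System.out.println(relocated);
  }
}
{code}
This keeps a downstream project's own copy of org.apache.cassandra.sidecar.common.* from colliding with the classes bundled in the shaded client, which is the isolation goal described in CASSANDRASC-104.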


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRASC-104) Relocate Sidecar common classes in vertx-client-shaded

2024-02-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRASC-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817451#comment-17817451
 ] 

ASF subversion and git services commented on CASSANDRASC-104:
-

Commit b5570109c19acaf91281fd7901041c0c2b1f3b6c in cassandra-sidecar's branch 
refs/heads/trunk from Francisco Guerrero
[ https://gitbox.apache.org/repos/asf?p=cassandra-sidecar.git;h=b557010 ]

CASSANDRASC-104 Relocate Sidecar common classes in vertx-client-shaded

Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRASC-104


> Relocate Sidecar common classes in vertx-client-shaded
> --
>
> Key: CASSANDRASC-104
> URL: https://issues.apache.org/jira/browse/CASSANDRASC-104
> Project: Sidecar for Apache Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Labels: pull-request-available
>
> It is desirable to relocate the common classes 
> {{org.apache.cassandra.sidecar.common.*}} in the {{vertx-client-shaded}} 
> subproject. The benefits are the following:
> - Better isolation of the shared classes when loading them in downstream 
> projects (i.e. Analytics)
> - Avoids having two classes loaded in the same classpath but with different 
> internal definitions (for example when annotations are relocated but the class 
> itself is not)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id

2024-02-14 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817450#comment-17817450
 ] 

Marcus Eriksson commented on CASSANDRA-19390:
-

CI results are a bit shaky (a few timeouts etc.); I don't think any are related, but 
will rerun. Both 19390 and 19391 are in this run.

> Transformation.Kind should contain an explicit integer id
> -
>
> Key: CASSANDRA-19390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19390
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id

2024-02-14 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-19390:

Attachment: ci_summary.html
result_details.tar.gz

> Transformation.Kind should contain an explicit integer id
> -
>
> Key: CASSANDRA-19390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19390
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Transactional Cluster Metadata
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (367c839bb -> 95d5d1e87)

2024-02-14 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 367c839bb generate docs for aa8a03c7
 add c4b35db18 Minor release 4.1.4
 new 95d5d1e87 generate docs for c4b35db1

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (367c839bb)
\
 N -- N -- N   refs/heads/asf-staging (95d5d1e87)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/_/download.html|   8 
 .../managing/configuration/cass_yaml_file.html |   3 ++-
 .../managing/configuration/cass_yaml_file.html |   3 ++-
 .../managing/configuration/cass_yaml_file.html |   3 ++-
 .../managing/configuration/cass_yaml_file.html |   3 ++-
 content/search-index.js|   2 +-
 .../source/modules/ROOT/pages/download.adoc|   8 
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 8 files changed, 17 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19285 Fix flaky Host replacement tests and shrink tests [cassandra-analytics]

2024-02-14 Thread via GitHub


yifan-c closed pull request #39: CASSANDRA-19285 Fix flaky Host replacement 
tests and shrink tests
URL: https://github.com/apache/cassandra-analytics/pull/39


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch trunk updated: Minor release 4.1.4

2024-02-14 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c4b35db18 Minor release 4.1.4
c4b35db18 is described below

commit c4b35db1813a7f6b2e6e7021c42e2dde44e66b3b
Author: Brandon Williams 
AuthorDate: Wed Feb 14 10:29:13 2024 -0600

Minor release 4.1.4

ref: https://lists.apache.org/thread/r42ksoxt4kqfoxcok9r0pjy11w1lmd3l
---
 site-content/source/modules/ROOT/pages/download.adoc | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/site-content/source/modules/ROOT/pages/download.adoc 
b/site-content/source/modules/ROOT/pages/download.adoc
index b7d35ee2d..17ebab8fb 100644
--- a/site-content/source/modules/ROOT/pages/download.adoc
+++ b/site-content/source/modules/ROOT/pages/download.adoc
@@ -36,15 +36,15 @@ 
https://www.apache.org/dyn/closer.lua/cassandra/5.0-beta1/apache-cassandra-5.0-b
 [discrete]
  Apache Cassandra 4.1
 [discrete]
- Latest release on 2023-07-24
+ Latest release on 2024-02-14
 [discrete]
  Maintained until 5.2.0 release (~July 2025)
 
 [.btn.btn--alt]
-https://www.apache.org/dyn/closer.lua/cassandra/4.1.3/apache-cassandra-4.1.3-bin.tar.gz[4.1.3,window=blank]
+https://www.apache.org/dyn/closer.lua/cassandra/4.1.4/apache-cassandra-4.1.4-bin.tar.gz[4.1.4,window=blank]
 
-(https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-bin.tar.gz.asc[pgp,window=blank],
 
https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-bin.tar.gz.sha256[sha256,window=blank],
 
https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-bin.tar.gz.sha512[sha512,window=blank])
 +
-(https://www.apache.org/dyn/closer.lua/cassandra/4.1.3/apache-cassandra-4.1.3-src.tar.gz[source,window=blank]:
 
https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-src.tar.gz.asc[pgp,window=blank],
 
https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-src.tar.gz.sha256[sha256,window=blank],
 
https://downloads.apache.org/cassandra/4.1.3/apache-cassandra-4.1.3-src.tar.gz.sha512[sha512,window=blank])
+(https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-bin.tar.gz.asc[pgp,window=blank],
 
https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-bin.tar.gz.sha256[sha256,window=blank],
 
https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-bin.tar.gz.sha512[sha512,window=blank])
 +
+(https://www.apache.org/dyn/closer.lua/cassandra/4.1.4/apache-cassandra-4.1.4-src.tar.gz[source,window=blank]:
 
https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-src.tar.gz.asc[pgp,window=blank],
 
https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-src.tar.gz.sha256[sha256,window=blank],
 
https://downloads.apache.org/cassandra/4.1.4/apache-cassandra-4.1.4-src.tar.gz.sha512[sha512,window=blank])
 --
 
 [openblock, inline50 inline-top]


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



svn commit: r67347 - /release/cassandra/4.1.4/redhat/

2024-02-14 Thread brandonwilliams
Author: brandonwilliams
Date: Wed Feb 14 16:27:11 2024
New Revision: 67347

Log:
Apache Cassandra 4.1.4 redhat artifacts

Removed:
release/cassandra/4.1.4/redhat/


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



svn commit: r67346 - /release/cassandra/4.1.4/debian/

2024-02-14 Thread brandonwilliams
Author: brandonwilliams
Date: Wed Feb 14 16:23:48 2024
New Revision: 67346

Log:
Apache Cassandra 4.1.4 debian artifacts

Removed:
release/cassandra/4.1.4/debian/


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



svn commit: r67344 - /dev/cassandra/4.1.4/ /release/cassandra/4.1.4/

2024-02-14 Thread brandonwilliams
Author: brandonwilliams
Date: Wed Feb 14 16:20:38 2024
New Revision: 67344

Log:
Apache Cassandra 4.1.4 release

Added:
release/cassandra/4.1.4/
  - copied from r67343, dev/cassandra/4.1.4/
Removed:
dev/cassandra/4.1.4/


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) tag 4.1.4-tentative deleted (was 99d9faeef5)

2024-02-14 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a change to tag 4.1.4-tentative
in repository https://gitbox.apache.org/repos/asf/cassandra.git


*** WARNING: tag 4.1.4-tentative was deleted! ***

 was 99d9faeef5 Prepare debian changelog for 4.1.4

The revisions that were on this tag are still contained in
other references; therefore, this change does not discard any commits
from the repository.


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) annotated tag cassandra-4.1.4 created (now e7c2a5c1cb)

2024-02-14 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a change to annotated tag cassandra-4.1.4
in repository https://gitbox.apache.org/repos/asf/cassandra.git


  at e7c2a5c1cb (tag)
 tagging 99d9faeef57c9cf5240d11eac9db5b283e45a4f9 (commit)
 replaces cassandra-4.0.12
  by Brandon Williams
  on Wed Feb 14 10:20:25 2024 -0600

- Log -
Apache Cassandra 4.1.4 release
---

No new revisions were added by this update.


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading

2024-02-14 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19398:

Fix Version/s: 5.0.x

> Test Failure: 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
> --
>
> Key: CASSANDRA-19398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19398
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0.x
>
>
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0]
> {code:java}
> junit.framework.AssertionFailedError at 
> org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading

2024-02-14 Thread Ekaterina Dimitrova (Jira)
Ekaterina Dimitrova created CASSANDRA-19398:
---

 Summary: Test Failure: 
org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
 Key: CASSANDRA-19398
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19398
 Project: Cassandra
  Issue Type: Bug
Reporter: Ekaterina Dimitrova


[https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0]


{code:java}
junit.framework.AssertionFailedError at 
org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19395) Warn when native_transport_port_ssl is set

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-19395:
-

Assignee: Stefan Miklosovic

> Warn when native_transport_port_ssl is set
> --
>
> Key: CASSANDRA-19395
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19395
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Brandon Williams
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x
>
>
> In CASSANDRA-19392 this was deprecated; however, Stefan notes that if you set 
> it, it will still work in a single-node cluster because the peers table isn't 
> needed to distribute the information. This sounds like a recipe for "this 
> worked when we tested in development, but not in production", so it would be 
> good to warn users when this is set, to avoid future confusion.
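
A minimal sketch of what such a startup warning could look like (hypothetical class and names; the real check would live in the configuration validation path, e.g. DatabaseDescriptor, and this is not the actual patch):

{code:java}
public final class NativeTransportPortSslCheck
{
    // Hypothetical startup check: warn when the deprecated dual-port option is set.
    // The parameter mirrors the YAML option name; the message wording is illustrative.
    static void warnIfDeprecatedPortSet(Integer nativeTransportPortSsl)
    {
        if (nativeTransportPortSsl != null)
            System.err.println("WARN: native_transport_port_ssl is deprecated; " +
                               "configure TLS on the single native_transport_port instead.");
    }

    public static void main(String[] args)
    {
        warnIfDeprecatedPortSet(9142); // deprecated dual-port setup: emits the warning
        warnIfDeprecatedPortSet(null); // single-port setup: stays silent
    }
}
{code}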



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19397) Remove all code around native_transport_port_ssl

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19397:
--
Change Category: Code Clarity
 Complexity: Normal
  Fix Version/s: 5.x
 Status: Open  (was: Triage Needed)

> Remove all code around native_transport_port_ssl
> 
>
> Key: CASSANDRA-19397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19397
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>
> We deprecated native_transport_port_ssl in CASSANDRA-19392 and said we would 
> remove it next. This ticket tracks that removal. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19397) Remove all code around native_transport_port_ssl

2024-02-14 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-19397:
-

 Summary: Remove all code around native_transport_port_ssl
 Key: CASSANDRA-19397
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19397
 Project: Cassandra
  Issue Type: Task
  Components: Legacy/Core
Reporter: Stefan Miklosovic
Assignee: Stefan Miklosovic


We deprecated native_transport_port_ssl in CASSANDRA-19392 and said we would 
remove it next. This ticket tracks that removal. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18762) Repair triggers OOM with direct buffer memory

2024-02-14 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817421#comment-17817421
 ] 

Brad Schoening edited comment on CASSANDRA-18762 at 2/14/24 3:38 PM:
-

It seems setting -XX:MaxDirectMemorySize might be useful to prevent this.

In [Java 
17|https://docs.oracle.com/en/java/javase/17/docs/specs/man/java.html], the JVM 
picks something based on some opaque heuristic:
{quote}By default, the size (MaxDirectMemorySize) is set to 0, meaning that the 
JVM chooses the size for NIO direct-buffer allocations automatically.
{quote}
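
For example, a deployment could cap direct memory explicitly (a hedged sketch; the 8g value and the conf/jvm11-server.options location are assumptions to tune per heap size and workload, not a recommendation from this ticket):

{noformat}
# conf/jvm11-server.options -- cap NIO direct-buffer allocations explicitly
-XX:MaxDirectMemorySize=8g
{noformat}

An explicit cap does not fix the underlying growth, but it bounds how much physical RAM off-heap buffers can consume and makes the failure mode predictable.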


was (Author: bschoeni):
It seems setting -XX:MaxDirectMemorySize might be useful to prevent this.

> Repair triggers OOM with direct buffer memory
> -
>
> Key: CASSANDRA-18762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18762
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Brad Schoening
>Priority: Normal
>  Labels: OutOfMemoryError
> Attachments: Cluster-dm-metrics-1.PNG, 
> image-2023-12-06-15-28-05-459.png, image-2023-12-06-15-29-31-491.png, 
> image-2023-12-06-15-58-55-007.png
>
>
> We are seeing repeated failures of nodes with 16GB of heap on a VM with 32GB 
> of physical RAM due to direct memory.  This seems to be related to 
> CASSANDRA-15202, which moved Merkle trees off-heap in 4.0.  Using Cassandra 
> 4.0.6 with Java 11.
> {noformat}
> 2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from 
> /169.102.200.241:7000
> 2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.93.192.29:7000
> 2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from 
> /169.104.171.134:7000
> 2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.79.232.67:7000
> 2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 
> ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. 
> Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; 
> G1 Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; 
> Metaspace: 80411136 -> 80176528
> 2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error 
> letting the JVM handle the error:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:118)
> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
> at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
> at 
> org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
> at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
> at 
> org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
> at 
> org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
> at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> 

[jira] [Comment Edited] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817423#comment-17817423
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19394 at 2/14/24 3:25 PM:


What I am afraid of is that we are just making assumptions about how this is 
going to be used, and if you guys think that "this is an escape hatch not meant 
to be abused", well, good for you, but it is going to be misused / abused. 
People forget stuff etc ... these dumps will just be rotting there until that 
node is restarted again. 


was (Author: smiklosovic):
What I am afraid of is that we just have assumptions how this is going to be 
used and if you guys think that "this is escape hatch not meant to be abused", 
well, good for you, but this is going to be misused / abused. People forget 
stuff etc ... these dump will be just rotting there until that node is 
restarted again. 

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.
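
To illustrate the compression idea (a minimal sketch using only java.util.zip; the class and method names are illustrative, not existing Cassandra code):

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class DumpCodec
{
    // Server side: compress the metadata dump before returning it over JMX.
    public static byte[] compress(String dump) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes))
        {
            gzip.write(dump.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Client side (e.g. nodetool): decompress the payload for display.
    public static String decompress(byte[] payload) throws IOException
    {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(payload)))
        {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Round-trip demo with a stand-in payload.
        byte[] payload = compress("{\"epoch\": 42, \"members\": []}");
        System.out.println(decompress(payload));
    }
}
{code}

Since both ends use only the standard library, any client JVM can decompress what any server JVM produced.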



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817423#comment-17817423
 ] 

Stefan Miklosovic commented on CASSANDRA-19394:
---

What I am afraid of is that we are just making assumptions about how this is 
going to be used, and if you guys think that "this is an escape hatch not meant 
to be abused", well, good for you, but it is going to be misused / abused. 
People forget stuff etc ... these dumps will just be rotting there until that 
node is restarted again. 

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817422#comment-17817422
 ] 

Sam Tunnicliffe commented on CASSANDRA-19394:
-

A simple visual representation of the current metadata, with no ability or 
expectation to parse, round-trip, or pipe it into tooling, would be very 
useful right now for debugging and development, but that's a totally 
different use case from what the current 
{{CMSOperations::dumpClusterMetadata}} is for. 

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19393) nodetool: group CMS-related commands into one command

2024-02-14 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-19393:
---

Assignee: Sam Tunnicliffe  (was: n.v.harikrishna)

> nodetool: group CMS-related commands into one command
> -
>
> Key: CASSANDRA-19393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19393
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: n.v.harikrishna
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is to group all CMS-related commands under one 
> "nodetool cms" command, where the existing commands would become subcommands of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19393) nodetool: group CMS-related commands into one command

2024-02-14 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-19393:
---

Assignee: n.v.harikrishna  (was: Sam Tunnicliffe)

> nodetool: group CMS-related commands into one command
> -
>
> Key: CASSANDRA-19393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19393
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: n.v.harikrishna
>Assignee: n.v.harikrishna
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is to group all CMS-related commands under one 
> "nodetool cms" command, where the existing commands would become subcommands of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19392:
--
Status: Review In Progress  (was: Needs Committer)

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19392:
--
  Fix Version/s: 5.0-beta2
 5.1
 (was: 5.0-beta)
Source Control Link: 
https://github.com/apache/cassandra/commit/8b037a6c846402296a2984eb1fbbdd441bdece19
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta2, 5.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18762) Repair triggers OOM with direct buffer memory

2024-02-14 Thread Brad Schoening (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817421#comment-17817421
 ] 

Brad Schoening commented on CASSANDRA-18762:


It seems setting -XX:MaxDirectMemorySize might be useful to prevent this.
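
Direct-buffer pressure can also be watched from inside the JVM before it becomes fatal. A minimal monitoring sketch using the standard java.lang.management API (not Cassandra code):

{code:java}
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectBufferUsage
{
    public static void main(String[] args)
    {
        // The "direct" pool tracks memory reserved via ByteBuffer.allocateDirect(),
        // which is the allocation path used for off-heap Merkle trees.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
        {
            System.out.printf("%s: used=%d bytes, capacity=%d bytes, buffers=%d%n",
                              pool.getName(), pool.getMemoryUsed(),
                              pool.getTotalCapacity(), pool.getCount());
        }
    }
}
{code}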

> Repair triggers OOM with direct buffer memory
> -
>
> Key: CASSANDRA-18762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18762
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Brad Schoening
>Priority: Normal
>  Labels: OutOfMemoryError
> Attachments: Cluster-dm-metrics-1.PNG, 
> image-2023-12-06-15-28-05-459.png, image-2023-12-06-15-29-31-491.png, 
> image-2023-12-06-15-58-55-007.png
>
>
> We are seeing repeated failures of nodes with 16GB of heap on a VM with 32GB 
> of physical RAM due to direct memory.  This seems to be related to 
> CASSANDRA-15202, which moved Merkle trees off-heap in 4.0.  Using Cassandra 
> 4.0.6 with Java 11.
> {noformat}
> 2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from 
> /169.102.200.241:7000
> 2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.93.192.29:7000
> 2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from 
> /169.104.171.134:7000
> 2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 RepairSession.java:202 - [repair 
> #5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
> /169.79.232.67:7000
> 2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 
> ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. 
> Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; 
> G1 Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; 
> Metaspace: 80411136 -> 80176528
> 2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 
> ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error 
> letting the JVM handle the error:
> java.lang.OutOfMemoryError: Direct buffer memory
> at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:118)
> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
> at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
> at 
> org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
> at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
> at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
> at 
> org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
> at 
> org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
> at 
> org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
> at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
> at 
> org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
> at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834){noformat}
>  
> -XX:+AlwaysPreTouch
> -XX:+CrashOnOutOfMemoryError
> -XX:+ExitOnOutOfMemoryError
> -XX:+HeapDumpOnOutOfMemoryError
> -XX:+ParallelRefProcEnabled
> -XX:+PerfDisableSharedMem
> -XX:+ResizeTLAB
> -XX:+UseG1GC
> -XX:+UseNUMA
> -XX:+UseTLAB
> 

[jira] [Updated] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-19392:
--
Status: Ready to Commit  (was: Review In Progress)

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cassandra-5.0 updated (69f735d61f -> 8b037a6c84)

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch cassandra-5.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 69f735d61f Update packaging shell includes for j17
 add 8b037a6c84 Deprecate native_transport_port_ssl

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt  |  1 +
 NEWS.txt |  4 
 conf/cassandra.yaml  |  1 +
 src/java/org/apache/cassandra/config/Config.java |  2 ++
 .../org/apache/cassandra/config/DatabaseDescriptor.java  | 16 
 5 files changed, 20 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) 01/01: Merge branch 'cassandra-5.0' into trunk

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 8bdf2615bcca6eacc7fd9debc7a68a917048df83
Merge: 3acec3c28e 8b037a6c84
Author: Stefan Miklosovic 
AuthorDate: Wed Feb 14 15:53:38 2024 +0100

Merge branch 'cassandra-5.0' into trunk

 CHANGES.txt  |  1 +
 NEWS.txt |  6 ++
 conf/cassandra.yaml  |  1 +
 src/java/org/apache/cassandra/config/Config.java |  2 ++
 .../org/apache/cassandra/config/DatabaseDescriptor.java  | 16 
 5 files changed, 22 insertions(+), 4 deletions(-)

diff --cc CHANGES.txt
index d73539808e,30413804a5..d470d8f813
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -5.0-beta2
 +5.1
 + * Make nodetool reconfigurecms sync by default and add --cancel to be able 
to cancel ongoing reconfigurations (CASSANDRA-19216)
 + * Expose auth mode in system_views.clients, nodetool clientstats, metrics 
(CASSANDRA-19366)
 + * Remove sealed_periods and last_sealed_period tables (CASSANDRA-19189)
 + * Improve setup and initialisation of LocalLog/LogSpec (CASSANDRA-19271)
 + * Refactor structure of caching metrics and expose auth cache metrics via 
JMX (CASSANDRA-17062)
 + * Allow CQL client certificate authentication to work without sending an 
AUTHENTICATE request (CASSANDRA-18857)
 + * Extend nodetool tpstats and system_views.thread_pools with detailed pool 
parameters (CASSANDRA-19289) 
 + * Remove dependency on Sigar in favor of OSHI (CASSANDRA-16565)
 + * Simplify the bind marker and Term logic (CASSANDRA-18813)
 + * Limit cassandra startup to supported JDKs, allow higher JDKs by setting 
CASSANDRA_JDK_UNSUPPORTED (CASSANDRA-18688)
 + * Standardize nodetool tablestats formatting of data units (CASSANDRA-19104)
 + * Make nodetool tablestats use number of significant digits for time and 
average values consistently (CASSANDRA-19015)
 + * Upgrade jackson to 2.15.3 and snakeyaml to 2.1 (CASSANDRA-18875)
 + * Transactional Cluster Metadata [CEP-21] (CASSANDRA-18330)
 + * Add ELAPSED command to cqlsh (CASSANDRA-18861)
 + * Add the ability to disable bulk loading of SSTables (CASSANDRA-18781)
 + * Clean up obsolete functions and simplify cql_version handling in cqlsh 
(CASSANDRA-18787)
 +Merged from 5.0:
+  * Deprecate native_transport_port_ssl (CASSANDRA-19392)
   * Update packaging shell includes (CASSANDRA-19283)
   * Fix data corruption in VectorCodec when using heap buffers 
(CASSANDRA-19167)
   * Avoid over-skipping of key iterators from static column indexes during 
mixed intersections (CASSANDRA-19278)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch trunk updated (3acec3c28e -> 8bdf2615bc)

2024-02-14 Thread smiklosovic
This is an automated email from the ASF dual-hosted git repository.

smiklosovic pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


from 3acec3c28e Make nodetool reconfigurecms sync by default and add 
--cancel to be able to cancel ongoing reconfigurations
 add 8b037a6c84 Deprecate native_transport_port_ssl
 new 8bdf2615bc Merge branch 'cassandra-5.0' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt  |  1 +
 NEWS.txt |  6 ++
 conf/cassandra.yaml  |  1 +
 src/java/org/apache/cassandra/config/Config.java |  2 ++
 .../org/apache/cassandra/config/DatabaseDescriptor.java  | 16 
 5 files changed, 22 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-18762) Repair triggers OOM with direct buffer memory

2024-02-14 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening updated CASSANDRA-18762:
---
Description: 
We are seeing repeated failures of nodes with 16GB of heap on a VM with 32GB of 
physical RAM due to direct memory with Java 11.  This seems to be related to 
CASSANDRA-15202, which moved Merkle trees off-heap in 4.0.  Using Cassandra 
4.0.6.
{noformat}
2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from 
/169.102.200.241:7000
2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
/169.93.192.29:7000
2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from 
/169.104.171.134:7000
2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
/169.79.232.67:7000
2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 
ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. 
Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; G1 
Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; Metaspace: 
80411136 -> 80176528
2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error letting 
the JVM handle the error:
java.lang.OutOfMemoryError: Direct buffer memory
at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:118)
at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
at org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
at 
org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
at 
org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
at 
org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
at 
org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
at 
org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
at 
org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834){noformat}
 
-XX:+AlwaysPreTouch
-XX:+CrashOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+ResizeTLAB
-XX:+UseG1GC
-XX:+UseNUMA
-XX:+UseTLAB
-XX:+UseThreadPriorities
-XX:-UseBiasedLocking
-XX:CompileCommandFile=/opt/nosql/clusters/cassandra-101/conf/hotspot_compiler
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:G1ReservePercent=20
-XX:HeapDumpPath=/opt/nosql/data/cluster_101/cassandra-1691623098-pid2804737.hprof
-XX:InitiatingHeapOccupancyPercent=70
-XX:MaxGCPauseMillis=200
-XX:StringTableSize=60013
-Xlog:gc*:file=/opt/nosql/clusters/cassandra-101/logs/gc.log:time,uptime:filecount=10,filesize=10485760
-Xms16G
-Xmx16G
-Xss256k
 
From our Prometheus metrics, the behavior shows the direct buffer memory 
ramping up until it reaches the max and then causes an OOM.  It would appear 
that direct memory is never released by the JVM until it's exhausted.
 
!Cluster-dm-metrics.PNG!

An Eclipse Memory Analyzer class histogram:
||Class Name||Objects||Shallow Heap||Retained Heap||

[jira] [Updated] (CASSANDRA-18762) Repair triggers OOM with direct buffer memory

2024-02-14 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening updated CASSANDRA-18762:
---
Description: 
We are seeing repeated failures of nodes with 16GB of heap on a VM with 32GB of 
physical RAM due to direct memory.  This seems to be related to CASSANDRA-15202 
which moved Merkle trees off-heap in 4.0.  Using Cassandra 4.0.6 with Java 11.
{noformat}
2023-08-09 04:30:57,470 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e55a3b0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_a from 
/169.102.200.241:7000
2023-08-09 04:30:57,567 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e0d2900-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
/169.93.192.29:7000
2023-08-09 04:30:57,568 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e1dcad0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_c from 
/169.104.171.134:7000
2023-08-09 04:30:57,591 [INFO ] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 RepairSession.java:202 - [repair 
#5e69a0e0-366d-11ee-a644-d91df26add5e] Received merkle tree for table_b from 
/169.79.232.67:7000
2023-08-09 04:30:57,876 [INFO ] [Service Thread] cluster_id=101 
ip_address=169.0.0.1 GCInspector.java:294 - G1 Old Generation GC in 282ms. 
Compressed Class Space: 8444560 -> 8372152; G1 Eden Space: 7809794048 -> 0; G1 
Old Gen: 1453478400 -> 820942800; G1 Survivor Space: 419430400 -> 0; Metaspace: 
80411136 -> 80176528
2023-08-09 04:30:58,387 [ERROR] [AntiEntropyStage:1] cluster_id=101 
ip_address=169.0.0.1 JVMStabilityInspector.java:102 - OutOfMemory error letting 
the JVM handle the error:
java.lang.OutOfMemoryError: Direct buffer memory
at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:118)
at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:318)
at org.apache.cassandra.utils.MerkleTree.allocate(MerkleTree.java:742)
at org.apache.cassandra.utils.MerkleTree.deserializeOffHeap(MerkleTree.java:780)
at org.apache.cassandra.utils.MerkleTree.deserializeTree(MerkleTree.java:751)
at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:720)
at org.apache.cassandra.utils.MerkleTree.deserialize(MerkleTree.java:698)
at 
org.apache.cassandra.utils.MerkleTrees$MerkleTreesSerializer.deserialize(MerkleTrees.java:416)
at 
org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:100)
at 
org.apache.cassandra.repair.messages.ValidationResponse$1.deserialize(ValidationResponse.java:84)
at 
org.apache.cassandra.net.Message$Serializer.deserializePost40(Message.java:782)
at org.apache.cassandra.net.Message$Serializer.deserialize(Message.java:642)
at 
org.apache.cassandra.net.InboundMessageHandler$LargeMessage.deserialize(InboundMessageHandler.java:364)
at 
org.apache.cassandra.net.InboundMessageHandler$LargeMessage.access$1100(InboundMessageHandler.java:317)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessLargeMessage.provideMessage(InboundMessageHandler.java:504)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:429)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834){noformat}
 
-XX:+AlwaysPreTouch
-XX:+CrashOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+ResizeTLAB
-XX:+UseG1GC
-XX:+UseNUMA
-XX:+UseTLAB
-XX:+UseThreadPriorities
-XX:-UseBiasedLocking
-XX:CompileCommandFile=/opt/nosql/clusters/cassandra-101/conf/hotspot_compiler
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:G1ReservePercent=20
-XX:HeapDumpPath=/opt/nosql/data/cluster_101/cassandra-1691623098-pid2804737.hprof
-XX:InitiatingHeapOccupancyPercent=70
-XX:MaxGCPauseMillis=200
-XX:StringTableSize=60013
-Xlog:gc*:file=/opt/nosql/clusters/cassandra-101/logs/gc.log:time,uptime:filecount=10,filesize=10485760
-Xms16G
-Xmx16G
-Xss256k
 
From our Prometheus metrics, the behavior shows the direct buffer memory 
ramping up until it reaches the max and then causes an OOM.  It would appear 
that direct memory is never released by the JVM until it's exhausted.
 
!Cluster-dm-metrics.PNG!

An Eclipse Memory Analyzer class histogram:
||Class Name||Objects||Shallow Heap||Retained Heap||

[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817419#comment-17817419
 ] 

Stefan Miklosovic commented on CASSANDRA-19394:
---

Oh yeah, that is wrong too :D

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817415#comment-17817415
 ] 

Stefan Miklosovic commented on CASSANDRA-19394:
---

[~samt] what's wrong with 

nodetool cms dump > /tmp/dump.txt 

executed locally?

The benefit is that we can also inspect and display it remotely for diagnostic 
purposes etc.

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817416#comment-17817416
 ] 

Brandon Williams commented on CASSANDRA-19394:
--

I'll just note there is precedent (for better or worse) for JMX dumping to 
local files: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/FailureDetector.java#L272

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817414#comment-17817414
 ] 

Sam Tunnicliffe commented on CASSANDRA-19394:
-

Part of the benefit of dumping only to a binary format is precisely that it is 
opaque and has a very limited set of uses. For now these include reloading a 
binary dump into a new or existing cluster (e.g. for DR, debugging or cloning 
purposes), or writing low level custom code to explore and modify the metadata. 
Like Marcus said, this is really intended as an escape hatch for when (if) 
things go catastrophically wrong and I agree with him that we should not change 
this yet.
{quote}consume a lot of disk space if dumps are done frequently and they are 
big.
{quote}
Dump files are currently pretty tiny, even for clusters with many members and 
large schemas.
{quote}An adversary might just dump cluster metadata until no disk space is 
left.
{quote}
Nodetool / JMX should be properly secured to prevent this. An adversary could 
simply run {{nodetool assassinate}} if they had access.

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817413#comment-17817413
 ] 

Stefan Miklosovic commented on CASSANDRA-19394:
---

Also, if it is meant to be run locally, an operator can just do

nodetool cms dump > /tmp/dump.txt 

on the very same machine the node runs on? 

I just don't see why it has to be persisted into /tmp by Cassandra.

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
> the name of the file where that dump is. However, a nodetool / JMX call to the 
> MBean (or any place this method is invoked from; we would like to offer a 
> nodetool command which returns the dump) is meant to be usable from anywhere, 
> remotely, so what happens when we execute nodetool or call these methods on a 
> machine different from the machine the node runs on? E.g. admins may just have 
> a jumpbox to a cluster they manage and not necessarily have access to the 
> nodes themselves, so they would not be able to read the dump.
> 2) It creates a temp file which is never deleted, so /tmp will be populated 
> with these dumps until the node is turned off, which might take a long time 
> and can consume a lot of disk space if dumps are done frequently and they are 
> big. An adversary might just dump cluster metadata until no disk space is left.
> What I propose is that we return the whole dump as a string, not just the name 
> of a file where we saved it. We can also format the output on the client, or we 
> can tell the server what format we want the dump to be returned in.
> If there is a concern about the size of the data to be returned, we might 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client and 
> the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817411#comment-17817411
 ] 

Stefan Miklosovic commented on CASSANDRA-19394:
---

The existence of the nodetool cms dump command is not necessary then? I do 
not like that we would make exceptions like "well, but here nodetool just 
has to run on the same machine where your node runs".

Could we at least provide some way for these files to be cleaned up, e.g. 
removed after one hour? Given how busy operators are with other things, they 
will most probably just forget to remove them. Sure, you can say "well, the 
node will be restarted so the file will be removed", but in real life they 
might just inspect the file and never restart. We are clearly making 
assumptions about how this is going to be used, and I think it is safer to 
make it as bullet-proof as possible.
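
A minimal sketch of that cleanup idea, assuming a hypothetical DumpCleanup 
helper invoked right after the dump file is written; this is an illustration 
of the proposal, not code from Cassandra:

{noformat}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class DumpCleanup
{
    // Single daemon thread so pending deletions never block node shutdown.
    private static final ScheduledExecutorService CLEANER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cms-dump-cleaner");
            t.setDaemon(true);
            return t;
        });

    /** Schedules the dump file for deletion after the given TTL. */
    public static void deleteAfter(Path dumpFile, long ttl, TimeUnit unit)
    {
        CLEANER.schedule(() -> {
            try
            {
                Files.deleteIfExists(dumpFile);
            }
            catch (IOException e)
            {
                // Best effort: an operator may already have removed the file.
            }
        }, ttl, unit);
    }
}
{noformat}

The dump code would then call DumpCleanup.deleteAfter(dumpPath, 1, 
TimeUnit.HOURS), so a file an operator inspected and forgot about disappears 
on its own.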

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return 
> just the name of the file where that dump is. However, a nodetool / JMX 
> call to the MBean (or any place this method is invoked from; we would like 
> to offer a nodetool command which returns the dump) is meant to be usable 
> from anywhere, remotely, so what happens when we execute nodetool or call 
> these methods on a machine different from the one the node runs on? E.g. 
> admins may only have a jumpbox to the cluster they manage; they do not 
> necessarily have access to the nodes themselves, so they would not be able 
> to read the file.
> 2) It creates a temp file which is never deleted, so /tmp will fill up 
> with these dumps until the node is shut down, which might take a long 
> time, and this can consume a lot of disk space if dumps are frequent and 
> large. An adversary might just dump cluster metadata until no disk space 
> is left.
> What I propose is that we return the whole dump as a string, not just the 
> name of a file where we saved it. We can also format the output on the 
> client, or tell the server what format the dump should be returned in.
> If there is a concern about the size of the data returned, we could 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client 
> and the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19396) Fix Contributing Code Changes page

2024-02-14 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19396:

Change Category: Operability
 Complexity: Low Hanging Fruit
Component/s: Legacy/Documentation and Website
  Fix Version/s: 4.0.x
 4.1.x
 5.0.x
 5.x
   Priority: Low  (was: Normal)
 Status: Open  (was: Triage Needed)

> Fix Contributing Code Changes page
> --
>
> Key: CASSANDRA-19396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19396
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Ekaterina Dimitrova
>Priority: Low
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> Fix "Choosing the Right Branches to Work on" section. Currently it says 2.1 
> and 2.2 critical bug fixes when the community does not maintain at this point 
> those versions. Also, we already release 4.0 and 4.1, the code freeze should 
> move to 5.0. 
> I would like to suggest we update the page by saying - any version which is 
> not EOL and it is already GA - critical bug fixes, any post-alpha version 
> which is not GA yet - code freeze, no new features or improvements; 
> stabilization period. 
> We can also mention this info can be inferred from the Downloads page as that 
> one is updated on every release. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19396) Fix Contributing Code Changes page

2024-02-14 Thread Ekaterina Dimitrova (Jira)
Ekaterina Dimitrova created CASSANDRA-19396:
---

 Summary: Fix Contributing Code Changes page
 Key: CASSANDRA-19396
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19396
 Project: Cassandra
  Issue Type: Task
Reporter: Ekaterina Dimitrova


Fix "Choosing the Right Branches to Work on" section. Currently it says 2.1 and 
2.2 critical bug fixes when the community does not maintain at this point those 
versions. Also, we already release 4.0 and 4.1, the code freeze should move to 
5.0. 

I would like to suggest we update the page by saying - any version which is not 
EOL and it is already GA - critical bug fixes, any post-alpha version which is 
not GA yet - code freeze, no new features or improvements; stabilization 
period. 

We can also mention this info can be inferred from the Downloads page as that 
one is updated on every release. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817406#comment-17817406
 ] 

Brandon Williams commented on CASSANDRA-19392:
--

I created CASSANDRA-19395 to add the warning.

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19395) Warn when native_transport_port_ssl is set

2024-02-14 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19395:
-
 Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Definition(13162)
   Complexity: Normal
  Component/s: Legacy/CQL
Discovered By: User Report
Fix Version/s: 4.0.x
   4.1.x
   5.0.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Warn when native_transport_port_ssl is set
> --
>
> Key: CASSANDRA-19395
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19395
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x
>
>
> In CASSANDRA-19392 this was deprecated; however, Stefan notes that if you 
> set it, it will work in a single-node cluster because the peers table 
> isn't needed to distribute the information. This sounds like a recipe for 
> "this worked when we tested in development, but not in production", so it 
> would be good to warn users when this is set, to avoid future confusion.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19395) Warn when native_transport_port_ssl is set

2024-02-14 Thread Brandon Williams (Jira)
Brandon Williams created CASSANDRA-19395:


 Summary: Warn when native_transport_port_ssl is set
 Key: CASSANDRA-19395
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19395
 Project: Cassandra
  Issue Type: Bug
Reporter: Brandon Williams


In CASSANDRA-19392 this was deprecated; however, Stefan notes that if you set 
it, it will work in a single-node cluster because the peers table isn't 
needed to distribute the information. This sounds like a recipe for "this 
worked when we tested in development, but not in production", so it would be 
good to warn users when this is set, to avoid future confusion.
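
A hedged sketch of what such a warning might look like at startup; the 
option name native_transport_port_ssl is from cassandra.yaml, but the class, 
method, and the way the value is obtained are illustrative assumptions, not 
the actual patch:

{noformat}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class DeprecatedPortCheck
{
    private static final Logger logger = LoggerFactory.getLogger(DeprecatedPortCheck.class);

    /**
     * Illustrative startup check: 'sslPort' would come from the parsed
     * cassandra.yaml (native_transport_port_ssl) and be null when unset.
     */
    public static void warnIfDualPortsConfigured(Integer sslPort)
    {
        if (sslPort != null)
        {
            logger.warn("native_transport_port_ssl is deprecated and will be removed. " +
                        "It may appear to work on a single-node cluster but will not " +
                        "behave correctly in production; use client encryption on " +
                        "native_transport_port instead.");
        }
    }
}
{noformat}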



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817404#comment-17817404
 ] 

Marcus Eriksson commented on CASSANDRA-19394:
-

This is used for emergencies - you dump the metadata, modify it, and then 
boot an instance with it - which requires local access to the machine in 
order to start with the modified cluster metadata.

I don't think we should change this.

We should at some point add a way to dump the cluster metadata in a 
human-readable format, though.

> Rethink dumping of cluster metadata via CMSOperationsMBean
> --
>
> Key: CASSANDRA-19394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> I think there are two problems in the implementation of dumping 
> ClusterMetadata in CMSOperationsMBean:
> 1) A dump is saved in a file, and the dumpClusterMetadata methods return 
> just the name of the file where that dump is. However, a nodetool / JMX 
> call to the MBean (or any place this method is invoked from; we would like 
> to offer a nodetool command which returns the dump) is meant to be usable 
> from anywhere, remotely, so what happens when we execute nodetool or call 
> these methods on a machine different from the one the node runs on? E.g. 
> admins may only have a jumpbox to the cluster they manage; they do not 
> necessarily have access to the nodes themselves, so they would not be able 
> to read the file.
> 2) It creates a temp file which is never deleted, so /tmp will fill up 
> with these dumps until the node is shut down, which might take a long 
> time, and this can consume a lot of disk space if dumps are frequent and 
> large. An adversary might just dump cluster metadata until no disk space 
> is left.
> What I propose is that we return the whole dump as a string, not just the 
> name of a file where we saved it. We can also format the output on the 
> client, or tell the server what format the dump should be returned in.
> If there is a concern about the size of the data returned, we could 
> optionally allow dumps to be returned compressed, by simple zipping on the 
> server and unzipping on the client, where the "zipper" is the standard 
> java.util.zip, so it basically doesn't matter which JVM runs on the client 
> and the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817402#comment-17817402
 ] 

Brandon Williams commented on CASSANDRA-19392:
--

+1 here too just in case, heh

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Zytka updated CASSANDRA-19365:

Description: 
`DecayingEstimatedHistogramReservoir` has a race condition between `update` and 
`rescaleIfNeeded`.
A sample which ends up (`update`) in an already scaled decayingBucket 
(`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
has not been updated yet at the moment of `update`.
 
The observed consequence was flooding of the cluster with speculative retries 
(we happened to hit low-percentile buckets with overweight samples, which drove 
p99 below true p50 for a long time).

Please note that despite the manifestation being similar to CASSANDRA-19330, 
these are two distinct bugs in their own right.

This bug affects versions 4.0+
On 3.11 there's locking in DEHR. I did not check earlier versions.

  was:
`DecayingEstimatedHistogramReservoir` has a race condition between `update` and 
`rescaleIfNeeded`.
A sample which ends up (`update`) in an already scaled decayingBucket 
(`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
has not been updated yet at the moment of `update`.
 
The observed consequence was flooding of the cluster with speculative retries 
(we happened to hit low-percentile buckets with overweight samples, which drove 
p99 below true p50 for a long time).

Please note that despite the manifestation being similar to CASSANDRA-19330, 
these are two distinct bugs in their own right.

This bug affects 


> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.
> This bug affects versions 4.0+
> On 3.11 there's locking in DEHR. I did not check earlier versions.
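
To make the interleaving concrete, here is a deliberately simplified model 
of the race; the names mirror DecayingEstimatedHistogramReservoir, but the 
decay constants and method bodies are illustrative assumptions, not the real 
code:

{noformat}
import java.util.concurrent.atomic.AtomicLongArray;

// Simplified model: update() computes a forward-decay weight from decayLandmark,
// while rescaleIfNeeded() rescales the buckets first and moves the landmark second.
class RacyReservoirModel
{
    volatile long decayLandmark;                    // start of the current decay period
    final AtomicLongArray decayingBuckets = new AtomicLongArray(64);

    void update(int bucket, long nowMillis)
    {
        // Weight grows exponentially with time since the landmark (illustrative constant).
        long weight = Math.round(Math.pow(2.0, (nowMillis - decayLandmark) / 60_000.0));
        decayingBuckets.addAndGet(bucket, weight);  // can interleave with rescale below
    }

    void rescaleIfNeeded(long nowMillis)
    {
        // Step 1: scale every bucket down (illustrative factor)...
        for (int i = 0; i < decayingBuckets.length(); i++)
            decayingBuckets.set(i, decayingBuckets.get(i) / 1024);
        // Step 2: ...then move the landmark. An update() landing between steps 1
        // and 2 still sees the old landmark and adds a huge, non-rescaled weight
        // to an already-rescaled bucket -- the overweight sample described above.
        decayLandmark = nowMillis;
    }
}
{noformat}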



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Zytka updated CASSANDRA-19365:

Fix Version/s: 5.x
   (was: 5.1)

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19392) deprecate dual ports support (native_transport_port_ssl)

2024-02-14 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817399#comment-17817399
 ] 

Stefan Miklosovic commented on CASSANDRA-19392:
---

Brandon +1ed privately. I am going to send it.

> deprecate dual ports support (native_transport_port_ssl) 
> -
>
> Key: CASSANDRA-19392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19392
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-beta
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We decided (1) to deprecate dual ports support in 5.0 (and eventually remove 
> it in trunk). This ticket will track the work towards the deprecation for 5.0.
> (1) https://lists.apache.org/thread/dow196gspwgp2og576zh3lotvt6mc3lv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Zytka updated CASSANDRA-19365:

Fix Version/s: 5.1

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
> Fix For: 5.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19394) Rethink dumping of cluster metadata via CMSOperationsMBean

2024-02-14 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-19394:
-

 Summary: Rethink dumping of cluster metadata via CMSOperationsMBean
 Key: CASSANDRA-19394
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19394
 Project: Cassandra
  Issue Type: Improvement
  Components: Tool/nodetool, Transactional Cluster Metadata
Reporter: Stefan Miklosovic


I think there are two problems in the implementation of dumping 
ClusterMetadata in CMSOperationsMBean:

1) A dump is saved in a file, and the dumpClusterMetadata methods return just 
the name of the file where that dump is. However, a nodetool / JMX call to 
the MBean (or any place this method is invoked from; we would like to offer 
a nodetool command which returns the dump) is meant to be usable from 
anywhere, remotely, so what happens when we execute nodetool or call these 
methods on a machine different from the one the node runs on? E.g. admins 
may only have a jumpbox to the cluster they manage; they do not necessarily 
have access to the nodes themselves, so they would not be able to read the 
file.

2) It creates a temp file which is never deleted, so /tmp will fill up with 
these dumps until the node is shut down, which might take a long time, and 
this can consume a lot of disk space if dumps are frequent and large. An 
adversary might just dump cluster metadata until no disk space is left.

What I propose is that we return the whole dump as a string, not just the 
name of a file where we saved it. We can also format the output on the 
client, or tell the server what format the dump should be returned in.

If there is a concern about the size of the data returned, we could 
optionally allow dumps to be returned compressed, by simple zipping on the 
server and unzipping on the client, where the "zipper" is the standard 
java.util.zip, so it basically doesn't matter which JVM runs on the client 
and the server.
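
A minimal sketch of the proposed zipping, using only java.util.zip from the 
JDK as suggested (readAllBytes needs Java 9+); the class and method names 
here are assumptions:

{noformat}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public final class DumpCompression
{
    /** Server side: compress the dump string before returning it over JMX. */
    public static byte[] compress(String dump) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes))
        {
            gzip.write(dump.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    /** Client side: decompress the returned bytes back into the dump string. */
    public static String decompress(byte[] compressed) throws IOException
    {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed)))
        {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
{noformat}

Since both sides use the same standard JDK classes, the client and server 
JVMs do not need to match for this to work.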



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-website) branch asf-staging updated (8a641fca7 -> 367c839bb)

2024-02-14 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


 discard 8a641fca7 generate docs for aa8a03c7
 new 367c839bb generate docs for aa8a03c7

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (8a641fca7)
\
 N -- N -- N   refs/heads/asf-staging (367c839bb)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../managing/tools/nodetool/reconfigurecms.html|  11 +--
 .../managing/tools/nodetool/reconfigurecms.html|  11 +--
 site-ui/build/ui-bundle.zip| Bin 4883646 -> 4883646 
bytes
 3 files changed, 10 insertions(+), 12 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817370#comment-17817370
 ] 

Jakub Zytka edited comment on CASSANDRA-19365 at 2/14/24 1:04 PM:
--

[https://github.com/apache/cassandra/pull/3102/files]

The proposed PR keeps changes almost to a minimum, to stay consistent with 
the current state of DEHR.

The solution doesn't introduce synchronization between updates and rescales for 
the sake of update performance. Instead, it introduces an atomic change of 
decay landmark and decaying buckets together. This lets us keep updates 
non-synchronized at the price of letting some updates be missed during rescale. 
It also prevents the creation of snapshots that are half-rescaled.
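
A minimal sketch of the atomic landmark-plus-buckets swap just described, 
assuming illustrative names; the actual PR may differ:

{noformat}
import java.util.concurrent.atomic.AtomicReference;

class AtomicRescaleSketch
{
    // Landmark and buckets travel together, so a reader or updater can never
    // observe a rescaled bucket array paired with a stale landmark.
    static final class State
    {
        final long decayLandmark;
        final long[] decayingBuckets;

        State(long decayLandmark, long[] decayingBuckets)
        {
            this.decayLandmark = decayLandmark;
            this.decayingBuckets = decayingBuckets;
        }
    }

    final AtomicReference<State> state = new AtomicReference<>(new State(0, new long[64]));

    void rescale(long nowMillis)
    {
        State old = state.get();
        long[] rescaled = new long[old.decayingBuckets.length];
        for (int i = 0; i < rescaled.length; i++)
            rescaled[i] = old.decayingBuckets[i] / 1024;   // illustrative factor
        // Swap both at once; updates racing with this CAS may be lost, which is
        // the accepted trade-off for keeping updates lock-free.
        state.compareAndSet(old, new State(nowMillis, rescaled));
    }
}
{noformat}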


was (Author: jakubzytka):
[https://github.com/apache/cassandra/pull/3102/files]

The proposed PR keeps changes to almost a minimum to stay consistent with the 
current state of DEHR. 

The solution doesn't introduce synchronization between updates and rescales for 
the sake of update performance. Instead, it introduces an atomic change of 
decay landmark and decaying buckets together. This lets us keep updates 
non-synchronized at the price of letting some updates be missed during rescale. 
It also allows the creation of snapshots that are not half-rescaled.

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817370#comment-17817370
 ] 

Jakub Zytka edited comment on CASSANDRA-19365 at 2/14/24 1:02 PM:
--

[https://github.com/apache/cassandra/pull/3102/files]

The proposed PR keeps changes almost to a minimum, to stay consistent with 
the current state of DEHR.

The solution doesn't introduce synchronization between updates and rescales for 
the sake of update performance. Instead, it introduces an atomic change of 
decay landmark and decaying buckets together. This lets us keep updates 
non-synchronized at the price of letting some updates be missed during rescale. 
It also allows the creation of snapshots that are not half-rescaled.


was (Author: jakubzytka):
[https://github.com/apache/cassandra/pull/3102/files]

The proposed PR keeps changes to almost a minimum to stay consistent with the 
current state of DEHR. 

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817370#comment-17817370
 ] 

Jakub Zytka edited comment on CASSANDRA-19365 at 2/14/24 12:58 PM:
---

[https://github.com/apache/cassandra/pull/3102/files]

The proposed PR keeps changes almost to a minimum, to stay consistent with 
the current state of DEHR.


was (Author: jakubzytka):
https://github.com/apache/cassandra/pull/3102/files

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Zytka updated CASSANDRA-19365:

 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
  Component/s: Observability/Metrics
Discovered By: Adhoc Test
 Severity: Normal
   Status: Open  (was: Triage Needed)

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19365) invalid EstimatedHistogramReservoirSnapshot::getValue values due to race condition in DecayingEstimatedHistogramReservoir

2024-02-14 Thread Jakub Zytka (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Zytka updated CASSANDRA-19365:

Test and Documentation Plan: unit test 
 Status: Patch Available  (was: Open)

https://github.com/apache/cassandra/pull/3102/files

> invalid EstimatedHistogramReservoirSnapshot::getValue values due to race 
> condition in DecayingEstimatedHistogramReservoir
> -
>
> Key: CASSANDRA-19365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jakub Zytka
>Assignee: Jakub Zytka
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `DecayingEstimatedHistogramReservoir` has a race condition between `update` 
> and `rescaleIfNeeded`.
> A sample which ends up (`update`) in an already scaled decayingBucket 
> (`rescaleIfNeeded`) may still use a non-scaled weight because `decayLandmark` 
> has not been updated yet at the moment of `update`.
>  
> The observed consequence was flooding of the cluster with speculative retries 
> (we happened to hit low-percentile buckets with overweight samples, which 
> drove p99 below true p50 for a long time).
> Please note that despite the manifestation being similar to CASSANDRA-19330, 
> these are two distinct bugs in their own right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19393) nodetool: group CMS-related commands into one command

2024-02-14 Thread n.v.harikrishna (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817363#comment-17817363
 ] 

n.v.harikrishna commented on CASSANDRA-19393:
-

Thank you all for the input! I will update the PR as per the discussion.

> nodetool: group CMS-related commands into one command
> -
>
> Key: CASSANDRA-19393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19393
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: n.v.harikrishna
>Assignee: n.v.harikrishna
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is to group all CMS-related commands under one 
> "nodetool cms" command where existing command would be subcommands of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19393) nodetool: group CMS-related commands into one command

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817360#comment-17817360
 ] 

Brandon Williams commented on CASSANDRA-19393:
--

bq. While it does make sense to group this, there is some kind of a habit in 
nodetool that each command is standalone.

I agree with Marcus; we already have repair_admin, and I think moving to 
git-style subcommands is the inevitable evolution as the number of commands 
grows.

> nodetool: group CMS-related commands into one command
> -
>
> Key: CASSANDRA-19393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19393
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool, Transactional Cluster Metadata
>Reporter: n.v.harikrishna
>Assignee: n.v.harikrishna
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The purpose of this ticket is to group all CMS-related commands under one 
> "nodetool cms" command where existing command would be subcommands of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19335) Default nodetool tablestats to Human-Readable Output

2024-02-14 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817357#comment-17817357
 ] 

Brandon Williams edited comment on CASSANDRA-19335 at 2/14/24 12:09 PM:


The dtests are broken as a consequence of CCM being broken.  I thought we were 
on the same page there earlier when you indicated your preference is to use the 
JSON output in CCM, but in any case CCM's 
[data_size|https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L1536] 
function being broken is the crux of the problems here, which trickles down 
into the dtests.


was (Author: brandon.williams):
The dtests are broken as a consequence of CCM being broken.  I thought we were 
on the same page there earlier when you indicated your preference is to use the 
JSON output in CCM, but in any case CCM's data_size function being broken is 
the crux of the problems here, which trickles down into the dtests.

> Default nodetool tablestats to Human-Readable Output
> 
>
> Key: CASSANDRA-19335
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19335
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool
>Reporter: Leo Toff
>Assignee: Leo Toff
>Priority: Low
> Fix For: 5.x
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> *Current Behavior*
> The current implementation of nodetool tablestats in Apache Cassandra outputs 
> statistics in a format that is not immediately human-readable. This output 
> primarily includes raw byte counts, which require additional calculation or 
> conversion to be easily understood by users. This can be inefficient and 
> time-consuming, especially for users who frequently monitor these statistics 
> for performance tuning or maintenance purposes.
> *Proposed Change*
> We propose that nodetool tablestats should, by default, provide its output in 
> a human-readable format. This change would involve converting byte counts 
> into more understandable units (KiB, MiB, GiB). The tool could still retain 
> the option to display raw data for those who need it, perhaps through a flag 
> such as --no-human-readable or --raw.
> *Considerations*
> The change should maintain backward compatibility, ensuring that scripts or 
> tools relying on the current output format can continue to function correctly.
> We should provide adequate documentation and examples of both the new default 
> output and how to access the raw data format, if needed.
> *Alignment*
> Discussion in the dev mailing list: 
> [https://lists.apache.org/thread/mlp715kxho5b6f1ql9omlzmmnh4qfby9] 
> *Related work*
> Previous work in the series:
>  # https://issues.apache.org/jira/browse/CASSANDRA-19015 
>  # https://issues.apache.org/jira/browse/CASSANDRA-19104
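
A small sketch of the kind of byte-to-unit conversion the proposal 
describes, using binary units; this is a hypothetical helper, not the actual 
nodetool code:

{noformat}
public final class HumanReadable
{
    private static final String[] UNITS = { "bytes", "KiB", "MiB", "GiB", "TiB" };

    /** Converts a raw byte count to the human-readable form proposed above. */
    public static String stringify(long bytes)
    {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1)
        {
            value /= 1024;
            unit++;
        }
        return unit == 0 ? bytes + " bytes" : String.format("%.2f %s", value, UNITS[unit]);
    }
}
{noformat}

For example, stringify(5242880) returns "5.00 MiB", while a --raw style flag 
could keep emitting the plain 5242880 for scripts that parse the output.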



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org


