[jira] [Created] (CASSANDRA-19647) Instances of UnfilteredRowIterator need try-with-resources as it is Autocloseable

2024-05-18 Thread Dmitrii Kriukov (Jira)
Dmitrii Kriukov created CASSANDRA-19647:
---

 Summary: Instances of UnfilteredRowIterator need 
try-with-resources as it is Autocloseable
 Key: CASSANDRA-19647
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19647
 Project: Cassandra
  Issue Type: Bug
Reporter: Dmitrii Kriukov
Assignee: Dmitrii Kriukov


Affected files: Ballots and CompactionController
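
For readers unfamiliar with the pattern named in the summary, here is a minimal Java sketch of try-with-resources on an AutoCloseable iterator. DemoRowIterator and countRows are hypothetical stand-ins invented for illustration; this is not the code in Ballots or CompactionController, and it does not use Cassandra's UnfilteredRowIterator API.

```java
import java.util.Iterator;
import java.util.List;

class TryWithResourcesSketch
{
    // Hypothetical closeable iterator; NOT Cassandra's UnfilteredRowIterator.
    static final class DemoRowIterator implements Iterator<String>, AutoCloseable
    {
        private final Iterator<String> rows;
        boolean closed = false;

        DemoRowIterator(List<String> rows)
        {
            this.rows = rows.iterator();
        }

        @Override
        public boolean hasNext()
        {
            return rows.hasNext();
        }

        @Override
        public String next()
        {
            return rows.next();
        }

        @Override
        public void close()
        {
            // A real iterator would release file handles or buffers here.
            closed = true;
        }
    }

    // Counts rows; close() runs when the try block exits, normally or via an exception.
    static int countRows(DemoRowIterator iterator)
    {
        try (DemoRowIterator it = iterator)
        {
            int count = 0;
            while (it.hasNext())
            {
                it.next();
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args)
    {
        DemoRowIterator it = new DemoRowIterator(List.of("row1", "row2", "row3"));
        System.out.println(countRows(it) + " rows, closed=" + it.closed);
    }
}
```

Without the try-with-resources block, an exception thrown inside the loop would skip any manual close() call placed after it.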



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16364) Joining nodes simultaneously with auto_bootstrap:false can cause token collision

2024-05-18 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847572#comment-17847572
 ] 

Jon Haddad commented on CASSANDRA-16364:


bq. Seems the fault to fix here is preventing/detecting this problem as early 
as possible (and better docs) per the original description of the ticket. 100% 
agree that the feature can and should be made safer. Changing the design to 
non-deterministic may work but is hacky, inappropriate in patch versions and 
I'm sure will introduce breakages (/more work) elsewhere given our assumptions 
on the design.

I agree that detecting and preventing as early as possible is preferable.

The primary drawback of using jitter here is that nodes will bootstrap with 
very slim owned ranges.  That's fine for reducing contention in locks, but here 
it would result in significant ring imbalance, defeating the purpose of the 
token allocation algo.  Maybe that's what you mean by "hacky"?

> Joining nodes simultaneously with auto_bootstrap:false can cause token 
> collision
> 
>
> Key: CASSANDRA-16364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16364
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Paulo Motta
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x
>
>
> While raising a 6-node ccm cluster to test 4.0-beta4, 2 nodes chose the same 
> tokens using the default {{allocate_tokens_for_local_rf}}. However, they both 
> bootstrapped successfully with colliding tokens.
> We were familiar with this issue from CASSANDRA-13701 and CASSANDRA-16079, 
> and the workaround is to avoid parallel bootstrap when using 
> {{allocate_tokens_for_local_rf}}.
> However, since this is the default behavior, we should try to detect and 
> prevent this situation when possible, since it can break users relying on 
> parallel bootstrap behavior.
> I think we could prevent this as follows:
> 1. announce intent to bootstrap via gossip (i.e. add node on gossip without 
> token information)
> 2. wait for gossip to settle for a longer period (i.e. ring delay)
> 3. allocate tokens (if multiple bootstrap attempts are detected, tie break 
> via node-id)
> 4. broadcast tokens and move on with bootstrap
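
The four proposed steps hinge on the deterministic tie break in step 3. A minimal Java sketch of that single step follows, assuming node ids are UUIDs and that the lowest id allocates first. PendingNode, firstToAllocate and the ordering rule are hypothetical; the ticket does not specify them, and no gossip or token allocation code is modelled here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.UUID;

class BootstrapTieBreakSketch
{
    // Hypothetical: a node that announced intent to bootstrap via gossip, without tokens yet.
    static final class PendingNode
    {
        final UUID nodeId;
        final String address;

        PendingNode(UUID nodeId, String address)
        {
            this.nodeId = nodeId;
            this.address = address;
        }
    }

    // Assumed rule: among concurrent bootstrap attempts, the lowest node id allocates first.
    // Every node that sees the same pending set computes the same winner.
    static Optional<PendingNode> firstToAllocate(List<PendingNode> pending)
    {
        return pending.stream().min(Comparator.comparing((PendingNode n) -> n.nodeId));
    }

    public static void main(String[] args)
    {
        List<PendingNode> pending = List.of(
            new PendingNode(new UUID(0L, 2L), "10.0.0.2"),
            new PendingNode(new UUID(0L, 1L), "10.0.0.1"));
        firstToAllocate(pending).ifPresent(n -> System.out.println("allocates first: " + n.address));
    }
}
```

The rule only helps if step 2 has given gossip enough time for both nodes to see each other's announcement; otherwise each node sees a pending set of one and the collision can still occur.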






[jira] [Created] (CASSANDRA-19646) Array diskStartRangeIndex in ShardManagerDiskAware is updated, but never read

2024-05-18 Thread Dmitrii Kriukov (Jira)
Dmitrii Kriukov created CASSANDRA-19646:
---

 Summary: Array diskStartRangeIndex in ShardManagerDiskAware is 
updated, but never read
 Key: CASSANDRA-19646
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19646
 Project: Cassandra
  Issue Type: Improvement
Reporter: Dmitrii Kriukov
Assignee: Dmitrii Kriukov


It can be safely removed






[jira] [Commented] (CASSANDRA-19645) Mismatch of number of args of String.format() in three classes

2024-05-18 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847549#comment-17847549
 ] 

Brandon Williams commented on CASSANDRA-19645:
--

Looks good to me, checking CI.

[!https://ci-cassandra.apache.org/job/Cassandra-devbranch-5/32/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch-5/detail/Cassandra-devbranch/32/pipeline]


> Mismatch of number of args of String.format() in three classes
> --
>
> Key: CASSANDRA-19645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19645
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Dmitrii Kriukov
>Assignee: Dmitrii Kriukov
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Affected classes:
> GossipHelper lines 196-197
> SchemaGenerators line 488
> StorageService line 1087
> I'm going to provide a PR






[jira] [Updated] (CASSANDRA-19645) Mismatch of number of args of String.format() in three classes

2024-05-18 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19645:
-
Reviewers: Brandon Williams




[jira] [Updated] (CASSANDRA-19645) Mismatch of number of args of String.format() in three classes

2024-05-18 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19645:
-
 Bug Category: Parent values: Correctness(12982)
   Complexity: Low Hanging Fruit
  Component/s: Local/Other
Discovered By: User Report
Fix Version/s: 5.x
 Severity: Low
   Status: Open  (was: Triage Needed)




[jira] [Created] (CASSANDRA-19645) Mismatch of number of args of String.format() in three classes

2024-05-18 Thread Dmitrii Kriukov (Jira)
Dmitrii Kriukov created CASSANDRA-19645:
---

 Summary: Mismatch of number of args of String.format() in three 
classes
 Key: CASSANDRA-19645
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19645
 Project: Cassandra
  Issue Type: Bug
Reporter: Dmitrii Kriukov
Assignee: Dmitrii Kriukov


Affected classes:

GossipHelper lines 196-197
SchemaGenerators line 488
StorageService line 1087
I'm going to provide a PR
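
The ticket does not say which way the argument counts are off in the three classes, so the following Java sketch simply shows both directions with made-up format strings. FormatArgsSketch and its strings are hypothetical and are not taken from GossipHelper, SchemaGenerators or StorageService.

```java
import java.util.MissingFormatArgumentException;

class FormatArgsSketch
{
    // Fewer arguments than specifiers: String.format throws at run time.
    static boolean throwsOnMissingArgument()
    {
        try
        {
            String.format("%s joined with token %s", "node1");
            return false;
        }
        catch (MissingFormatArgumentException e)
        {
            return true;
        }
    }

    // More arguments than specifiers: the surplus argument is silently ignored.
    static String ignoresExtraArgument()
    {
        return String.format("%s joined", "node1", "unused");
    }

    public static void main(String[] args)
    {
        System.out.println("missing argument throws: " + throwsOnMissingArgument());
        System.out.println("extra argument result: " + ignoresExtraArgument());
    }
}
```

The second case is the harder one to notice, since nothing fails and the intended value is just missing from the message.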






[jira] [Commented] (CASSANDRA-16364) Joining nodes simultaneously with auto_bootstrap:false can cause token collision

2024-05-18 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847505#comment-17847505
 ] 

Brandon Williams commented on CASSANDRA-16364:
--

bq. Does this apply in trunk with tcm? Think we should be removing fixVersion 
5.x

I removed it, I'm the one who errantly added it when I added 5.0.x (muscle 
memory or something)




[jira] [Updated] (CASSANDRA-16364) Joining nodes simultaneously with auto_bootstrap:false can cause token collision

2024-05-18 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16364:
-
Fix Version/s: (was: 5.x)




(cassandra) branch CASSANDRA-19640 deleted (was 132b5311d5)

2024-05-18 Thread brads
This is an automated email from the ASF dual-hosted git repository.

brads pushed a change to branch CASSANDRA-19640
in repository https://gitbox.apache.org/repos/asf/cassandra.git


 was 132b5311d5 Added summary to storage-engine doc

This change permanently discards the following revisions:

 discard 132b5311d5 Added summary to storage-engine doc





(cassandra) 01/01: Added summary to storage-engine doc

2024-05-18 Thread brads
This is an automated email from the ASF dual-hosted git repository.

brads pushed a commit to branch CASSANDRA-19640
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 132b5311d523c1404560e07864681842473ff1e3
Author: Brad Schoening 
AuthorDate: Sat May 18 04:29:39 2024 -0400

Added summary to storage-engine doc
---
 doc/modules/cassandra/pages/architecture/storage-engine.adoc | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/modules/cassandra/pages/architecture/storage-engine.adoc b/doc/modules/cassandra/pages/architecture/storage-engine.adoc
index 51e6f6e9dc..abb31cbd60 100644
--- a/doc/modules/cassandra/pages/architecture/storage-engine.adoc
+++ b/doc/modules/cassandra/pages/architecture/storage-engine.adoc
@@ -1,6 +1,14 @@
 = Storage Engine
 
-{cassandra} processes data at several stages on the write path, starting with the immediate logging of a write and ending in with a write of data to disk:
+The {Cassandra} storage engine is optimized for high performance, write-oriented workloads.  It employs Log Structured Merge (LSM) trees, which utilize an append-only approach instead of the traditional relational database design with B-trees. This creates a write path free of read lookups and bottlenecks.
+
+While the write path is highly optimized, it comes with tradeoffs in terms of read performance and write amplification. To enhance read operations, Cassandra uses Bloom filters when accessing data from stables. Bloom filters are remarkably efficient, leading to a generally balanced performance for both reads and writes.
+
+Compaction is a necessary background activity required by the ‘merge’ component of Log Structured Merge trees. Compaction creates write amplification when several small SSTables on disk are read, merged, updates and deletes processed, and a new ssstable is re-written. Every write of data in Cassandra is re-written multiple times, known as write amplification, and this adds background I/O to the database workload.
+
+The storage engine consists of memtables for in-memory data and immutable SSTables (Sorted String Tables) on disk.  Additionally, a write-ahead log (WAL), referred to as the commit log, ensures  resiliency for crash and transaction recovery.
+
+The stages in the write path:
 
  * Logging data in the commit log
  * Writing data to the memtable
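
Editorial aside, not part of the commit: the added summary says Bloom filters let reads avoid consulting SSTables unnecessarily. A minimal Java sketch of the idea follows. BloomFilterSketch, its bit-array size, hash count and hash functions are all illustrative assumptions and do not reflect Cassandra's actual Bloom filter implementation.

```java
import java.util.BitSet;

class BloomFilterSketch
{
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomFilterSketch(int size, int hashCount)
    {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // FNV-1a-style hash over the key's chars, used as a second, independent-ish hash.
    private static int fnv1a(String key)
    {
        int hash = 0x811c9dc5;
        for (int i = 0; i < key.length(); i++)
        {
            hash ^= key.charAt(i);
            hash *= 0x01000193;
        }
        return hash;
    }

    // Double hashing: the i-th bit position is derived from two base hashes.
    private int position(String key, int i)
    {
        return Math.floorMod(key.hashCode() + i * fnv1a(key), size);
    }

    void add(String key)
    {
        for (int i = 0; i < hashCount; i++)
            bits.set(position(key, i));
    }

    // false means the key was definitely never added; true means it may have been.
    boolean mightContain(String key)
    {
        for (int i = 0; i < hashCount; i++)
            if (!bits.get(position(key, i)))
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        filter.add("partition-42");
        System.out.println("added key present: " + filter.mightContain("partition-42"));
    }
}
```

A filter like this never reports an added key as absent, which is what allows a read to skip a file when the answer is false; false positives only cost an unnecessary lookup.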





(cassandra) branch CASSANDRA-19640 created (now 132b5311d5)

2024-05-18 Thread brads
This is an automated email from the ASF dual-hosted git repository.

brads pushed a change to branch CASSANDRA-19640
in repository https://gitbox.apache.org/repos/asf/cassandra.git


  at 132b5311d5 Added summary to storage-engine doc

This branch includes the following new commits:

 new 132b5311d5 Added summary to storage-engine doc

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[jira] [Assigned] (CASSANDRA-19640) Enhance documentation on storage engine with leading summary

2024-05-18 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening reassigned CASSANDRA-19640:
--

Assignee: Brad Schoening

> Enhance documentation on storage engine with leading summary
> 
>
> Key: CASSANDRA-19640
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19640
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Low
>
> The storage engine 
> [documentation|https://github.com/apache/cassandra/blob/trunk/doc/modules/cassandra/pages/architecture/storage-engine.adoc]
>   would benefit from an abstract or summary that mentions the key points: it 
> uses a log-structured merge (LSM) tree design, is write-oriented, and relies 
> upon Bloom filters (not B-trees) to optimize the read path.






[jira] [Updated] (CASSANDRA-19640) Enhance documentation on storage engine with leading summary

2024-05-18 Thread Brad Schoening (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brad Schoening updated CASSANDRA-19640:
---
Change Category: Code Clarity
 Complexity: Low Hanging Fruit
Component/s: Documentation
   Priority: Low  (was: Normal)
 Status: Open  (was: Triage Needed)

> Enhance documentation on storage engine with leading summary
> 
>
> Key: CASSANDRA-19640
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19640
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Brad Schoening
>Priority: Low
>
> The storage engine 
> [documentation|https://github.com/apache/cassandra/blob/trunk/doc/modules/cassandra/pages/architecture/storage-engine.adoc]
>   would benefit from an abstract or summary that mentions the key points: it 
> uses a log-structured merge (LSM) tree design, is write-oriented, and relies 
> upon Bloom filters (not B-trees) to optimize the read path.






[jira] [Commented] (CASSANDRA-16364) Joining nodes simultaneously with auto_bootstrap:false can cause token collision

2024-05-18 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847494#comment-17847494
 ] 

Michael Semb Wever commented on CASSANDRA-16364:


Backing up [~jjordan]'s statement, token allocation is designed to be 
deterministic, and we don't support simultaneous bootstraps.

Seems the fault to fix here is preventing/detecting this problem as early as 
possible (and better docs) per the original description of the ticket.  100% 
agree that the feature can and should be made safer.  Changing the design to 
non-deterministic may work but is hacky, inappropriate in patch versions and 
I'm sure will introduce breakages (/more work) elsewhere given our assumptions 
on the design.

Does this apply in trunk with tcm?  Think we should be removing fixVersion 5.x

> Joining nodes simultaneously with auto_bootstrap:false can cause token 
> collision
> 
>
> Key: CASSANDRA-16364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16364
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Paulo Motta
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> While raising a 6-node ccm cluster to test 4.0-beta4, 2 nodes chose the same 
> tokens using the default {{allocate_tokens_for_local_rf}}. However, they both 
> bootstrapped successfully with colliding tokens.
> We were familiar with this issue from CASSANDRA-13701 and CASSANDRA-16079, 
> and the workaround is to avoid parallel bootstrap when using 
> {{allocate_tokens_for_local_rf}}.
> However, since this is the default behavior, we should try to detect and 
> prevent this situation when possible, since it can break users relying on 
> parallel bootstrap behavior.
> I think we could prevent this as follows:
> 1. announce intent to bootstrap via gossip (i.e. add node on gossip without 
> token information)
> 2. wait for gossip to settle for a longer period (i.e. ring delay)
> 3. allocate tokens (if multiple bootstrap attempts are detected, tie break 
> via node-id)
> 4. broadcast tokens and move on with bootstrap


