[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-04-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18753:

Source Control Link: https://github.com/apache/cassandra/pull/2896
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-22 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829805#comment-17829805
 ] 

Branimir Lambov commented on CASSANDRA-19471:
-

They are only for the IAE, which is a serious issue and IMHO a blocker for 
5.0.

I have not investigated the commitlog being written with durable writes off, 
which is a much more benign issue. It is likely caused by the preparation of 
the direct I/O segments writing and flushing the header and first sync marker 
in advance of any use of the segment.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Comment Edited] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828364#comment-17828364
 ] 

Branimir Lambov edited comment on CASSANDRA-19471 at 3/19/24 2:43 PM:
--

I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which {{endOfBuffer}} is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply {{{}slice{}}}. We probably also want the aligning path above it to go 
through this slicing to set the capacity appropriately. I'd also change the 
assertions that follow to make sure the limit and capacity of the prepared 
buffer match, and are equal to the segment size.
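
To illustrate the limit/capacity mismatch, here is a minimal standalone sketch using plain {{java.nio.ByteBuffer}}. It is not the {{DirectIOSegment}} code: the class name, the heap allocation, and the {{ALIGNMENT_SLACK}} value are assumptions made only for the example. It shows that slicing before setting the limit leaves the capacity larger than the limit, while setting the limit first and then slicing makes the two equal.

```java
import java.nio.ByteBuffer;

class SliceLimitSketch {
    // 1 MiB, matching the limit reported in the ticket's exception.
    static final int SEGMENT_SIZE = 1 << 20;
    // Hypothetical slack left over after alignment; not taken from the ticket.
    static final int ALIGNMENT_SLACK = 4096;

    // Slice first, then limit: the slice keeps the slack, so capacity > limit.
    static ByteBuffer sliceThenLimit() {
        ByteBuffer raw = ByteBuffer.allocate(SEGMENT_SIZE + ALIGNMENT_SLACK);
        ByteBuffer sliced = raw.slice();
        sliced.limit(SEGMENT_SIZE);
        return sliced;
    }

    // Limit first, then slice: slice() sizes the new buffer to the remaining
    // bytes, so capacity == limit == SEGMENT_SIZE.
    static ByteBuffer limitThenSlice() {
        ByteBuffer raw = ByteBuffer.allocate(SEGMENT_SIZE + ALIGNMENT_SLACK);
        raw.limit(SEGMENT_SIZE);
        return raw.slice();
    }

    public static void main(String[] args) {
        ByteBuffer mismatched = sliceThenLimit();
        System.out.println("slice then limit: capacity=" + mismatched.capacity()
                           + " limit=" + mismatched.limit());
        try {
            // Within capacity but past the limit, so position() rejects it.
            mismatched.position(SEGMENT_SIZE + 58);
        } catch (IllegalArgumentException e) {
            System.out.println("position past limit rejected");
        }

        ByteBuffer matched = limitThenSlice();
        System.out.println("limit then slice: capacity=" + matched.capacity()
                           + " limit=" + matched.limit());
    }
}
```

Any code that derives an end-of-buffer offset from {{capacity()}} of the first buffer can compute positions past its limit, which is consistent with the {{newPosition > limit}} failure in the ticket. With the second ordering, an assertion that limit and capacity both equal the segment size would hold.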


was (Author: blambov):
I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which `endOfBuffer` is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply `slice`.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828364#comment-17828364
 ] 

Branimir Lambov commented on CASSANDRA-19471:
-

I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which `endOfBuffer` is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply `slice`.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Commented] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824756#comment-17824756
 ] 

Branimir Lambov commented on CASSANDRA-19460:
-

LGTM

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we also want to set 
> uuid_sstable_identifiers_enabled to true. When we ran CI with it turned 
> on, it failed (1).
> The errors do not seem to be serious; the test suite we have is just not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> (1) https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19460:

Reviewers: Branimir Lambov
   Status: Review In Progress  (was: Needs Committer)

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we also want to set 
> uuid_sstable_identifiers_enabled to true. When we ran CI with it turned 
> on, it failed (1).
> The errors do not seem to be serious; the test suite we have is just not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> (1) https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19460:

Status: Ready to Commit  (was: Review In Progress)

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we also want to set 
> uuid_sstable_identifiers_enabled to true. When we ran CI with it turned 
> on, it failed (1).
> The errors do not seem to be serious; the test suite we have is just not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> (1) https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-07 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824394#comment-17824394
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

Committed to 5.0 as 
[06ed1afc34128523298020e7601dad148f2b2fb6|https://github.com/apache/cassandra/commit/06ed1afc34128523298020e7601dad148f2b2fb6]
 and trunk as 
[28efb63df52bafaf51cd458da021f6050900017a|https://github.com/apache/cassandra/commit/28efb63df52bafaf51cd458da021f6050900017a].

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-06 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823998#comment-17823998
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

That test is apparently already fixed. 

[Latest 
run|https://app.circleci.com/pipelines/github/blambov/cassandra/606/workflows/628459f1-f3fe-449c-a047-a784cc9711f5/jobs/24959/tests]
 had only a timeout of {{ActiveCompactionsTest}} -- reduced the number of 
iterations in the test to fix this.

Uploaded final version; I'm ready to commit it but I'd like one last review of 
the wording in {{NEWS.txt}} and {{cassandra(-latest).yaml}}.

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Updated] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI

2024-03-06 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19459:

Resolution: Fixed
Status: Resolved  (was: Triage Needed)

Fixed by CASSANDRA-19018.

> test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
>  fails with SAI
> ---
>
> Key: CASSANDRA-19459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19459
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
>
> The dtest 
> {{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
>  fails when the default secondary index is switched to SAI with
> {code}
> test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
>  failed; it passed 0 out of the required 1 times.
>   
>   Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] 
> exited with non-zero status; exit status: 2; 
> stderr: error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at java.base/java.util.Objects.requireNonNull(Objects.java:209)
>   at 
> org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.<init>(SegmentMetadata.java:102)
>   at 
> org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166)
>   at 
> org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125)
>   at 
> org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
>   at 
> java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
>   at 
> org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:840)
> {code}
> Discovered while testing CASSANDRA-18753.






[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-06 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823946#comment-17823946
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 3/6/24 10:07 AM:
--

Well, tests [look much better 
now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7].

We have only one failure, 
{{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 with SAI. Opened CASSANDRA-19459 for this, and proceeding to merge this ticket.


was (Author: blambov):
Well, tests [look much better 
now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7].

We have only one failure, 
{{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 with SAI. Opened CASSANDRA- 19459 for this, and proceeding to merge this 
ticket.

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Created] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI

2024-03-06 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19459:
---

 Summary: 
test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
 fails with SAI
 Key: CASSANDRA-19459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19459
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SAI
Reporter: Branimir Lambov


The dtest 
{{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 fails when the default secondary index is switched to SAI with
{code}
test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
 failed; it passed 0 out of the required 1 times.

Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] 
exited with non-zero status; exit status: 2; 
stderr: error: null
-- StackTrace --
java.lang.NullPointerException
at java.base/java.util.Objects.requireNonNull(Objects.java:209)
at 
org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.<init>(SegmentMetadata.java:102)
at 
org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166)
at 
org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125)
at 
org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
at 
java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092)
at 
org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
at 
org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90)
at 
org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354)
at 
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253)
at 
org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:840)
{code}

Discovered while testing CASSANDRA-18753.






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-02-29 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822034#comment-17822034
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

I don't mind removing it, especially if we have a plan for adding it back.

I'll remove it and re-run CI.







[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-01-16 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799053#comment-17799053
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 1/16/24 8:49 AM:
--

Merged CCM and DTest patches (they do not change anything unless the 
{{--configuration-yaml}} flag is used).
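
A minimal sketch of what "does not change anything unless the flag is used" can look like for an option of this kind. Only the flag name `--configuration-yaml` comes from the comment above; `build_parser`, `yaml_name` and the use of `argparse` are assumptions made to keep the example self-contained, and the actual CCM and DTest patches may be structured differently.

```python
import argparse


def build_parser():
    # Stand-in for the real option registration in the test tooling.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--configuration-yaml",
        default=None,
        help="name of the file to use as the Cassandra YAML",
    )
    return parser


def yaml_name(args, builtin_default="cassandra.yaml"):
    # When the flag is absent, fall back to the name that was used before,
    # so existing runs behave exactly as they did.
    return args.configuration_yaml or builtin_default
```

For example, parsing an empty argument list yields `cassandra.yaml`, while passing `--configuration-yaml some_other.yaml` (a hypothetical file name) yields `some_other.yaml`.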

[The state of failing tests at the 
moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
 - JUnit tests in compatible mode (which changes to use {{heap_buffers}}):
 -- {{CQLVectorTest}} (CASSANDRA-19167)
 -- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
 - JUnit tests in latest mode:
 -- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, {{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} (CASSANDRA-19042)
 -- {{RepairJobTest}} (CASSANDRA-19043)
 -- {{ClientRequestMetricsTest}} (CASSANDRA-19046)
 - JVM dtests in latest mode:
 -- {{RepairTest}} (CASSANDRA-19085)
 -- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
 -- {{QueriesTableTest}} (CASSANDRA-19046)
 - Python dtests in latest mode:
 -- {{TestWriteFailures.test_paxos}} (CASSANDRA-19145)
 -- {{TestReplaceAddress}} (CASSANDRA-19144)
 -- {{TestSnapshot}} (CASSANDRA-19126)
 -- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some 
already marked as flaky; this is likely not caused by this patch. There are 
also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run 
repeatedly due to the longer 
{{testActiveCompactionTrackingRaceWithIndexBuilder}}).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].




[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2024-01-09 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804661#comment-17804661
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

It is to me.

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.
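
The mismatch described above can be illustrated with a small set-intersection check. This is a sketch under stated assumptions: the version numbers 12 and 13 come from the description, while the `ACCEPTED_STREAMING_VERSIONS` table, the mode labels `LEGACY` and `NONE` as dictionary keys, and the `can_stream` helper are hypothetical and do not reflect how Cassandra negotiates streaming versions internally.

```python
# Accepted streaming versions per compatibility mode, as stated in the
# description. Only the two modes it mentions are modelled here.
ACCEPTED_STREAMING_VERSIONS = {
    "LEGACY": {12},
    "NONE": {13},
}


def can_stream(mode_a, mode_b):
    """True if the two modes share at least one accepted streaming version."""
    common = ACCEPTED_STREAMING_VERSIONS[mode_a] & ACCEPTED_STREAMING_VERSIONS[mode_b]
    return bool(common)
```

Because each mode accepts exactly one version and the two sets are disjoint, two nodes with different modes have no version in common, which matches the observation that they do not appear to be able to stream with each other.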






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798565#comment-17798565
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

{quote}
{code}
private static final String MIXED_MODE_ERROR = "Some nodes involved in repair are on an incompatible major version. " +
                                               "Repair is not supported in mixed major version clusters.";
{code}
{quote}

_To me_ this message in the context of a 5.0 cluster where something is in the 
wrong compatibility mode would be quite confusing. At the very least we need to 
state very clearly that a 5.x node in compatibility mode is considered a 4.x 
node for all intents and purposes, including being a "same major version" for 
the message above. Also, does this not mean we can't ever drop 4.0 support 
because e.g. 6.0 must be compatible with 5.0, including in its compatibility 
mode?
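
To make the reading in this comment concrete, here is a minimal sketch in which a node in compatibility mode is treated as the previous major version for the purposes of the mixed-version check. The helpers `effective_major` and `repair_allowed` are hypothetical and only illustrate the interpretation being discussed; they are not Cassandra's actual check.

```python
def effective_major(release_major, in_compatibility_mode):
    # Assumption for this sketch: a node in compatibility mode presents itself
    # as the previous major version.
    return release_major - 1 if in_compatibility_mode else release_major


def repair_allowed(nodes):
    """nodes: iterable of (release_major, in_compatibility_mode) pairs."""
    majors = {effective_major(major, compat) for major, compat in nodes}
    return len(majors) == 1
```

Under this reading, a 5.x node in compatibility mode and a plain 5.x node count as a mixed major version cluster, while a 4.x node and a 5.x node in compatibility mode do not, which is why the wording of the message can be confusing on an all-5.0 cluster.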







[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-11 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795406#comment-17795406
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

In other words, you both feel that it is okay for {{BulkLoader}} to not work if 
it is not the corresponding version or is not configured exactly like the 
database is?

Separately, that a node in e.g. {{UPGRADING}} mode should not be able to stream 
sstables to one in {{NONE}}?







[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795089#comment-17795089
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

> Precise fix for this would be to use the same compatibility mode for bulk 
> loader and the node.

While this would fix the test, it would not do anything about the underlying 
problem. C* 5 nodes in different compatibility modes should be able to stream 
with each other. One should at least be able to stream whole sstables from 
legacy mode to current.

Streaming from current to legacy mode once CASSANDRA-19012 is done also makes sense, but it 
might violate the downgradability promise while such data is not compacted. We 
probably need a warning if current-format data is streamed to a node in legacy 
mode (e.g. suggesting one does upgradesstables before downgrading below 5.0).
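
The suggested warning could look roughly like the following. This is a hedged sketch: `check_incoming_stream`, the `"legacy"` and `"current"` labels and the logger name are hypothetical, and the message text is only one possible wording of the suggestion above.

```python
import logging

log = logging.getLogger("streaming_sketch")


def check_incoming_stream(receiver_mode, sstable_format):
    """Return True if a downgrade warning was issued for this stream."""
    if receiver_mode == "legacy" and sstable_format == "current":
        log.warning(
            "Received current-format sstables while in legacy mode; "
            "run upgradesstables before downgrading below 5.0."
        )
        return True
    return False
```

The check only warns and does not reject the stream, which matches the idea that streaming should still work across modes while making the downgrade risk visible to the operator.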







[jira] [Comment Edited] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795089#comment-17795089
 ] 

Branimir Lambov edited comment on CASSANDRA-19126 at 12/10/23 4:57 PM:
---

bq. Precise fix for this would be to use the same compatibility mode for bulk 
loader and the node.

While this would fix the test, it would not do anything about the underlying 
problem. C* 5 nodes in different compatibility modes should be able to stream 
with each other. One should at least be able to stream whole sstables from 
legacy mode to current.

Streaming from current to legacy mode once CASSANDRA-19012 is done also makes sense, but it 
might violate the downgradability promise while such data is not compacted. We 
probably need a warning if current-format data is streamed to a node in legacy 
mode (e.g. suggesting one does upgradesstables before downgrading below 5.0).









[jira] [Updated] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19168:

Fix Version/s: 5.0-rc

> VectorUpdateDeleteTest fails with heap_buffers
> --
>
> Key: CASSANDRA-19168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19168
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Vector Search
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} 
> fails with
> {code}
> junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0
>   at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133)
>   at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> {code}






[jira] [Created] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19168:
---

 Summary: VectorUpdateDeleteTest fails with heap_buffers
 Key: CASSANDRA-19168
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19168
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Vector Search
Reporter: Branimir Lambov


When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} 
fails with
{code}
junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0
    at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133)
    at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
{code}






[jira] [Updated] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19167:

Fix Version/s: 5.0-rc

> CQLVectorTest fails with heap_buffers
> -
>
> Key: CASSANDRA-19167
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19167
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Vector Search
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} 
> test fails with
> {code}
> org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: Invalid 32-bits integer value, expecting 4 bytes but got 6
>   at org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695)
>   at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842)
>   at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819)
>   at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135)
>   at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83)
>   at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141)
>   at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082)
>   at org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180)
>   at org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142)
>   at org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277)
>   at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58)
>   at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605)
>   at org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175)
>   at org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162)
>   at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999)
>   at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564)
>   at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600)
>   at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570)
>   at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108)
>   at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445)
>   at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597)
>   at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576)
>   at org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427)
> {code}






[jira] [Created] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19167:
---

 Summary: CQLVectorTest fails with heap_buffers
 Key: CASSANDRA-19167
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19167
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Vector Search
Reporter: Branimir Lambov


When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} test 
fails with
{code}
org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: Invalid 32-bits integer value, expecting 4 bytes but got 6
    at org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695)
    at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842)
    at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819)
    at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135)
    at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83)
    at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141)
    at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082)
    at org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180)
    at org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142)
    at org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277)
    at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58)
    at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605)
    at org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175)
    at org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162)
    at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999)
    at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564)
    at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600)
    at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570)
    at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108)
    at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445)
    at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597)
    at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576)
    at org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427)
{code}






[jira] [Created] (CASSANDRA-19145) Python dtest TestWriteFailures.test_paxos is failing with Paxos V2

2023-12-01 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19145:
---

 Summary: Python dtest TestWriteFailures.test_paxos is failing with Paxos V2
 Key: CASSANDRA-19145
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19145
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Lightweight Transactions
Reporter: Branimir Lambov


With configuration changed to engage Paxos V2 with repaired state purging, the 
dtest fails with:
{code}
test_paxos
write_failures_test.TestWriteFailures

self = 

    def test_paxos(self):
        """
        A light transaction receives a WriteFailure
        """
>       exc = self._perform_cql_statement("INSERT INTO mytable (key, value) VALUES ('key1', 'Value 1') IF NOT EXISTS")

write_failures_test.py:202: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
write_failures_test.py:88: in _perform_cql_statement
    session.execute(statement)
../env3.7/src/cassandra-driver/cassandra/cluster.py:2618: in execute
    return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 

    def result(self):
        """
        Return the final result or raise an Exception if errors were
        encountered.  If the final result or error has not been set
        yet, this method will block until it is set, or the timeout
        set for the request expires.

        Timeout is specified in the Session request execution functions.
        If the timeout is exceeded, an :exc:`cassandra.OperationTimedOut` will be raised.
        This is a client-side timeout. For more information
        about server-side coordinator timeouts, see :class:`.policies.RetryPolicy`.

        Example usage::

            >>> future = session.execute_async("SELECT * FROM mycf")
            >>> # do other stuff...

            >>> try:
            ...     rows = future.result()
            ...     for row in rows:
            ...         ... # process results
            ... except Exception:
            ...     log.exception("Operation failed:")

        """
        self._event.wait()
        if self._final_result is not _NOT_SET:
            return ResultSet(self, self._final_result)
        else:
>           raise self._final_exception
E           cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="CAS operation timed out: received 1 of 2 required responses after 0 contention retries" info={'consistency': 'SERIAL', 'required_responses': 2, 'received_responses': 1, 'write_type': 'CAS'}

../env3.7/src/cassandra-driver/cassandra/cluster.py:4894: WriteTimeout
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-19144) Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2

2023-12-01 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19144:
---

 Summary: Python dtest replace_address_test.TestReplaceAddress is 
failing with Paxos V2
 Key: CASSANDRA-19144
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19144
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Bootstrap and Decommission, 
Feature/Lightweight Transactions
Reporter: Branimir Lambov


Paxos repair is causing an unexpected failure:
{code}
test_replace_with_insufficient_replicas
replace_address_test.TestReplaceAddress

failed on teardown with "Failed: Unexpected error found in node logs (see 
stdout for full details). Errors: [[replacement] 'ERROR [main] 2023-11-29 
10:23:08,752 CassandraDaemon.java:878 - Exception encountered during 
startup\njava.lang.UnsupportedOperationException: null\n\tat 
org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat
 
org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat
 
org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat
 
org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat
 
org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat
 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']"
Unexpected error found in node logs (see stdout for full details). Errors: 
[[replacement] 'ERROR [main] 2023-11-29 10:23:08,752 CassandraDaemon.java:878 - 
Exception encountered during startup\njava.lang.UnsupportedOperationException: 
null\n\tat 
org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat
 
org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat
 
org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat
 
org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat
 
org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat
 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']
{code}



[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792095#comment-17792095
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

Python dtest {{snapshot_test}} is also failing because of this sstableloader 
problem:
{code:java}
Exception: sstableloader command '/home/cassandra/cassandra/bin/sstableloader 
-d 127.0.0.1 /tmp/tmpidg_8u3c/0/ks/cf' failed; exit status: 1'; stdout: 
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /tmp/tmpidg_8u3c/0/ks/cf/da-1-bti-Data.db to 
[/127.0.0.1:7000]

progress: total: 100% 0.000B/s (avg: 0.000B/s)
; stderr: ERROR 10:16:01,391 [Stream #4bb85ff0-8ea0-11ee-94d3-3de6344de31d] 
Streaming error occurred on session with peer 127.0.0.1:7000
java.lang.ClassCastException: class 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible cannot 
be cast to class 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success 
(org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible and 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success are in 
unnamed module of loader 'app')
{code}

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.



[jira] [Commented] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792090#comment-17792090
 ] 

Branimir Lambov commented on CASSANDRA-19046:
-

Python dtest failure related to this: 
{{client_request_metrics_test.TestClientRequestMetrics}}
{code:java}
 >   self.cas_read_contention()

client_request_metrics_test.py:103: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
client_request_metrics_test.py:355: in cas_read_contention
consistency_level=CL.SERIAL))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
metric_factory = functools.partial(, 'CASRead')
statement = 

def cas_contention(self, metric_factory, statement):

query_count = 20
cassandra_version = self.dtest_config.cassandra_version_from_build

def sample():
baseline = metric_factory()
baseline.validate(cassandra_version)

execute_concurrent_with_args(self.session,
 statement,
 repeat([], query_count), 
raise_on_first_error=False)

updated = metric_factory()
updated.validate(cassandra_version)

return updated.diff(baseline)

for _ in range(10):
diff = sample()
if 'ContentionHistogram.Count' in diff:
break

assert diff['Latency.Count'] == query_count
assert diff['TotalLatency.Count'] > 0
>   assert 0 < diff['ContentionHistogram.Count'] <= query_count
E   KeyError: 'ContentionHistogram.Count'

client_request_metrics_test.py:382: KeyError{code}
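
The KeyError, rather than an assertion failure, is what one would expect if the metric diff only contains counters that changed. A minimal sketch of that failure mode follows. `diff_metrics` is a hypothetical stand-in for the dtest's `diff` helper, and the assumption that unchanged counters are omitted is inferred from the `in diff` check in the test above, not confirmed from the dtest source.

```python
def diff_metrics(baseline: dict, updated: dict) -> dict:
    """Hypothetical stand-in: keep only the counters whose value changed."""
    return {
        name: value - baseline.get(name, 0)
        for name, value in updated.items()
        if value != baseline.get(name, 0)
    }


baseline = {"Latency.Count": 0, "TotalLatency.Count": 0, "ContentionHistogram.Count": 0}
# Simulates a server that never updates the contention histogram.
updated = {"Latency.Count": 20, "TotalLatency.Count": 1500, "ContentionHistogram.Count": 0}

diff = diff_metrics(baseline, updated)

# Direct indexing raises KeyError because the unchanged counter is absent.
try:
    diff["ContentionHistogram.Count"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Reading with a default turns the missing key into a plain 0, so a check
# on it would fail as an assertion instead of a KeyError.
contention = diff.get("ContentionHistogram.Count", 0)
```

Under these assumptions the other two counters still appear in the diff, which is consistent with the first two assertions in the trace passing before the lookup fails.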

> Paxos V2 does not update individual fields of readMetrics
> -
>
> Key: CASSANDRA-19046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19046
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Observability/Metrics
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with 
> {{paxos_variant: v2}}.



[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791958#comment-17791958
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

I believe what Brandon means is that we also need upgrade tests where only some 
nodes have changed {{storage_compatibility_mode}}.

[This 
line|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L259]
 is what appears to be preventing {{BulkLoader}} from working. I don't have 
enough knowledge in the area and have not dug deep enough to understand all 
implications.



[jira] [Updated] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-11-30 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19126:

Description: 
In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
{{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
{{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
help (according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.

  was:
In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
`org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when 
the flag is turned on (found during CASSANDRA-18753 testing). Setting 
`storage_compatibility_mode: NONE` in the tool configuration yaml does not help 
(according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.


> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Priority: Normal
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.



[jira] [Created] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-11-30 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19126:
---

 Summary: Streaming appears to be incompatible with different 
storage_compatibility_mode settings
 Key: CASSANDRA-19126
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
Messaging/Internode, Tool/bulk load
Reporter: Branimir Lambov


In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
`org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when 
the flag is turned on (found during CASSANDRA-18753 testing). Setting 
`storage_compatibility_mode: NONE` in the tool configuration yaml does not help 
(according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.
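
The mismatch described above can be illustrated with a minimal Python sketch. It assumes, as stated in this report, that a C* 5 node accepts only streaming version 12 in legacy mode and only 13 in NONE. The names `ACCEPTED_STREAMING_VERSION` and `can_stream` are hypothetical and do not correspond to Cassandra's actual classes, and `LEGACY` is only a label for the legacy compatibility mode.

```python
# Illustrative model only; this is not Cassandra code.
# Assumption taken from the report: each mode accepts exactly one streaming version.
ACCEPTED_STREAMING_VERSION = {
    "LEGACY": 12,  # label for the legacy compatibility mode
    "NONE": 13,
}


def can_stream(sender_mode: str, receiver_mode: str) -> bool:
    """Return True if both sides would accept the same streaming version."""
    return ACCEPTED_STREAMING_VERSION[sender_mode] == ACCEPTED_STREAMING_VERSION[receiver_mode]


if __name__ == "__main__":
    # Print the full compatibility matrix for the two modes.
    for sender in ACCEPTED_STREAMING_VERSION:
        for receiver in ACCEPTED_STREAMING_VERSION:
            print(f"{sender} -> {receiver}: {can_stream(sender, receiver)}")
```

In this model only same-mode pairs can stream, which matches the concern that two C* 5 nodes with different settings cannot stream with each other.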



[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19085:

Fix Version/s: 5.0-rc

> In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
> ---
>
> Key: CASSANDRA-19085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, 
> the test fails with an exception that appears to be a genuine problem:
> {code:java}
> junit.framework.AssertionFailedError: Exception found expected null, but 
> was:   at 
> org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> >
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught 
> exceptions were thrown during test
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   Suppressed: java.lang.IllegalStateException: complete already: 
> (failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
>   at 
> org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
>   at 
> org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
>   at 
> 

[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2023-11-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18753:

Fix Version/s: 5.0-rc
   (was: 5.0.x)

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find an easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.



[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-24 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19085:

Description: 
More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, 
the test fails with an exception that appears to be a genuine problem:
{code:java}
junit.framework.AssertionFailedError: Exception found expected null, but 
was:
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
at 
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions 
were thrown during test
at 
org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
at 
org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
at 
org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Suppressed: java.lang.IllegalStateException: complete already: 
(failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
at 
org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
at 
org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
at 
org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
at 
org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
at 
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
at 
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at 
org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833){code}
The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but 
fixing them by doing e.g.
{code:java}
Index: src/java/org/apache/cassandra/service/ActiveRepairService.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java 
b/src/java/org/apache/cassandra/service/ActiveRepairService.java
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java    
(revision 04552046f74f596e69e2d98c3f3e522fb5888c99)
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java    (date 
1700839874092)
@@ -675,7 +675,7 @@
             if (promise.isDone())
                 return;
             String errorMsg = "Did not get replies from all endpoints.";
-            if (promise.tryFailure(new RuntimeException(errorMsg)))
+            if (pending.getAndSet(-1) > 0 && promise.tryFailure(new 
RuntimeException(errorMsg)))
                 participateFailed(parentRepairSession, errorMsg);
         }, timeoutMillis, MILLISECONDS);
 
@@ -703,8 +703,8 @@
                 failedNodes.add(from.toString());
                 if (failureReason == RequestFailureReason.TIMEOUT)
                 {
-                    

[jira] [Created] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-24 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19085:
---

 Summary: In-jvm dtest RepairTest fails with 
storage_compatibility_mode: NONE
 Key: CASSANDRA-19085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Repair
Reporter: Branimir Lambov


More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, 
the test fails with an exception that appears to be a genuine problem:
{code:java}
junit.framework.AssertionFailedError: Exception found expected null, but 
was:
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
at 
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions 
were thrown during test
at 
org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
at 
org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
at 
org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Suppressed: java.lang.IllegalStateException: complete already: 
(failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
at 
org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
at 
org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
at 
org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
at 
org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
at 
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
at 
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at 
org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833){code}
The updates to {{pending}} in ActiveRepairService are not concurrency-safe, 
but fixing them by doing e.g.
{code:java}
Index: src/java/org/apache/cassandra/service/ActiveRepairService.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java 
b/src/java/org/apache/cassandra/service/ActiveRepairService.java
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java    
(revision 04552046f74f596e69e2d98c3f3e522fb5888c99)
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java    (date 
1700839874092)
@@ -675,7 +675,7 @@
             if (promise.isDone())
                 return;
             String errorMsg = "Did not get replies from all endpoints.";
-            if (promise.tryFailure(new RuntimeException(errorMsg)))
+            if (pending.getAndSet(-1) > 0 && promise.tryFailure(new 
RuntimeException(errorMsg)))
                 participateFailed(parentRepairSession, errorMsg);
         }, timeoutMillis, 

[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-23 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789130#comment-17789130
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

Tests look good, repeated test completed with no failures: 
[https://app.circleci.com/pipelines/github/blambov/cassandra?branch=CASSANDRA-18757]

[~smiklosovic], do you give a second approval so that I can commit this?

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.
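
A minimal sketch of why the compiler does not catch this kind of bug: two unrelated boolean flags are interchangeable in a positional argument list. The classes below are hypothetical stand-ins, not the real {{CompactionTask}} or {{UnifiedCompactionTask}}, and the "fixed" variant only illustrates one possible shape of a fix, since the actual patch is not shown here.

```java
final class KeepOriginalsSketch {
    // Hypothetical stand-in for a base task whose last parameter is keepOriginals.
    static class BaseTask {
        final boolean keepOriginals;

        BaseTask(long gcBefore, boolean keepOriginals) {
            this.keepOriginals = keepOriginals;
        }
    }

    // Mirrors the reported problem: an unrelated flag lands in the keepOriginals slot.
    static class BuggyTask extends BaseTask {
        BuggyTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, ignoreOverlaps); // compiles, but the meaning is wrong
        }
    }

    // Assumed fix for illustration: pass keepOriginals explicitly and keep
    // the other flag in its own field.
    static class FixedTask extends BaseTask {
        final boolean ignoreOverlaps;

        FixedTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, false);
            this.ignoreOverlaps = ignoreOverlaps;
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy keepOriginals = " + new BuggyTask(0L, true).keepOriginals);
        System.out.println("fixed keepOriginals = " + new FixedTask(0L, true).keepOriginals);
    }
}
```

Both subclasses compile cleanly, which is why only a test that inspects the resulting flag can catch the mix-up.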



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-18753) We should offer an option for optimized default configuration

2023-11-23 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789052#comment-17789052
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 11/23/23 10:20 AM:


DTest support has been added.

The python dtests require pull requests for 
[CCM|https://github.com/riptano/ccm/pull/760] and 
[cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be 
merged. It works by passing an argument to ccm to make it read the 
configuration from "cassandra_latest.yaml". The new configuration replaces 
{{dtest_offheap}}, as the offheap setting for memtables is also turned on 
in the latest configuration.

I'm not happy at all with how the in-jvm dtests are configured at this point 
(directly including the settings in code), but I could not think of a quick way 
to get them to load a configuration file. The latest config is combined with 
vnodes to lighten the testing load.

Test results to appear 
[here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e].


was (Author: blambov):
DTest support has been added.

The python dtests require pull requests for 
[CCM|https://github.com/riptano/ccm/pull/760] and 
[cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be 
merged. It works by passing an argument to ccm to make it read the 
configuration from "cassandra_latest.yaml". The new configuration replaces 
{{dtest_offheap}}, as the offheap setting for memtables is also turned on 
in the latest configuration.

I'm not happy at all with how the in-jvm dtests are configured at this point 
(directly including the settings in code), but I could not think of a quick way 
to get them to load a configuration file.

Test results to appear 
[here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e].

> We should offer an option for optimized default configuration
> -
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0.x, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-22 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788730#comment-17788730
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

How about splitting this into separate tests for the 4 cases? I.e. have the 
four calls in {{testIgnoreOverlaps}} run in separate {{@Test}}-annotated 
methods?

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics

2023-11-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19046:

Summary: Paxos V2 does not update individual fields of readMetrics  (was: 
Paxos V2 does not individual fields of readMetrics)

> Paxos V2 does not update individual fields of readMetrics
> -
>
> Key: CASSANDRA-19046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19046
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Observability/Metrics
>Reporter: Branimir Lambov
>Priority: Normal
>
> As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with 
> {{paxos_variant: v2}}.






[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787323#comment-17787323
 ] 

Branimir Lambov commented on CASSANDRA-19034:
-

Yes, we have run the entire unit test suite (no dtests yet) with SAI as 
default, and these three are the only failures other than use cases that SAI 
cannot support (ByteOrderedPartitioner and blobs).

With CASSANDRA-18753, we will have a test configuration run as part of the 
precommit tests that runs with SAI (plus tries, UCS, paxos v2...).

> SelectTest fails when run with SAI index
> 
>
> Key: CASSANDRA-19034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-beta
>
>
> When run with SAI index, the following two tests error out:
> {code}
> [junit-timeout] Testcase: 
> testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>FAILED
> [junit-timeout] Got less rows than expected. Expected 1 but got 0
> [junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
> expected. Expected 1 but got 0
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout] 
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>   FAILED
> [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected 
> <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
> column 0 (k1 of type int), expected <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
> The latter seems to be giving the results in the wrong order, and the order 
> flips when the data is flushed.
> Caught during preparation of _latest config that would switch default to SAI 
> (CASSANDRA-18753).






[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787279#comment-17787279
 ] 

Branimir Lambov commented on CASSANDRA-19034:
-

A further failure of this kind:

{code}
[junit-timeout] Testcase: 
testStaticIndexAndNonStaticIndex(org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest)-_jdk11:
  FAILED
[junit-timeout] Got less rows than expected. Expected 1 but got 0
[junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
expected. Expected 1 but got 0
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest.testStaticIndexAndNonStaticIndex(SecondaryIndexOnStaticColumnTest.java:191)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] 
[junit-timeout] 
[junit-timeout] Test 
org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest 
FAILED
{code} 

> SelectTest fails when run with SAI index
> 
>
> Key: CASSANDRA-19034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
>
> When run with SAI index, the following two tests error out:
> {code}
> [junit-timeout] Testcase: 
> testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>FAILED
> [junit-timeout] Got less rows than expected. Expected 1 but got 0
> [junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
> expected. Expected 1 but got 0
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout] 
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>   FAILED
> [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected 
> <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
> column 0 (k1 of type int), expected <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
> The latter seems to be giving the results in the wrong order, and the order 
> flips when the data is flushed.
> Caught during preparation of _latest config that would switch default to SAI 
> (CASSANDRA-18753).




[jira] [Created] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19034:
---

 Summary: SelectTest fails when run with SAI index
 Key: CASSANDRA-19034
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SAI
Reporter: Branimir Lambov


When run with SAI index, the following two tests error out:

{code}
[junit-timeout] Testcase: 
testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
 FAILED
[junit-timeout] Got less rows than expected. Expected 1 but got 0
[junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
expected. Expected 1 but got 0
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] 
[junit-timeout] 
[junit-timeout] Testcase: 
testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
FAILED
[junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected <1> 
but got <0>
[junit-timeout] Invalid value for row 1 column 2 (v of type set), expected 
<{4, 5, 6}> but got <{2, 3, 4}>
[junit-timeout] 
[junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
column 0 (k1 of type int), expected <1> but got <0>
[junit-timeout] Invalid value for row 1 column 2 (v of type set), expected 
<{4, 5, 6}> but got <{2, 3, 4}>
[junit-timeout] 
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}

The latter seems to be giving the results in the wrong order, and the order 
flips when the data is flushed.

Caught during preparation of _latest config that would switch default to SAI 
(CASSANDRA-18753).






[jira] [Comment Edited] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786290#comment-17786290
 ] 

Branimir Lambov edited comment on CASSANDRA-18710 at 11/15/23 10:15 AM:


{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.

Actually, that would solve the extra flush problem as well, wouldn't it?
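
A rough sketch of the idea, assuming the metric is an exponential moving average seeded with the first flushed size. The smoothing factor {{alpha}}, the class name and the file names are hypothetical and not taken from the test or from Cassandra itself.

```java
import java.util.Map;
import java.util.TreeMap;

final class FlushSizeAverageSketch {
    // Folds sstable sizes into an exponential moving average, visiting the
    // files in lexicographic name order (assumed here to match flush order).
    static double movingAverage(Map<String, Long> sizesByFileName, double alpha) {
        double average = Double.NaN; // NaN marks "no sample seen yet"
        for (long size : new TreeMap<>(sizesByFileName).values()) {
            average = Double.isNaN(average)
                      ? size
                      : alpha * size + (1 - alpha) * average;
        }
        return average;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = Map.of("nb-2-big-Data.db", 200L,
                                         "nb-1-big-Data.db", 100L,
                                         "nb-3-big-Data.db", 400L);
        System.out.println(movingAverage(sizes, 0.5));
    }
}
```

Plain string ordering only matches flush order while the generation numbers have the same number of digits; a real test would probably need to compare the parsed sstable ids instead.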


was (Author: blambov):
{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786290#comment-17786290
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786282#comment-17786282
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

I think it is a leftover from a refactoring that (among other things) fixed 
CASSANDRA-18756 in DSE.

Fix LGTM, but it's a shame that no test caught it.

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.






[jira] [Comment Edited] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782692#comment-17782692
 ] 

Branimir Lambov edited comment on CASSANDRA-18945 at 11/3/23 6:15 PM:
--

{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass 
{{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which 
would fail count >= 0, but is acceptable and should translate to 
baseShardCount)" or something similar?
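
The difference between the two checks can be seen in a few lines of standalone Java. This is only an illustration of IEEE 754 comparison semantics; the class and method names are made up and do not appear in the patch.

```java
final class NaNCheckSketch {
    // Accepts positive values, zero and NaN; rejects only negative values.
    static boolean passesNegatedCheck(double count) {
        return !(count < 0);
    }

    // Accepts positive values and zero; rejects negative values and NaN,
    // because every ordered comparison with NaN evaluates to false.
    static boolean passesDirectCheck(double count) {
        return count >= 0;
    }

    public static void main(String[] args) {
        for (double count : new double[] { 4.0, 0.0, -1.0, Double.NaN }) {
            System.out.println(count
                               + ": !(count < 0) = " + passesNegatedCheck(count)
                               + ", count >= 0 = " + passesDirectCheck(count));
        }
    }
}
```

The two predicates agree on every ordinary value and differ only for NaN, which is exactly the case the assertion is meant to let through.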


was (Author: blambov):
{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass 
{{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which 
would fail {{count >= 0}}, but is acceptable and should translate to 
baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}}, 
> where λ is a parameter whose value is between 0 and 1.
> With this, a λ of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. A λ of 0 would result in 
> no sstable size growth, and a λ of 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - λ of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782692#comment-17782692
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass 
{{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which 
would fail {{count >= 0}}, but is acceptable and should translate to 
baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}}, 
> where λ is a parameter whose value is between 0 and 1.
> With this, a λ of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - λ of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB
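
The modified shard count formula quoted in the description can be sketched as below. This is only an illustration of the arithmetic, not the UCS implementation: the class and method names are hypothetical, the proposed minimum-size threshold is not modelled, and clamping the exponent at zero for small densities is an assumption of this sketch.

```java
// Illustrative sketch of the quoted formula; names are hypothetical, not Cassandra's API.
final class ShardCountSketch
{
    /**
     * Computes 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
     *
     * @param density        d, the data density (any unit, as long as it matches targetSize)
     * @param targetSize     t, the target sstable size
     * @param baseShardCount b, the base shard count
     * @param lambda         growth parameter between 0 and 1
     */
    static long shardCount(double density, double targetSize, int baseShardCount, double lambda)
    {
        double ratio = density / (targetSize * baseShardCount);
        // Assumption of this sketch: never drop below the base shard count.
        double log2 = Math.max(0.0, Math.log(ratio) / Math.log(2.0));
        long exponent = Math.round((1.0 - lambda) * log2);
        return (1L << exponent) * baseShardCount;
    }

    public static void main(String[] args)
    {
        double densityGiB = 1024.0; // 1 TiB expressed in GiB
        double targetGiB = 1.0;     // 1 GiB target size
        int base = 4;
        for (double lambda : new double[]{ 0.0, 1.0 / 3, 0.5, 1.0 })
            System.out.println("lambda=" + lambda + " -> "
                               + shardCount(densityGiB, targetGiB, base, lambda) + " shards");
    }
}
```

For 1 TiB of data, a 1 GiB target and a base shard count of 4, the sketch gives 1024 shards for λ = 0 (about 1 GiB each), 128 for λ = 1/3 (about 8 GiB each), 64 for λ = 0.5 and 4 for λ = 1.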






[jira] [Commented] (CASSANDRA-18232) Write docs for CEP-26 Unified Compaction Strategy (UCS)

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782640#comment-17782640
 ] 

Branimir Lambov commented on CASSANDRA-18232:
-

There are some additional options coming with CASSANDRA-18945. The details can 
be found in [the developer-side markdown 
doc|https://github.com/datastax/cassandra/blob/CASSANDRA-18945/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#full-sharding-scheme].

> Write docs for CEP-26 Unified Compaction Strategy (UCS)
> ---
>
> Key: CASSANDRA-18232
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18232
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Lorina Poland
>Assignee: Lorina Poland
>Priority: High
> Fix For: 5.x
>
>







[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782638#comment-17782638
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

We will handle the docs in the documentation ticket, CASSANDRA-18232. I will 
reach out to Lorina to make her aware of the changes.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>



[jira] [Created] (CASSANDRA-18997) Unified Compaction Strategy is missing documentation

2023-11-03 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18997:
---

 Summary: Unified Compaction Strategy is missing documentation
 Key: CASSANDRA-18997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18997
 Project: Cassandra
  Issue Type: Task
  Components: Documentation
Reporter: Branimir Lambov


UCS is missing from [the CQL documentation for 
5.0|https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/ddl.html#cql-compaction-options]
 and [the compaction 
page|https://cassandra.apache.org/doc/5.0/cassandra/managing/operating/compaction/index.html#compaction-options].

We need to create a documentation page for UCS and link it from both.






[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782610#comment-17782610
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

Yes, this looks like a 4.1 regression that is affecting all tests that are 
sensitive to the number of sstables. Such tests usually run in a separate 
keyspace (using {{KEYSPACE_PER_TEST}}) to avoid the keyspace flush that 
dropping a table triggers, but this new commit log recycling is triggering 
another flush that is not restricted to the affected keyspace.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-11-02 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782193#comment-17782193
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

I would keep it simple and not add a common settings entry under options. If 
necessary, the user can copy the value to both.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to 
> {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, 
> {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense 
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, 
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet 
> support per-table format selection/configuration.






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-11-02 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782184#comment-17782184
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

1. Yes, precisely.
2. The key cache is constructed in a completely separate portion of the code, 
isn't it? Ignore the key cache settings (except migration), I don't think 
changing this is something we can do at the moment.
3. Although it is not at the moment, the row index granularity in particular 
should be a table-level property -- there's no real reason to use one setting 
for all tables, and there's an advantage to be had by making it configurable. 
However, things like the key cache size or index summary capacity are something 
to be shared, not just between tables but also potentially between formats; I 
don't want to get into a complicated solution for this, I would either ignore 
any table-level modification for these (with a warning) or check that the value 
is the same among all tables. This, along with format variations (e.g. 
"bti-fast"), is also out of scope for this ticket.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>



[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780350#comment-17780350
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

Yes, I intend to commit it to 5.0.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0, 5.x
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>



[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

Fix Version/s: 5.0

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0, 5.x
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>



[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

 Bug Category: Parent values: Degradation(12984), Level 1 values: Performance Bug/Regression(12997)
   Complexity: Normal
Discovered By: Adhoc Test
Reviewers: Branimir Lambov
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>



[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780240#comment-17780240
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

[~smiklosovic], would you be willing to be the second reviewer?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>



[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-10-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779468#comment-17779468
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

So the {{KEYSPACE_PER_TEST}} fix for unexpected flushes no longer works after 
CASSANDRA-17071? All of the tests that use it will have intermittent 
failures unless we find a way to block this.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>



[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779444#comment-17779444
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

Attached [the result of a recent 
benchmark|https://issues.apache.org/jira/secure/attachment/13063855/key-value-oss.html]
 comparing the UCS default (green) to STCS (blue) and an option with larger 
SSTable size (orange). The default UCS has worse results in the throughput 
stage, but more importantly it is unable to serve the 110k ops/s during the 1:1 
and read-only stages. I'm still investigating what causes these reads to be so 
slow, but switching to a 10 GiB target fully fixes the problem (the two other 
options the orange graph uses, 'base_shard_count': '1' and 
'max_sstables_to_compact': '32', help but are not as significant on their own).

Rather than ask users to choose a target size based on their expected data 
density, the database should be able to deal with this itself. Admitting some 
of the growth into the sstable size is a good way to achieve that.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: key-value-oss.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-25 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

Attachment: key-value-oss.html

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: key-value-oss.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}}, 
> where λ is a parameter whose value is between 0 and 1.
> With this, a λ of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. A λ of 0 would result in 
> no sstable size growth, and a λ of 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - λ of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-10-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1830#comment-1830
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

It looks like the reason for the unexpected flush is the commit log:
{code:java}
[junit-timeout] INFO  [OptionalTasks:1] 2023-10-12 21:55:11,095 
ColumnFamilyStore.java:1017 - Enqueuing flush of 
cql_test_keyspace_alt.table_01, Reason: COMMITLOG_DIRTY, Usage: 74.752KiB (0%) 
on-heap, 3.777KiB (0%) off-heap
[junit-timeout] INFO  [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,103 
Flushing.java:154 - Writing Memtable-table_01@1180822937(6.854KiB serialized 
bytes, 242 ops, 74.916KiB (0%) on-heap, 3.781KiB (0%) off-heap), flushed range 
= [null, null)
[junit-timeout] INFO  [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,128 
Flushing.java:180 - Completed flushing 
/tmp/cassandra/build/test/cassandra/data/cql_test_keyspace_alt/table_01-03e61210694a11eeb4091bdb4ac3170b/nc-1-big-Data.db
 (6.839KiB) ... {code}
which is flushing just 242 out of the 1000 ops that the test needs per table.

We need to understand what causes these {{COMMITLOG_DIRTY}} flushes, because 
there are quite a few tests that will fail if a flush happens at the wrong 
time. Alternatively, we could disable commitlog-driven flushing for tests 
(e.g. by setting a very large commit log space limit).

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Created] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-20 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18945:
---

 Summary: Unified Compaction Strategy is creating too many sstables
 Key: CASSANDRA-18945
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Compaction
Reporter: Branimir Lambov


The unified compaction strategy currently aims to create sstables with close to 
the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra 
starts to have performance problems when the number of sstables grows to the 
order of a thousand, and in particular that even 1 TiB of data with the default 
configuration is creating too many sstables for efficient processing. This 
matters even more for SAI, where the number of sstables in the system can have 
a proportional effect on the complexity of operations.

It is quite easy to create a configuration option that allows sstables to take 
some part of the data growth by adding a multiplier to [the shard count 
calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
 formula, replacing 
{{2 ^ round(log2(d / (t * b))) * b}} 
with 
{{2 ^ round((1 - λ) * log2(d / (t * b))) * b}}, 
where λ is a parameter whose value is between 0 and 1.

With this, a λ of 0.5 would mean that shard count and sstable size grow in 
parallel at the square root of the data size growth. A λ of 0 would result in 
no sstable size growth, and a λ of 1 in always using the same number of shards.

It may also be valuable to introduce a threshold for engaging the base shard 
count to avoid splitting lowest-level sstables into fragments that are too 
small.

Once both of these are in place, we can set defaults that better suit all node 
densities, including 10 TiB and beyond, for example:
 - target size of 1 GiB
 - λ of 1/3
 - base shard count of 4
 - minimum size 100 MiB
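
The effect of the proposed formula can be checked with a short calculation. The following is only a sketch: the class and method names are hypothetical, sizes are passed in GiB for readability, the fallback to the base shard count for small data sizes is an assumption, and the proposed minimum-size threshold is not modelled. It is not the UnifiedCompactionStrategy implementation.

```java
/** Hypothetical sketch of the proposed shard count formula; not the actual strategy code. */
final class ShardCountSketch {
    /**
     * Evaluates 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
     *
     * @param dataGiB    d, total data size in GiB
     * @param targetGiB  t, target sstable size in GiB
     * @param baseShards b, base shard count
     * @param lambda     proposed growth parameter between 0 and 1
     */
    static long shardCount(double dataGiB, double targetGiB, int baseShards, double lambda) {
        double ratio = dataGiB / (targetGiB * baseShards);
        if (ratio <= 1.0)
            return baseShards; // assumption: never use fewer than the base shard count
        double log2 = Math.log(ratio) / Math.log(2.0);
        int exponent = (int) Math.round((1.0 - lambda) * log2);
        return (1L << exponent) * baseShards;
    }

    public static void main(String[] args) {
        double[] lambdas = { 0.0, 1.0 / 3, 0.5, 1.0 };
        // 1 TiB and 10 TiB of data, target size 1 GiB, base shard count 4
        for (double dataGiB : new double[] { 1024, 10240 }) {
            for (double lambda : lambdas) {
                long shards = shardCount(dataGiB, 1.0, 4, lambda);
                System.out.printf("d=%.0f GiB, lambda=%.2f: %d shards, about %.1f GiB each%n",
                                  dataGiB, lambda, shards, dataGiB / shards);
            }
        }
    }
}
```

Under these assumptions, 1 TiB of data with a 1 GiB target and a base shard count of 4 gives 1024 shards for λ = 0, 64 shards (about 16 GiB each) for λ = 0.5, and 4 shards for λ = 1. With the suggested λ of 1/3, 10 TiB of data gives 1024 shards of about 10 GiB each.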






[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params

2023-10-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1333#comment-1333
 ] 

Branimir Lambov commented on CASSANDRA-18872:
-

The patch looks good to me; the changes are not too invasive and can be easily 
replaced with format configuration in CASSANDRA-18534.

Do we have a documentation ticket corresponding to this? AFAICS [the 
docs|https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html]
 only mention the compression-level setting, even for 4.1. This documentation 
change also needs to explain that the chance only applies to compressed 
sstables.

> Remove deprecated crc_check_chance in compression params
> 
>
> Key: CASSANDRA-18872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18872
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Compression, Legacy/CQL
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{crc_check_chance}} was moved out of the compression parameters and is now a 
> standalone table parameter. This was done in the 3.0 timeframe, so it is now 
> time to remove the deprecated compression parameter in 5.0.






[jira] [Updated] (CASSANDRA-18534) Make sstable format configurable per table

2023-10-10 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18534:

Fix Version/s: 5.0
   (was: 5.x)

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = 'bti-fast';
> CREATE TABLE ... WITH sstable_format = 'bti-small';
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1KiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32KiB
>       bloom_filter_fp_chance: 0.1
> {code}






[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table

2023-10-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773696#comment-17773696
 ] 

Branimir Lambov commented on CASSANDRA-18534:
-

bq. Also, do you think it is possible and useful to make sstable_format contain 
custom parameters?

_All_ of the parameters to the SSTable format are custom, i.e. format-specific. 
This is also the qualifying condition for something to be moved into the format 
config: if you can imagine an SSTable format that does not need that flag, then 
it belongs to the format. E.g. bloom-filter-less formats do not need 
{{bloom_filter_fp_chance}}, and (even though they are not a feature of writing 
an SSTable) only {{BIG}} requires key cache options. Unless we are certain that 
CRC is the only way a format could defend against bit rot, {{crc_check_chance}} 
is also a format-specific property.

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = 'bti-fast';
> CREATE TABLE ... WITH sstable_format = 'bti-small';
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1KiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32KiB
>       bloom_filter_fp_chance: 0.1
> {code}






[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params

2023-10-09 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773438#comment-17773438
 ] 

Branimir Lambov commented on CASSANDRA-18872:
-

Have you looked at CASSANDRA-18534? Now that we have multiple SSTable formats, 
it makes a lot of sense to move properties like this into the format 
configuration, which in turn would mean passing a format configuration (instead 
of compression one) to the file handle builder.

> Remove deprecated crc_check_chance in compression params
> 
>
> Key: CASSANDRA-18872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18872
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Compression, Legacy/CQL
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{crc_check_chance}} was moved out of the compression parameters and is now a 
> standalone table parameter. This was done in the 3.0 timeframe, so it is now 
> time to remove the deprecated compression parameter in 5.0.






[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-10-04 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771732#comment-17771732
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

To make the review easier, could you fork the {{apache/cassandra}} repository 
on GitHub, push a branch with the changes to your fork on top of 
{{cassandra-5.0}}, and open a pull request against the {{cassandra-5.0}} 
branch of {{apache/cassandra}}?

My comments so far are these:

On [Config.java 
117|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R117]:
{quote}
Using booleans makes it very unclear which options are actually valid, and what 
the alternative means. Please change the configuration to an enum, e.g. 
{{commit_log_access_mode}} with values {{direct_jna}}, {{direct}}, and {{mmap}}.
{quote}
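
The point of the comment above can be illustrated with a small sketch. The enum name {{CommitLogAccessMode}} and its {{parse}} helper are hypothetical, chosen only to mirror the suggested {{commit_log_access_mode}} setting; they are not taken from the patch. Two independent booleans allow four combinations, some of which have no clear meaning, while an enum can only hold one of the listed values.

```java
import java.util.Locale;

/** Hypothetical enum mirroring the suggested commit_log_access_mode option. */
enum CommitLogAccessMode {
    direct_jna, direct, mmap;

    /** Parses a configured value; anything outside the listed values is rejected. */
    static CommitLogAccessMode parse(String value) {
        return valueOf(value.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parse("MMAP"));
        try {
            parse("direct_mmap"); // not a listed value
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape an unsupported value fails when the configuration is parsed, instead of silently selecting some combination of flags.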
{quote}
Actually, there should be only one direct option, and whether it uses NIO or 
JNA is an implementation detail that the users needn't care about.

The next question is whether or not non-direct should be supported at all, and 
I personally prefer to not support it as this adds configuration complexity for 
no expected benefit.

This also means that it makes sense to simply switch all other commit log 
segment types to be written direct, and this is simple enough to do in this 
ticket (especially since we dropped Java 8 and can use NIO's {{DIRECT}} option).
{quote}
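
For reference, a direct write through NIO can be sketched as below. This is a minimal illustration and not code from the patch: the class and method names are hypothetical, the payload is zero-padded to whole blocks, and the block size is taken from the file store and assumed to be a power of two. {{ExtendedOpenOption.DIRECT}}, {{FileStore.getBlockSize()}} and {{ByteBuffer.alignedSlice}} need Java 10 or later, and some file systems may reject direct I/O altogether.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of a direct write through NIO; not code from the patch. */
final class DirectWriteSketch {
    /** Rounds length up to the next multiple of blockSize (assumed to be a power of two). */
    static int roundUp(int length, int blockSize) {
        return (length + blockSize - 1) & -blockSize;
    }

    /** Writes payload at offset 0, zero-padded to whole blocks; returns the bytes written. */
    static int writeDirect(Path file, byte[] payload) throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        int blockSize = Math.toIntExact(Files.getFileStore(dir).getBlockSize());
        int padded = roundUp(payload.length, blockSize);

        // Over-allocate, then slice so that the buffer address is block-aligned.
        ByteBuffer buffer = ByteBuffer.allocateDirect(padded + blockSize - 1)
                                      .alignedSlice(blockSize);
        buffer.put(payload);
        buffer.position(0).limit(padded); // write whole blocks only

        try (FileChannel channel = FileChannel.open(file,
                                                    StandardOpenOption.CREATE,
                                                    StandardOpenOption.WRITE,
                                                    ExtendedOpenOption.DIRECT)) {
            return channel.write(buffer, 0); // the file offset must also be block-aligned
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "direct-write-sketch.bin");
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println("wrote " + writeDirect(file, payload) + " bytes");
    }
}
```

The three alignment requirements (buffer address, file offset, and length all multiples of the block size) are the part that a commit log segment implementation would have to maintain on every sync.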

On [Config.java 
517|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R517]:
{quote}
When would someone need to change this?
{quote}


> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>

[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-10-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771432#comment-17771432
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

There was a typo in my response above, I am in favour of having the patch land 
in 5.0.

Just the 512 vs 4k difference is not something I would personally consider a 
good reason to include the JNA writing; the sync segments are usually much 
larger than that. I would rather go with the simpler NIO option. 

I can't find my code comments with the link above any more. They are 
[here|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#r128716588].

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation into the CommitLog I/O issue on systems with large 
> core counts in my previous email dated July-22; the link to the thread is 
> given below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve CommitLog I/O:
>  # Multi-threaded syncing
>  # Using Direct I/O through JNA
> I worked on the second option because of the following benefits compared to 
> the first one:
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
> as learnt through FIO benchmarking.
>  # It reduces kernel file cache use, which in turn reduces kernel I/O 
> activity, for CommitLog files only.
>  # Overall CPU usage is reduced for flush activity. JVisualVM shows CPU usage 
> < 30% for the CommitLog syncer thread with the Direct I/O feature.
>  # The Direct I/O implementation is easier than the multi-threaded one.
> As per the community suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising but there was one issue: Java 8 does not have 
> native support for Direct I/O, so the JNA library is required. The same 
> implementation should also work across other versions of Java (11 and 
> beyond).
> I have completed the Direct I/O implementation; a summary of the changes in 
> the attached patch is given below.
>  # This implementation does not use Java file channels; the file is opened 
> through JNA to use the Direct I/O feature.
>  # New segment types are defined: “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # The JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and the 
> platform-specific file. Currently tested on Linux only.
>  # The patch allows the user to configure the optimum block size and 
> alignment if the default values are not suitable for the CommitLog disk.
>  # The following configuration options are provided in the cassandra.yaml 
> file:
> a. use_jna_for_commitlog_io: use the JNA feature
> b. use_direct_io_for_commitlog: use the Direct I/O feature
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default; can be changed to the required 
> size)
>  The test matrix is complex, so the CommitLog-related test cases and the 
> TPCx-IoT benchmark were tested. It works with both Java 8 and 11. Compressed 
> and encrypted segments are not supported yet; they can be enabled later based 
> on community feedback.
>  The following improvements are seen with Direct I/O enabled:
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  One more observation I would like to share: reading CommitLog files with 
> Direct I/O might help reduce node bring-up time after a node crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables the Direct I/O feature for CommitLog files. Please 
> check and share your feedback.






[jira] [Created] (CASSANDRA-18894) Drop commitlog chain marker updates

2023-09-29 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18894:
---

 Summary: Drop commitlog chain marker updates
 Key: CASSANDRA-18894
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18894
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Commit Log
Reporter: Branimir Lambov


CASSANDRA-13987 added a periodic update of the last commit log chain marker in 
order to allow for data in memory-mapped segments to be recovered even if it 
was not part of a synced segment.

A much simpler way to do this is something in the vein of CASSANDRA-16482, i.e. 
ignoring an empty sync marker for the last entry in the commit log. We could do 
this by default if the commit log is uncompressed (and possibly only if using 
memory mapping after CASSANDRA-18464).






[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-09-29 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18464:

Reviewers: Branimir Lambov
   Status: Review In Progress  (was: Patch Available)

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>






[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-09-29 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770415#comment-17770415
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

This patch is very valuable, and I support it going into 5.0 as well as 5.1.

In separate tests we have often found a memory-mapped commit log to be a 
serious performance problem for a node with a lot of data. Even without DIRECT 
or JNA, not using {{msync}} makes a huge difference. Because of this, most of 
the performance testing I personally do is done with a compressed commit log.

I added comments to [the latest published 
branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with 
some suggested changes. I am curious: if the NIO option is constructed 
correctly (with aligned direct buffers, and possibly also issuing writes that 
are page-aligned and contain whole pages), is it still copying to internal 
buffers?

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation about Commitlog I/O issue on large core count 
> system in my previous email dated July-22 and link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the 
> first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O. 
> Learnt through FIO benchmarking.
>  # Reduces kernel file cache uses which in-turn reduces kernel I/O activity 
> for Commitlog files only.
>  # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 
> 30% for Commitlog syncer thread with Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded
> As per the community suggestion, less in code complex is good to have. Direct 
> I/O enablement looked promising but there was one issue. 
> Java version 8 does not have native support to enable Direct I/O. So, JNA 
> library usage is must. The same implementation should also work across other 
> versions of Java (like 11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New Segment are defined named “DirectIOSegment”  for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose 
> only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  The test matrix is complex, so the CommitLog-related test cases and the 
> TPCx-IOT benchmark were run. The patch works with both Java 8 and 11. 
> Compressed and encrypted segments are not supported yet; support can be added 
> later based on community feedback.
>  The following improvements were seen with Direct I/O enabled:
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  One more observation: reading Commitlog files with Direct I/O might help 
> reduce node bring-up time after a node crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables the Direct I/O feature for Commitlog files. Please 
> check and share your feedback.
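
Direct I/O on Linux generally expects the buffer address, file offset and transfer length to be multiples of the device's logical block size, which is presumably what an option like direct_io_minimum_block_alignment controls. Below is a minimal sketch of the rounding arithmetic, assuming a power-of-two alignment. The class and method names are hypothetical and are not taken from the attached patch.

```java
// Hypothetical helper for illustration only; not code from the attached patch.
final class DirectIoAlignment
{
    // Default value quoted for direct_io_minimum_block_alignment in the description above.
    static final int DEFAULT_ALIGNMENT = 512;

    /** Rounds value up to the next multiple of alignment, which must be a power of two. */
    static long alignUp(long value, int alignment)
    {
        if (alignment <= 0 || Integer.bitCount(alignment) != 1)
            throw new IllegalArgumentException("alignment must be a positive power of two: " + alignment);
        // -alignment is a mask with the low log2(alignment) bits cleared.
        return (value + alignment - 1) & -(long) alignment;
    }

    /** True if value is already a multiple of alignment. */
    static boolean isAligned(long value, int alignment)
    {
        return alignUp(value, alignment) == value;
    }
}
```

With the default of 512, a 1000-byte write would be padded to alignUp(1000, 512) = 1024 bytes before being handed to the native write call.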



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-18773) Compactions are slow

2023-09-26 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769063#comment-17769063
 ] 

Branimir Lambov commented on CASSANDRA-18773:
-

There's some leftover code in the trunk version; apart from that, the newer 
versions look good.

> Compactions are slow
> 
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Cameron Zemek
>Assignee: Cameron Zemek
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png, 
> stress.yaml
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow 
> (for example major compactions). I have attached a cassandra-stress profile 
> that can generate such a dataset under ccm. In my local test I have 2567 
> sstables at 4Mb each.
> I added code to track the wall clock time of various parts of the code. One 
> problematic part is the ManyToOne constructor. Tracing through the code shows 
> that a ManyToOne is created over all the sstable iterators for every 
> partition. In my local test I get a measly 60Kb/sec read speed, bottlenecked 
> on a single CPU core (since this code is single threaded), with 85% of the 
> wall clock time spent in the ManyToOne constructor.
> As another data point showing that it's the merge iterator part of the code: 
> the cfstats tool from [https://github.com/instaclustr/cassandra-sstable-tools/], 
> which reads all the sstables but does no merging, gets 26Mb/sec read speed.
> Tracking back from the ManyToOne call I see this in 
> UnfilteredPartitionIterators::merge
> {code:java}
>                 for (int i = 0; i < toMerge.size(); i++)
>                 {
>                     if (toMerge.get(i) == null)
>                     {
>                         if (null == empty)
>                             empty = EmptyIterators.unfilteredRow(metadata, 
> partitionKey, isReverseOrder);
>                         toMerge.set(i, empty);
>                     }
>                 }
>  {code}
> I'm not sure what the purpose of creating these empty row iterators is. But on 
> a whim I removed all these empty iterators before passing the list to 
> ManyToOne, and then all the wall clock time shifted to 
> CompactionIterator::hasNext() and read speed increased to 1.5Mb/s.
> So it seems there are further bottlenecks in this code path, but the first is 
> this ManyToOne and having to build it for every partition read.
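
The experiment described above (dropping the absent sources before the merge instead of substituting empty iterators) can be sketched roughly as follows. This is a simplified illustration over plain java.util iterators, not Cassandra's UnfilteredPartitionIterators code; MergeInputs and withoutNulls are hypothetical names. It also does not address whether downstream code relies on each source keeping its original index in the list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration only; not Cassandra code.
final class MergeInputs
{
    /**
     * Returns only the sources that are actually present, preserving their relative order,
     * so that the merge is built over fewer inputs.
     */
    static <T> List<Iterator<T>> withoutNulls(List<Iterator<T>> toMerge)
    {
        List<Iterator<T>> present = new ArrayList<>(toMerge.size());
        for (Iterator<T> source : toMerge)
        {
            if (source != null)
                present.add(source);
        }
        return present;
    }
}
```

If most sstables do not contain a given partition, the filtered list is much shorter than the full list of sstables, which may explain part of the speed-up reported above.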






[jira] [Commented] (CASSANDRA-18873) Fix broken JMH benchmarks

2023-09-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768591#comment-17768591
 ] 

Branimir Lambov commented on CASSANDRA-18873:
-

{quote}
* ReadSmallPartitionsBench (assertion error)
* ReadWidePartitionsBench (assertion error)
{quote}
These two tests need larger memtable size allocation to produce useable output. 
One way to "fix" this is to replace {{INMEM}} with {{NO}} for the default 
{{flush}}, which will make it ignore the fact that part of the data is in an 
sstable; another is to reduce the default {{count}} by an order of magnitude.

Both of these changes would make the test less suitable for what it is 
primarily meant to measure (access time with a non-trivial data size in a 
single memtable/sstable).

> Fix broken JMH benchmarks
> -
>
> Key: CASSANDRA-18873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Jacek Lewandowski
>Priority: Normal
> Attachments: BenchTimeTest.java, 
> jmh-AtomicBtreePartitionUpdateBench.log, jmh-BloomFilterSerializerBench.log, 
> jmh-KeyLookupBench.log, jmh-ReadSmallPartitionsBench.log, 
> jmh-ReadWidePartitionsBench.log
>
>
> The following benchmarks are broken:
> * {{ZeroCopyStreamingBench}}
> * {{MutationBench}}
> * {{FastThreadLocalBench}}
> * {{AtomicBTreePartitionUpdateBench}} (OOM on Jenkins)
> * {{ReadSmallPartitionsBench}} (assertion error)
> * {{ReadWidePartitionsBench}} (assertion error)
> * {{BloomFilterSerializerBench}} (NPE)
> * {{KeyLookupBench}} (IAE)
> Additionally, these benchmarks take too long to run:
> * {{BTreeUpdateBench}} ~ 58 hours
> * {{AtomicBTreePartitionUpdateBench}} ~ 5 hours
> * {{BTreeTransformBench}} ~ 2.5 hours
> Here is the complete list of estimated benchmark times:
> {noformat}
> Estimated time for CacheLoaderBench: ~5 s
> Estimated time for LatencyTrackingBench: ~26 s
> Estimated time for SampleBench: ~30 s
> Estimated time for ReadWriteBench: ~30 s
> Estimated time for MutationBench: ~30 s
> Estimated time for CompactionBench: ~35 s
> Estimated time for DiagnosticEventPersistenceBench: ~40 s
> Estimated time for ZeroCopyStreamingBench: ~44 s
> Estimated time for BatchStatementBench: ~110 s
> Estimated time for DiagnosticEventServiceBench: ~120 s
> Estimated time for MessageOutBench: ~144 s
> Estimated time for BloomFilterSerializerBench: ~144 s
> Estimated time for FastThreadLocalBench: ~156 s
> Estimated time for HashingBench: ~156 s
> Estimated time for ChecksumBench: ~208 s
> Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
> Estimated time for PendingRangesBench: ~ 5 m
> Estimated time for DirectorySizerBench: ~ 5 m
> Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
> Estimated time for PreaggregatedByteBufsBench: ~ 7 m
> Estimated time for AutoBoxingBench: ~ 8 m
> Estimated time for OutputStreamBench: ~ 13 m
> Estimated time for BTreeBuildBench: ~ 13 m
> Estimated time for StringsEncodeBench: ~ 20 m
> Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
> Estimated time for btree.BTreeBuildBench: ~ 30 m
> Estimated time for BTreeSearchIteratorBench: ~ 31 m
> Estimated time for btree.BTreeTransformBench: ~ 138 m
> Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
> Estimated time for btree.BTreeUpdateBench: ~58 h
> Total estimated time: ~69 h
> {noformat}
> I'd like to add a test which estimates the benchmark times and fails if any 
> single benchmark's estimated run time is longer than xxx minutes.
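
One way such an estimate could be computed is from the JMH run settings: time per fork (warmup plus measurement iterations) multiplied by forks, benchmark methods and parameter combinations. The sketch below assumes that formula; it is not taken from the attached BenchTimeTest.java, and all names are hypothetical.

```java
// Hypothetical estimator for illustration only; the formula is an assumption.
final class BenchTimeEstimate
{
    /** Estimated wall-clock seconds for one benchmark class. */
    static long estimateSeconds(int forks,
                                int warmupIterations, int warmupSeconds,
                                int measurementIterations, int measurementSeconds,
                                int benchmarkMethods, int parameterCombinations)
    {
        long perFork = (long) warmupIterations * warmupSeconds
                     + (long) measurementIterations * measurementSeconds;
        return perFork * forks * benchmarkMethods * parameterCombinations;
    }

    /** True if the estimate is above the allowed limit, given in minutes. */
    static boolean exceedsLimit(long estimatedSeconds, long limitMinutes)
    {
        return estimatedSeconds > limitMinutes * 60;
    }
}
```

The multiplication by parameter combinations is what makes heavily parameterized benchmarks grow so quickly: every additional value of one parameter adds a full set of forks and iterations for each combination of the others.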






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-09-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767594#comment-17767594
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

Absolutely.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Branimir Lambov
>Priority: Normal
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to 
> {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, 
> {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense 
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, 
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet 
> support per-table format selection/configuration.
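
"Deprecated but still picked up if defined" could be implemented as a lookup that prefers the new format-level option and falls back to the old top-level setting. A minimal sketch, assuming both sources are available as simple string maps; FormatOptionLookup and resolve are hypothetical names, and the values used with them are placeholders, not Cassandra defaults.

```java
import java.util.Map;

// Hypothetical lookup for illustration only; not Cassandra's configuration code.
final class FormatOptionLookup
{
    /**
     * Resolves an option by its new name in the format configuration, falling back to the
     * deprecated top-level yaml name, and finally to the supplied default.
     */
    static String resolve(Map<String, String> formatOptions,
                          Map<String, String> legacyYaml,
                          String newName,
                          String legacyName,
                          String defaultValue)
    {
        String value = formatOptions.get(newName);
        if (value != null)
            return value;

        value = legacyYaml.get(legacyName);
        if (value != null)
            return value; // a real implementation would probably log a deprecation warning here

        return defaultValue;
    }
}
```

For the rename mentioned above, the call would pass "row_index_granularity" as the new name and "column_index_size" as the legacy name.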






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

  Fix Version/s: 3.11.17
 4.0.12
 4.1.4
 5.0-alpha2
 5.1
Source Control Link: https://github.com/apache/cassandra/pull/2656
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed 
([3.11|https://github.com/apache/cassandra/commit/87c2af85c1305c130af7d66f83dec03a1c4a8bb2]
 
[4.0|https://github.com/apache/cassandra/commit/c6385ac3ddccabdc7cb650b090fa69c0523274e8]
 
[4.1|https://github.com/apache/cassandra/commit/db6641fbb6fd0c439e14f94caecdeee999311c62]
 
[5.0|https://github.com/apache/cassandra/commit/a23f4c0b15c684240ef0bcd55875610e8bd7179b]
 
[trunk|https://github.com/apache/cassandra/commit/970ec2d1db5770c13a42e1f2862ea398317d0f15])

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 3.11.17, 4.0.12, 4.1.4, 5.0-alpha2, 5.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to many, and 
> likely _all_, other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.
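
The ordering problem described above is a general Java pitfall: a superclass constructor runs before the subclass constructor body, so any subclass field it consults (directly or through an overridden method) still holds its default value. The sketch below reproduces that pattern with hypothetical classes; it is not the actual CompactionController code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration only; not Cassandra's controllers.
class BaseController
{
    final List<String> overlaps;

    BaseController(List<String> candidates)
    {
        // Runs before the subclass constructor body, so ignoreOverlaps() sees a default field value.
        overlaps = ignoreOverlaps() ? new ArrayList<>() : new ArrayList<>(candidates);
    }

    boolean ignoreOverlaps()
    {
        return false;
    }
}

class LateFlagController extends BaseController
{
    private final boolean ignoreOverlaps;

    LateFlagController(List<String> candidates, boolean ignoreOverlaps)
    {
        super(candidates);                    // overlap list is built here
        this.ignoreOverlaps = ignoreOverlaps; // assigned only after the base class has used it
    }

    @Override
    boolean ignoreOverlaps()
    {
        return ignoreOverlaps;
    }
}
```

Constructing LateFlagController with ignoreOverlaps set to true still leaves overlaps populated with every candidate, even though ignoreOverlaps() returns true afterwards. Typical remedies are to pass the flag to the base constructor or to build the overlap structure lazily.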






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Review In Progress  (was: Needs Committer)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Ready to Commit  (was: Review In Progress)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Needs Committer  (was: Patch Available)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Patch Available  (was: Requires Testing)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Requires Testing  (was: Review In Progress)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Reviewers: Branimir Lambov, Michael Semb Wever  (was: Michael Semb Wever)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Reviewers: Branimir Lambov, Michael Semb Wever, Branimir Lambov  (was: 
Branimir Lambov, Michael Semb Wever)
   Status: Review In Progress  (was: Patch Available)




[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Test and Documentation Plan: CI
 Status: Patch Available  (was: In Progress)




[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements

2023-09-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504
 ] 

Branimir Lambov edited comment on CASSANDRA-18871 at 9/21/23 10:47 AM:
---

Yes, the parameter passing works great for me now.

Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.


was (Author: blambov):
Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.

> JMH benchmark improvements
> --
>
> Key: CASSANDRA-18871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build, Legacy/Tools
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1
>
>
> 1. CASSANDRA-12586 introduced the {{build-jmh}} task, which builds an uber jar 
> for JMH benchmarks. That jar is not used by the {{ant microbench}} task, 
> though it is used by the {{test/bin/jmh}} script. 
> In fact, I have no idea why we should use an uber jar if JMH runs perfectly 
> well with a regular classpath. Maybe that had something to do with the older 
> JMH version in use at that time. Building uber jars takes time and is 
> annoying. Since it seems to be redundant anyway, I'm going to remove it and 
> fix {{test/bin/jmh}} to use a regular classpath. 
> 2. I'll add support for async profiler in benchmarks. That is, the 
> {{microbench}} target automatically fetches the async profiler binaries and 
> adds the necessary args for JMH ({{-prof async...}} in particular) whenever we 
> run the {{microbench-with-profiler}} task. If no additional properties are 
> provided, some default options will be applied (defined in the script, can be 
> negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property 
> will be added as profiler options after the library path and target directory 
> definition.
> 3. If someone wants to see any additional improvements, please comment on the 
> ticket.






[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements

2023-09-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504
 ] 

Branimir Lambov commented on CASSANDRA-18871:
-

Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine.




[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements

2023-09-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767113#comment-17767113
 ] 

Branimir Lambov commented on CASSANDRA-18871:
-

Can one specify a specific benchmark to run? Would it be too hard to also add 
other parameters?




[jira] [Commented] (CASSANDRA-15284) AssertionError while scrubbing sstable

2023-09-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765693#comment-17765693
 ] 

Branimir Lambov commented on CASSANDRA-15284:
-

Thank you for the excellent investigation and write-up. It does look like there 
are many copies of the same position info in the writer, and that they aren't 
correctly handled. In addition to what you describe, {{padToPageBoundary}} 
isn't adjusting the compressed size, which can cause the same problem.

I agree we need to get rid of the extras: {{lastFlushOffset}} and 
{{compressedSize}} should both be eliminated, their usages replaced with 
{{bufferOffset}}, and {{compressedSize}} should be taken from {{chunkOffset}}.

Dealing with the broken full CRC is a bit more involved, but we could flag if 
{{resetAndTruncate}} needed to back through to a different chunk, and do a full 
pass over the file to recalculate it on completion if flagged.
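
The "full pass over the file" fallback could look roughly like the sketch below, which streams a file through java.util.zip.CRC32. It assumes the file-level digest is a plain CRC32 over the file bytes, which the comment above does not state; FullFileChecksum and crc32Of are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; the digest algorithm is an assumption.
final class FullFileChecksum
{
    /** Recomputes a CRC32 over the whole file by reading it once from start to end. */
    static long crc32Of(Path file) throws IOException
    {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file))
        {
            int read;
            while ((read = in.read(buffer)) != -1)
                crc.update(buffer, 0, read);
        }
        return crc.getValue();
    }
}
```

Because the extra read is only needed when a rewind crossed a chunk boundary, the flag suggested above would keep the common path free of this cost.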

> AssertionError while scrubbing sstable
> --
>
> Key: CASSANDRA-15284
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15284
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Compression
>Reporter: Gianluigi Tiesi
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.1.x
>
> Attachments: assert-comp-meta.diff
>
>
> I've got a damaged data file but while trying to run scrub (online or 
> offline) I always get this
> error:
>  
> {code:java}
> -- StackTrace --
> java.lang.AssertionError
> at org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:474)
> at org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:239)
> at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:163)
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
> at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
> at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362)
> at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:331)
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinal(BigTableWriter.java:336)
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinalEarly(BigTableWriter.java:318)
> at org.apache.cassandra.io.sstable.SSTableRewriter.switchWriter(SSTableRewriter.java:322)
> at org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:370)
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:184)
> at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:357)
> at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:291)
> at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:1010)
> at org.apache.cassandra.db.compaction.CompactionManager.access$200(CompactionManager.java:83)
> at org.apache.cassandra.db.compaction.CompactionManager$3.execute(CompactionManager.java:391)
> at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:312)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> At the moment I've moved the corrupted file away. If you need more info, feel 
> free to ask.
>   
> According to the source 
> [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java#L474]
> it looks like the requested chunk length is <= 0.






[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761375#comment-17761375
 ] 

Branimir Lambov commented on CASSANDRA-18756:
-

There's a risk that making it work as intended will unexpectedly change 
behaviour for people that are already using the flag, and I would rather not do 
that. Especially if we change it in patch releases for all supported versions.

If we are to enable that functionality, IMHO it should be under a different 
flag (and then only for a subset of versions).

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to many, and 
> likely _all_, other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.
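
The initialization-order pitfall described above can be reproduced in a few lines of plain Java. This is only a sketch of the pattern: {{BaseController}} and {{LateFlagController}} are hypothetical stand-ins for the classes named in the ticket, and a list of strings stands in for the overlap iterator.

```java
import java.util.List;

// Hypothetical stand-in for the base controller: it builds its overlap set in the constructor.
class BaseController {
    final List<String> overlapping;

    BaseController(List<String> allSSTables) {
        // Runs before any subclass field has been assigned.
        overlapping = ignoreOverlaps() ? List.of() : List.copyOf(allSSTables);
    }

    boolean ignoreOverlaps() { return false; }
}

// Hypothetical stand-in for the subclass: the flag is only set after super(...) returns.
class LateFlagController extends BaseController {
    private final boolean ignoreOverlaps;

    LateFlagController(List<String> allSSTables, boolean ignoreOverlaps) {
        super(allSSTables);                   // sees ignoreOverlaps == false (the default value)
        this.ignoreOverlaps = ignoreOverlaps; // too late to influence the base constructor
    }

    @Override
    boolean ignoreOverlaps() { return ignoreOverlaps; }

    public static void main(String[] args) {
        LateFlagController c = new LateFlagController(List.of("sstable-1", "sstable-2"), true);
        // The flag reads true afterwards, yet the overlap set was still built.
        System.out.println(c.ignoreOverlaps() + " " + c.overlapping);
    }
}
```

In this model the overlap set is built once with the default flag value and, if nothing refreshes it later, stays that way for the lifetime of the object.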






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-01 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Open  (was: Triage Needed)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to many, and 
> likely _all_, other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761314#comment-17761314
 ] 

Branimir Lambov commented on CASSANDRA-18756:
-

[~mck], could you do the second review of this small patch, which corrects a 
problem with CASSANDRA-13418? Your name came up as a reviewer for that ticket 
and it would be great to get the opinion of someone who has some context on it.

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to many, and 
> likely _all_, other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table

2023-09-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761307#comment-17761307
 ] 

Branimir Lambov commented on CASSANDRA-18534:
-

We can't remove the existing option, only deprecate it (i.e. give a warning 
that it may be removed in a later version). We also have to honor the value if 
it is present.

I agree that we should throw an exception if both versions are given in the 
DDL. The complication is what happens if the format-side option is given in the 
yaml: in this case I think we should let the table-side option override it even 
if it is given in the legacy way (with perhaps a deprecation warning).
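
One possible reading of that precedence, sketched in plain Java. The class {{FormatOptionResolutionSketch}}, the {{resolve}} signature and the use of nullable {{Double}} values are all hypothetical; the real schema and config code is shaped differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: which value wins when an option can come from three places.
class FormatOptionResolutionSketch {
    final List<String> warnings = new ArrayList<>();

    // Each argument is null when the option was not given in that place.
    Double resolve(Double legacyTableValue, Double formatTableValue, Double yamlValue) {
        if (legacyTableValue != null && formatTableValue != null)
            throw new IllegalArgumentException("option given both as a table property and as a format property");
        if (formatTableValue != null)
            return formatTableValue;
        if (legacyTableValue != null) {
            // Deprecated, but still honoured, and it overrides the yaml definition.
            warnings.add("legacy table property is deprecated and may be removed in a later version");
            return legacyTableValue;
        }
        return yamlValue; // may be null: the caller falls back to its built-in default
    }

    public static void main(String[] args) {
        FormatOptionResolutionSketch r = new FormatOptionResolutionSketch();
        System.out.println(r.resolve(0.05, null, 0.01) + " " + r.warnings);
    }
}
```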

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}






[jira] [Commented] (CASSANDRA-16333) Document provide_overlapping_tombstones compaction option

2023-08-31 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760978#comment-17760978
 ] 

Branimir Lambov commented on CASSANDRA-16333:
-

I don't think this describes it correctly.

When compacting data, it will check if a row is deleted by some tombstone in a 
newer table that does not participate in the compaction. If it is, it will drop 
the row from the result. If this manages to result in the partition completely 
being removed from the result of the compaction, this will make the tombstone 
that deletes the row purgeable.
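
As a rough illustration of the row-dropping step only (not the purgeability logic), here is a toy model in plain Java. {{OverlappingTombstoneSketch}}, its {{Row}} class and the map of tombstones are hypothetical; real rows, tombstones and sstables are far richer. The sketch assumes a tombstone shadows any row whose timestamp is not newer than the deletion time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model: rows being compacted, plus tombstones from newer sstables outside the compaction.
class OverlappingTombstoneSketch {
    static final class Row {
        final String key;
        final long timestamp;
        Row(String key, long timestamp) { this.key = key; this.timestamp = timestamp; }
        @Override public String toString() { return key + "@" + timestamp; }
    }

    // newerTombstones maps a row key to the deletion timestamp found in a non-participating sstable.
    static List<Row> compact(List<Row> rows, Map<String, Long> newerTombstones) {
        List<Row> result = new ArrayList<>();
        for (Row row : rows) {
            Long deletedAt = newerTombstones.get(row.key);
            boolean shadowed = deletedAt != null && row.timestamp <= deletedAt;
            if (!shadowed)
                result.add(row); // only rows that are still live reach the output
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 10), new Row("b", 10), new Row("c", 30));
        System.out.println(compact(rows, Map.of("a", 20L, "c", 20L)));
    }
}
```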

To be honest, I had hopes that this can help solve the tombstones problem when 
I wrote the code, but now I'm not very confident that it is worth using. When 
faced with accumulation of tombstones, performing a major compaction or 
switching to LCS or levelled UCS is a better option. Ultimately the problem 
should be solved by something that removes the tombstone factor from queries, 
and we have something in mind as the long-term solution.

> Document provide_overlapping_tombstones compaction option
> -
>
> Key: CASSANDRA-16333
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16333
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Paulo Motta
>Assignee: Sumanth Pasupuleti
>Priority: Normal
>
> This option was added on CASSANDRA-7019 but it's not documented. We should 
> add it to 
> https://cassandra.apache.org/doc/latest/operating/compaction/index.html.






[jira] [Commented] (CASSANDRA-18773) Compactions are slow

2023-08-29 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759864#comment-17759864
 ] 

Branimir Lambov commented on CASSANDRA-18773:
-

{quote}This does copying of the rows into memory to pass across to the writer
{quote}
Partitions can have millions of rows, we cannot afford to copy something this 
big to memory. The current materialization point is the row and even that is a 
problem for some use cases.

Processing the read in another thread adds a lot of synchronization overhead, 
changes the balance between request serving and compaction, and in a busy node 
can actually result in compaction failing to make progress. We have instead 
been focusing on making compaction more parallelizable (see e.g. 
UCS/CEP-26/CASSANDRA-18397). UCS can already parallelize major compactions when 
there is no overlap, and we can do even better if we provide the split points 
to the compaction process in advance (see CASSANDRA-18802).

> Compactions are slow
> 
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Cameron Zemek
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png, 
> stress.yaml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow 
> (for example major compactions). I have attached a cassandra stress profile 
> that can generate such a dataset under ccm. In my local test I have 2567 
> sstables at 4Mb each.
> I added code to track the wall clock time of various parts of the code. One 
> problematic part is the ManyToOne constructor. Tracing through the code shows 
> that a ManyToOne is created over all the sstable iterators for every 
> partition. In my local test I get a measly 60Kb/sec read speed, bottlenecked 
> on a single CPU core (since this code is single threaded), with 85% 
> of the wall clock time spent in the ManyToOne constructor.
> As another datapoint showing that it's the merge iterator part of the code: using 
> the cfstats from [https://github.com/instaclustr/cassandra-sstable-tools/], 
> which reads all the sstables but does no merging, I get 26Mb/sec read speed.
> Tracking back from ManyToOne call I see this in 
> UnfilteredPartitionIterators::merge
> {code:java}
> for (int i = 0; i < toMerge.size(); i++)
> {
>     if (toMerge.get(i) == null)
>     {
>         if (null == empty)
>             empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
>         toMerge.set(i, empty);
>     }
> }
>  {code}
> I'm not sure what the purpose of creating these empty rows is. But on a whim I 
> removed all these empty iterators before passing the list to ManyToOne, and then all 
> the wall clock time shifted to CompactionIterator::hasNext() and read speed 
> increased to 1.5Mb/s.
> So there seem to be further bottlenecks in this code path, but the first is 
> this ManyToOne and having to build it for every partition read.






[jira] [Created] (CASSANDRA-18802) Extend compaction interfaces to provide split points at operation start

2023-08-29 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18802:
---

 Summary: Extend compaction interfaces to provide split points at 
operation start
 Key: CASSANDRA-18802
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18802
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Compaction
Reporter: Branimir Lambov


The current compaction interfaces allow a compaction strategy to split at 
arbitrary points while it is writing output. In some cases (e.g. UCS) we know 
in advance where we want to split. Giving this information before the operation 
starts allows it to operate on multiple segments of the output in parallel, 
i.e. parallelize within an operation rather than between operations, which can 
reduce individual operations' duration and significantly improve the DB's 
chances of keeping up with load, especially on L0.
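
To illustrate why knowing the split points up front helps, here is a small self-contained sketch in plain Java. It is not the proposed interface: {{SplitPointSketch}}, integer keys, and de-duplication standing in for the actual merge work are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one operation, split into output segments that are processed in parallel.
class SplitPointSketch {
    // Segment i holds the keys below splitPoints[i] and not below the previous split point.
    static List<List<Integer>> split(List<Integer> sortedKeys, List<Integer> splitPoints) {
        List<List<Integer>> segments = new ArrayList<>();
        int from = 0;
        for (int boundary : splitPoints) {
            int to = from;
            while (to < sortedKeys.size() && sortedKeys.get(to) < boundary)
                to++;
            segments.add(sortedKeys.subList(from, to));
            from = to;
        }
        segments.add(sortedKeys.subList(from, sortedKeys.size()));
        return segments;
    }

    static List<List<Integer>> processInParallel(List<Integer> sortedKeys, List<Integer> splitPoints) {
        List<Callable<List<Integer>>> tasks = new ArrayList<>();
        for (List<Integer> segment : split(sortedKeys, splitPoints))
            tasks.add(() -> new ArrayList<Integer>(new TreeSet<Integer>(segment))); // stand-in for the merge work
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            List<List<Integer>> outputs = new ArrayList<>();
            for (Future<List<Integer>> result : pool.invokeAll(tasks)) // results come back in task order
                outputs.add(result.get());
            return outputs;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processInParallel(List.of(1, 1, 2, 5, 5, 7, 9), List.of(3, 8)));
    }
}
```

Because the boundaries are fixed before any output is written, the segments are independent and no coordination is needed between the workers beyond collecting their results.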






[jira] [Comment Edited] (CASSANDRA-18773) Compactions are slow

2023-08-28 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759595#comment-17759595
 ] 

Branimir Lambov edited comment on CASSANDRA-18773 at 8/28/23 2:11 PM:
--

The crux of the issue here is likely in this comment in 
{{UnfilteredPartitionIterators.merge}}:
{code:java}
// Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
// Non-present iterator will thus be set to empty in getReduced.
{code}
The need for this is, indeed, a serious problem in merges of many sstables in 
key-value tables (i.e. ones containing only one row per partition) that we have 
not yet tried to address.

I expect that simply changing the code of {{merge}} to only present partitions 
selected by the partition merger, for example by doing
{code:java}
diff --git a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java
--- a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (revision 052a26474108febad545d6528bb203ecf19b22e5)
+++ b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (date 1693230551402)
@@ -113,16 +113,11 @@
 private final List<UnfilteredRowIterator> toMerge = new ArrayList<>(iterators.size());
 
 private DecoratedKey partitionKey;
-private boolean isReverseOrder;
 
 public void reduce(int idx, UnfilteredRowIterator current)
 {
     partitionKey = current.partitionKey();
-    isReverseOrder = current.isReverseOrder();
-
-    // Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
-    // Non-present iterator will thus be set to empty in getReduced.
-    toMerge.set(idx, current);
+    toMerge.add(current);
 }
 
 @SuppressWarnings("resource")
@@ -132,20 +127,6 @@
         ? null
         : listener.getRowMergeListener(partitionKey, toMerge);
 
-    // Make a single empty iterator object to merge, we don't need toMerge.size() copiess
-    UnfilteredRowIterator empty = null;
-
-    // Replace nulls by empty iterators
-    for (int i = 0; i < toMerge.size(); i++)
-    {
-        if (toMerge.get(i) == null)
-        {
-            if (null == empty)
-                empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
-            toMerge.set(i, empty);
-        }
-    }
-
     return UnfilteredRowIterators.merge(toMerge, rowListener);
 }
{code}
will give a higher performance boost. The problem is that doing this causes the 
merge listener (which can be e.g. a secondary index implementation) to lose 
information about the sources of a merged value. At some point this was crucial 
for secondary indexes, but I'm not sure it still is, and I don't think anyone 
has invested the time to understand whether it is still necessary in Cassandra 
4. It's even less likely to matter for Cassandra 5, whose storage-attached 
indexes don't need merge listeners.

Nevertheless, unless we have done this investigation, this behaviour needs to 
be preserved, but I believe we can still get an improvement for the cases where 
there is no index. To get a more complete solution to the problem, how about 
changing the {{merge}} code to get the {{rowListener}} on the first call to 
{{reduce}}, and switch between the two methods of constructing the 
{{toMerge}} list based on whether or not it is {{null}}?
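
A rough sketch of that switch, in plain Java. It is deliberately detached from the real classes: {{ToMergeSketch}} is hypothetical, strings stand in for row iterators, a {{Supplier}} stands in for obtaining the row listener, and {{"<empty>"}} stands in for the shared empty iterator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical reducer: index-preserving when a listener exists, compact otherwise.
class ToMergeSketch {
    private final int sourceCount;
    private final Supplier<Object> listenerSupplier; // may return null: no listener is interested
    private List<String> toMerge;
    private Object rowListener;
    private boolean started = false;

    ToMergeSketch(int sourceCount, Supplier<Object> listenerSupplier) {
        this.sourceCount = sourceCount;
        this.listenerSupplier = listenerSupplier;
    }

    void reduce(int idx, String current) {
        if (!started) {
            // First call for this partition: fetch the listener and pick the list layout.
            started = true;
            rowListener = listenerSupplier.get();
            if (rowListener != null)
                toMerge = new ArrayList<>(Collections.nCopies(sourceCount, (String) null));
            else
                toMerge = new ArrayList<>();
        }
        if (rowListener != null)
            toMerge.set(idx, current);   // keep source positions for the listener
        else
            toMerge.add(current);        // only the sources that actually have the partition
    }

    List<String> getReduced() {
        if (rowListener != null)
            toMerge.replaceAll(it -> it == null ? "<empty>" : it);
        List<String> result = toMerge;
        started = false;                 // ready for the next partition
        toMerge = null;
        return result;
    }

    public static void main(String[] args) {
        ToMergeSketch noListener = new ToMergeSketch(1000, () -> null);
        noListener.reduce(742, "rows-from-sstable-742");
        System.out.println(noListener.getReduced().size()); // one entry instead of 1000
    }
}
```

With no listener, the list handed to the row merger has one entry per source that actually contains the partition, which is the case that matters for key-value tables spread over many sstables.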


[jira] [Commented] (CASSANDRA-18773) Compactions are slow

2023-08-28 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759595#comment-17759595
 ] 

Branimir Lambov commented on CASSANDRA-18773:
-

The crux of the issue here is likely in this comment in 
{{UnfilteredPartitionIterators.merge}}:
{code:java}
// Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
// Non-present iterator will thus be set to empty in getReduced.
{code}
The need for this is, indeed, a serious problem in merges of many sstables in 
key-value tables (i.e. ones containing only one row per partition) that we have 
not yet tried to address.

I expect that simply changing the code of {{merge}} to only present partitions 
selected by the partition merger, for example by doing
{code:java}
diff --git a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java
--- a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (revision 052a26474108febad545d6528bb203ecf19b22e5)
+++ b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (date 1693230551402)
@@ -113,16 +113,11 @@
 private final List<UnfilteredRowIterator> toMerge = new ArrayList<>(iterators.size());
 
 private DecoratedKey partitionKey;
-private boolean isReverseOrder;
 
 public void reduce(int idx, UnfilteredRowIterator current)
 {
     partitionKey = current.partitionKey();
-    isReverseOrder = current.isReverseOrder();
-
-    // Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
-    // Non-present iterator will thus be set to empty in getReduced.
-    toMerge.set(idx, current);
+    toMerge.add(current);
 }
 
 @SuppressWarnings("resource")
@@ -132,20 +127,6 @@
         ? null
         : listener.getRowMergeListener(partitionKey, toMerge);
 
-    // Make a single empty iterator object to merge, we don't need toMerge.size() copiess
-    UnfilteredRowIterator empty = null;
-
-    // Replace nulls by empty iterators
-    for (int i = 0; i < toMerge.size(); i++)
-    {
-        if (toMerge.get(i) == null)
-        {
-            if (null == empty)
-                empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
-            toMerge.set(i, empty);
-        }
-    }
-
     return UnfilteredRowIterators.merge(toMerge, rowListener);
 }
{code}
will give a higher performance boost. The problem is that doing this causes the 
merge listener (which can be e.g. a secondary index implementation) to lose 
information about the sources of a merged value. At some point this was crucial 
for secondary indexes, but I'm not sure it still is, and I don't think anyone 
has invested the time to understand whether it is still necessary in Cassandra 
4. It's even less likely to matter for Cassandra 5, whose storage-attached 
indexes don't need merge listeners.

To get a more complete solution to the problem, how about changing the 
{{merge}} code to get the {{rowListener}} on the first call to {{reduce}}, and 
switch between the two methods of constructing the {{toMerge}} list based on 
whether or not it is {{null}}?

> Compactions are slow
> 
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Cameron Zemek
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: compact-poc.patch, flamegraph.png, stress.yaml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow 
> (for example major compactions). I have attached a cassandra stress profile 
> that can generate such a dataset under ccm. In my local test I have 2567 
> sstables at 4Mb each.
> I added code to track the wall clock time of various parts of the code. One 
> problematic part is the ManyToOne constructor. Tracing through the code shows 
> that a ManyToOne is created over all the sstable iterators for every 
> partition. In my local test I get a measly 60Kb/sec read speed, bottlenecked 
> on a single CPU core (since this code is single threaded), with 85% 
> of the wall clock time spent in the ManyToOne constructor.
> As 

[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-08-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759002#comment-17759002
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

This kind of flakiness usually comes from truncation of a table from a 
different concurrently completed test causing a flush of the whole keyspace.

The {{KEYSPACE_PER_TEST}} changes fix this, and I must have done a repeated run 
on CircleCI to verify (I don't remember if I really did at the time).

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.x
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but was:<1367.83970468544>
>  at org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table

2023-08-23 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758137#comment-17758137
 ] 

Branimir Lambov commented on CASSANDRA-18534:
-

bq.  Is there a reason we're relying primarily on the .yaml rather than a 
system_distributed table accessible via vtables for aliased groups of sstable 
configuration params?

Because it is a good idea to permit the configuration to change between nodes. 
E.g. if we want to do a gradual rollout of something, or test it out on just 
one node, or compare performance, or because we have a heterogeneous cluster 
and we want to have a different value for some parameter on some nodes.

Granted, these could be given as overrides, but we are getting into the 
territory of a very complex solution for a simple problem, especially 
considering the dance that has to be performed to initialize a storage format 
from the data in a table whose storage format needs to be initialized too.

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}






[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table

2023-08-17 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755595#comment-17755595
 ] 

Branimir Lambov commented on CASSANDRA-18534:
-

No such validation is needed. When we receive an {{ALTER/CREATE}} statement we 
validate that it starts with a known format name, that the specific variation 
is specified in the yaml, and that the definition in the yaml is 
understood/validated by the format.
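
The three checks could look roughly like this. The sketch is hypothetical throughout: {{FormatNameValidationSketch}}, the {{format-variation}} naming with a dash separator (taken only from the examples in the ticket description), and the plain maps standing in for the parsed yaml are assumptions, not the actual schema code.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of validating the sstable_format value of an ALTER/CREATE statement.
class FormatNameValidationSketch {
    static Map<String, String> validate(String requested,
                                        Set<String> knownFormats,
                                        Map<String, Map<String, String>> yamlVariations,
                                        Set<String> optionsUnderstoodByFormat) {
        // 1. The name must start with a known format name.
        int dash = requested.indexOf('-');
        String base = dash < 0 ? requested : requested.substring(0, dash);
        if (!knownFormats.contains(base))
            throw new IllegalArgumentException("unknown sstable format: " + base);
        if (dash < 0)
            return Map.of(); // plain format name, no variation to look up

        // 2. The specific variation must be defined in the yaml.
        Map<String, String> definition = yamlVariations.get(requested);
        if (definition == null)
            throw new IllegalArgumentException("variation not defined in yaml: " + requested);

        // 3. The format must understand every option in the definition.
        for (String option : definition.keySet())
            if (!optionsUnderstoodByFormat.contains(option))
                throw new IllegalArgumentException("option not understood by " + base + ": " + option);
        return definition;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> yaml =
            Map.of("bti-fast", Map.of("row_index_granularity", "1kiB", "bloom_filter_fp_chance", "0.01"));
        Set<String> understood = Set.of("row_index_granularity", "bloom_filter_fp_chance");
        System.out.println(validate("bti-fast", Set.of("big", "bti"), yaml, understood));
    }
}
```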

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}






[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlapping SSTable references

2023-08-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754645#comment-17754645
 ] 

Branimir Lambov commented on CASSANDRA-18756:
-

Well, things are a bit more complex. Because the iterator is in fact created, 
the {{ignoreOverlaps}} flag does not have its intended effect of ignoring older 
sstables that may hold data shadowed by a tombstone. This behaviour makes the 
option much safer than it would otherwise be, and it cannot be changed without 
risking unexpected data resurrection.

Because of this, a fix for this issue should drop the overlap-iterator part 
from the intended meaning of {{unsafe_aggressive_sstable_expiration}} rather 
than make that part work as originally intended.

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlapping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Priority: Normal
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to many, and 
> likely _all_, other SSTables on the node, and thus delays the deletion of 
> obsolete ones by hours or even days.
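
The constructor-ordering problem described above can be reproduced in 
isolation. The following is a minimal sketch, assuming a base class that 
consults the flag through an overridable method while it is being constructed. 
{{BaseController}} and {{LateFlagController}} are hypothetical stand-ins and do 
not mirror the real {{CompactionController}} code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not the real Cassandra classes.
class BaseController
{
    // Stands in for the references held by the overlap iterator.
    final List<String> overlapRefs = new ArrayList<>();

    BaseController()
    {
        // Runs before the subclass constructor body has assigned its fields.
        if (!ignoreOverlaps())
            overlapRefs.add("overlap-iterator");
    }

    protected boolean ignoreOverlaps()
    {
        return false;
    }
}

class LateFlagController extends BaseController
{
    private final boolean ignoreOverlaps;

    LateFlagController(boolean ignoreOverlaps)
    {
        super();                              // the base class still sees the default, false
        this.ignoreOverlaps = ignoreOverlaps; // set too late to affect the base constructor
    }

    @Override
    protected boolean ignoreOverlaps()
    {
        return ignoreOverlaps;
    }
}
```

In this sketch, {{new LateFlagController(true)}} still ends up with one entry in 
{{overlapRefs}}, because the base constructor ran while the field held its 
default value.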






[jira] [Updated] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-08-15 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18757:

 Bug Category: Parent values: Degradation(12984); Level 1 values: Resource Management(12995)
   Complexity: Low Hanging Fruit
Discovered By: Code Inspection
Since Version: 5.0-alpha1

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Priority: Normal
>
> {code:java}
> super(cfs, txn, gcBefore, strategy.getController().getIgnoreOverlapsInExpirationCheck());
> {code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
> public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.
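
A reduced illustration of why the compiler does not catch this: both values are 
plain booleans, so an unrelated flag binds to the {{keepOriginals}} position 
without complaint. {{BaseTask}} and {{MixedUpTask}} below are hypothetical 
stand-ins with shortened parameter lists, not the real task classes.

```java
// Hypothetical stand-ins; not the real Cassandra classes.
class BaseTask
{
    final long gcBefore;
    final boolean keepOriginals;

    BaseTask(long gcBefore, boolean keepOriginals)
    {
        this.gcBefore = gcBefore;
        this.keepOriginals = keepOriginals;
    }
}

class MixedUpTask extends BaseTask
{
    MixedUpTask(long gcBefore, boolean ignoreOverlapsInExpirationCheck)
    {
        // Compiles because both are booleans, but the value carries a different meaning.
        super(gcBefore, ignoreOverlapsInExpirationCheck);
    }
}
```

Here {{new MixedUpTask(0L, true).keepOriginals}} is true even though the caller 
only asked to ignore overlaps in the expiration check.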





