[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18753: Source Control Link: https://github.com/apache/cassandra/pull/2896 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 12h 20m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users who want > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find an easy way of choosing between "optimized" > and "compatible" defaults. At a minimum, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829805#comment-17829805 ] Branimir Lambov commented on CASSANDRA-19471: - They are only for the IAE, which is a serious issue and IMHO a blocker for 5.0. I have not investigated the commitlog being written with durable writes off, which is a much more benign issue. It is likely caused by the preparation of the direct I/O segments writing and flushing the header and first sync marker in advance of any use of the segment. > Commitlog with direct io fails test_change_durable_writes > - > > Key: CASSANDRA-19471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19471 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0-rc, 5.x > > > With the commitlog_disk_access_mode set to direct, and the improved > configuration_test.py::TestConfiguration::test_change_durable_writes from > CASSANDRA-19465, this fails with either: > {noformat} > AssertionError: Commitlog was written with durable writes disabled > {noformat} > Or what appears to be the original exception reported in CASSANDRA-19465: > {noformat} > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 > StorageService.java:631 - Stopping native transport > node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 > StorageProxy.java:1670 - Failed to apply mutation locally : > java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576) > at java.base/java.nio.Buffer.createPositionException(Buffer.java:341) > at java.base/java.nio.Buffer.position(Buffer.java:316) > at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73) > at > 
org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216) > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52) > at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53) > at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612) > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:244) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:264) > at > org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664) > at > org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624) > at > org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 > StorageService.java:636 - Stopping gossiper > {noformat}
[jira] [Comment Edited] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828364#comment-17828364 ] Branimir Lambov edited comment on CASSANDRA-19471 at 3/19/24 2:43 PM: -- I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which {{endOfBuffer}} is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply {{slice}}. We probably also want the aligning path above it to go through this slicing to set the capacity appropriately. I'd also change the assertions that follow to make sure the limit and capacity of the prepared buffer match, and are equal to the segment size. was (Author: blambov): I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which `endOfBuffer` is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply `slice`. 
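To illustrate the limit/capacity mismatch described in the comment, here is a minimal standalone sketch using only {{java.nio.ByteBuffer}}. It is not the Cassandra code: the class name, the method names and the sizes (a 1024-byte "segment" inside a 1536-byte allocation) are assumptions made up for the example. It only shows that slicing before setting the limit leaves a buffer whose capacity exceeds its limit, so an end-of-buffer value derived from the capacity can point past the limit and {{position()}} then throws {{IllegalArgumentException}}, while setting the limit first and then slicing makes the two agree.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Cassandra code: names and sizes are illustrative only.
class BufferLimitSketch {
    static final int SEGMENT_SIZE = 1024;        // assumed segment size
    static final int ALLOCATED = 1024 + 512;     // assumed over-allocation, e.g. for alignment

    // Slice first, then set the limit: capacity stays at ALLOCATED, limit is SEGMENT_SIZE.
    static ByteBuffer sliceThenLimit() {
        ByteBuffer raw = ByteBuffer.allocate(ALLOCATED);
        ByteBuffer sliced = raw.slice();
        sliced.limit(SEGMENT_SIZE);
        return sliced;
    }

    // Set the limit first, then slice: the slice's capacity and limit both equal SEGMENT_SIZE.
    static ByteBuffer limitThenSlice() {
        ByteBuffer raw = ByteBuffer.allocate(ALLOCATED);
        raw.limit(SEGMENT_SIZE);
        return raw.slice();
    }

    // Returns true if moving to newPosition is rejected with IllegalArgumentException.
    // Works on a duplicate so the caller's buffer is left unchanged.
    static boolean positionRejected(ByteBuffer buffer, int newPosition) {
        try {
            buffer.duplicate().position(newPosition);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer mismatched = sliceThenLimit();
        ByteBuffer matched = limitThenSlice();
        System.out.println("slice then limit: capacity=" + mismatched.capacity() + " limit=" + mismatched.limit());
        System.out.println("limit then slice: capacity=" + matched.capacity() + " limit=" + matched.limit());
        // A position that is within the capacity but beyond the limit is rejected.
        System.out.println("position 1100 rejected: " + positionRejected(mismatched, 1100));
    }
}
```

With the second variant, any bound computed from {{capacity()}} is also valid as a position, which is the property the suggested assertions (limit and capacity both equal to the segment size) would check.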
[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828364#comment-17828364 ] Branimir Lambov commented on CASSANDRA-19471: - I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which `endOfBuffer` is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply `slice`.
[jira] [Commented] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824756#comment-17824756 ] Branimir Lambov commented on CASSANDRA-19460: - LGTM > Fix tests to work with ULID SSTable identifiers to enable > uuid_sstable_identifiers_enabled in cassandra-latest.yaml > --- > > Key: CASSANDRA-19460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19460 > Project: Cassandra > Issue Type: Task > Components: CI, Test/dtest/java, Test/unit >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.0-rc, 5.x > > Time Spent: 1h > Remaining Estimate: 0h > > In CASSANDRA-18753 we identified that we also want to set > uuid_sstable_identifiers_enabled to true. When we ran CI with it turned > on, it failed (1). > The errors do not seem to be serious; it is just that the test suite we have is not > prepared for the case when uuid_sstable_identifiers_enabled is set to true by > default. > We need to fix all these tests so we can have cassandra-latest.yaml > containing that property. > (1) https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947
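As a rough illustration of the kind of assumption that can break when {{uuid_sstable_identifiers_enabled}} is on, the sketch below extracts the identifier from an SSTable file name as an opaque string instead of parsing it as an integer. Everything in it is an assumption for the example: the class and method names are hypothetical, the file-name layout {{<version>-<id>-<format>-<component>}} and the sample names are illustrative, and none of it is taken from the actual test fixes in this ticket.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test helper, not Cassandra code. Assumes file names of the form
// <version>-<id>-<format>-<component>, where <id> is either a sequence number
// or a UUID-based identifier made of lower-case letters, digits and underscores.
class SSTableNameSketch {
    private static final Pattern NAME =
        Pattern.compile("([a-z]+)-([0-9a-z_]+)-([a-z]+)-(.+)");

    // Returns the identifier part as a string, whatever kind of identifier it is.
    static String idOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches())
            throw new IllegalArgumentException("Unrecognised SSTable file name: " + fileName);
        return m.group(2);
    }

    // True only for purely numeric (sequence-based) identifiers.
    static boolean isSequenceId(String id) {
        return id.chars().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        // Illustrative names only; real names depend on the version and format in use.
        String[] names = { "nb-1-big-Data.db", "nb-3fw2_0tj4_46w3k2cpidnirvjy7k-big-Data.db" };
        for (String name : names) {
            String id = idOf(name);
            System.out.println(name + " -> id=" + id + " sequence=" + isSequenceId(id));
        }
    }
}
```

A test written this way compares identifiers as strings and only calls something like {{Integer.parseInt}} after checking that the identifier is sequence-based.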
[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19460: Reviewers: Branimir Lambov Status: Review In Progress (was: Needs Committer)
[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19460: Status: Ready to Commit (was: Review In Progress)
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824394#comment-17824394 ] Branimir Lambov commented on CASSANDRA-18753: - Committed to 5.0 as [06ed1afc34128523298020e7601dad148f2b2fb6|https://github.com/apache/cassandra/commit/06ed1afc34128523298020e7601dad148f2b2fb6] and trunk as [28efb63df52bafaf51cd458da021f6050900017a|https://github.com/apache/cassandra/commit/28efb63df52bafaf51cd458da021f6050900017a].
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823998#comment-17823998 ] Branimir Lambov commented on CASSANDRA-18753: - That test is apparently already fixed. [Latest run|https://app.circleci.com/pipelines/github/blambov/cassandra/606/workflows/628459f1-f3fe-449c-a047-a784cc9711f5/jobs/24959/tests] had only a timeout of {{ActiveCompactionsTest}} -- reduced the number of iterations in the test to fix this. Uploaded final version; I'm ready to commit it but I'd like one last review of the wording in {{NEWS.txt}} and {{cassandra(-latest).yaml}}.
[jira] [Updated] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI
[ https://issues.apache.org/jira/browse/CASSANDRA-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19459: Resolution: Fixed Status: Resolved (was: Triage Needed) Fixed by CASSANDRA-19018.
[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823946#comment-17823946 ] Branimir Lambov edited comment on CASSANDRA-18753 at 3/6/24 10:07 AM: -- Well, tests [look much better now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7]. We have only one failure, {{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} with SAI. Opened CASSANDRA-19459 for this, and proceeding to merge this ticket.
[jira] [Created] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI
Branimir Lambov created CASSANDRA-19459: --- Summary: test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI Key: CASSANDRA-19459 URL: https://issues.apache.org/jira/browse/CASSANDRA-19459 Project: Cassandra Issue Type: Bug Components: Feature/SAI Reporter: Branimir Lambov The dtest {{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} fails when the default secondary index is switched to SAI with {code} test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions failed; it passed 0 out of the required 1 times. Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] exited with non-zero status; exit status: 2; stderr: error: null -- StackTrace -- java.lang.NullPointerException at java.base/java.util.Objects.requireNonNull(Objects.java:209) at org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.<init>(SegmentMetadata.java:102) at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166) at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125) at org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092) at org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289) at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90) at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354) at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:840) {code} Discovered while testing CASSANDRA-18753.
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822034#comment-17822034 ] Branimir Lambov commented on CASSANDRA-18753: - I don't mind removing it, especially if we have a plan for adding it back. I'll remove it and re-run CI. > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 11h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799053#comment-17799053 ]

Branimir Lambov edited comment on CASSANDRA-18753 at 1/16/24 8:49 AM:
--
Merged CCM and DTest patches (they do not change anything unless the {{--configuration-yaml}} flag is used).

[The state of failing tests at the moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
- JUnit tests in compatible mode (which changes to use {{heap_buffers}}):
-- {{CQLVectorTest}} (CASSANDRA-19167)
-- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
- JUnit tests in latest mode:
-- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, {{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} (CASSANDRA-19042)
-- {{RepairJobTest}} (CASSANDRA-19043)
-- {{ClientRequestMetricsTest}} (CASSANDRA-19046)
- JVM dtests in latest mode:
-- {{RepairTest}} (CASSANDRA-19085)
-- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
-- {{QueriesTableTest}} (CASSANDRA-19046)
- Python dtests in latest mode:
-- {{TestWriteFailures.test_paxos}} (CASSANDRA-19145)
-- {{TestReplaceAddress}} (CASSANDRA-19144)
-- {{TestSnapshot}} (CASSANDRA-19126)
-- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some already marked as flaky; this is likely not caused by this patch. There are also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run repeatedly due to the longer {{testActiveCompactionTrackingRaceWithIndexBuilder}}).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].

was (Author: blambov):
Merged CCM and DTest patches (they do not change anything unless the {{--configuration-yaml}} flag is used).

[The state of failing tests at the moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
- JUnit tests in compatible mode (which changes to use {{heap_buffers}}):
-- {{CQLVectorTest}} (CASSANDRA-19167)
-- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
- JUnit tests in latest mode:
-- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, {{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} (CASSANDRA-19042)
-- {{RepairJobTest}} (CASSANDRA-19043)
- JVM dtests in latest mode:
-- {{RepairTest}} (CASSANDRA-19085)
-- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
-- {{QueriesTableTest}} (CASSANDRA-19046)
- Python dtests in latest mode:
-- {{TestWriteFailures.test_paxos}} (CASSANDRA-19145)
-- {{TestReplaceAddress}} (CASSANDRA-19144)
-- {{TestSnapshot}} (CASSANDRA-19126)
-- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some already marked as flaky; this is likely not caused by this patch. There are also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run repeatedly due to the longer {{testActiveCompactionTrackingRaceWithIndexBuilder}}).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].
> Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 9h 50m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804661#comment-17804661 ] Branimir Lambov commented on CASSANDRA-19126: - It is to me. > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
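As a rough illustration of the mismatch described in CASSANDRA-19126, the sketch below models each compatibility mode as the set of streaming versions it accepts, using only the two values quoted in the issue (12 in legacy mode, 13 in NONE). The names ACCEPTED_STREAMING_VERSIONS, common_streaming_versions and can_stream are hypothetical and are not Cassandra APIs; the real negotiation happens inside Cassandra's internode messaging code and may differ in detail.

```python
# Hypothetical model of the problem, not Cassandra code.
# Values taken from the issue description: a C* 5 node accepts streaming
# version 12 only in legacy mode and version 13 only in mode NONE.
ACCEPTED_STREAMING_VERSIONS = {
    "legacy": frozenset({12}),
    "none": frozenset({13}),
}


def common_streaming_versions(mode_a, mode_b):
    """Return the streaming versions that both modes accept."""
    return ACCEPTED_STREAMING_VERSIONS[mode_a] & ACCEPTED_STREAMING_VERSIONS[mode_b]


def can_stream(mode_a, mode_b):
    """Two peers can stream only if they share at least one version."""
    return bool(common_streaming_versions(mode_a, mode_b))


if __name__ == "__main__":
    for pair in [("legacy", "legacy"), ("none", "none"), ("legacy", "none")]:
        print(pair, "can stream:", can_stream(*pair))
```

Under these assumptions the accepted sets are disjoint, so two C* 5 nodes (or a node and the bulk loader) with different settings have no version in common, which matches the behaviour reported in the issue.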
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798565#comment-17798565 ] Branimir Lambov commented on CASSANDRA-19126: - {quote} {code} private static final String MIXED_MODE_ERROR = "Some nodes involved in repair are on an incompatible major version. " + "Repair is not supported in mixed major version clusters."; {code} {quote} _To me_ this message in the context of a 5.0 cluster where something is in the wrong compatibility mode would be quite confusing. At the very least we need to state very clearly that a 5.x node in compatibility mode is considered a 4.x node for all intents and purposes, including being a "same major version" for the message above. Also, does this not mean we can't ever drop 4.0 support because e.g. 6.0 must be compatible with 5.0, including in its compatibility mode? > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. 
two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
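The point made in the comment above, that a 5.x node in compatibility mode behaves as a 4.x node for the purposes of the "same major version" check, can be sketched as follows. This is only an illustration under stated assumptions: the Node class, its legacy_compatibility flag, and the helpers repair_blockers and describe are hypothetical and do not exist in Cassandra, and treating a compatibility-mode node as "previous major" is a generalisation of the single 5.x/4.x case mentioned in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a cluster member; not a Cassandra class."""
    name: str
    release_major: int          # major version of the installed binaries
    legacy_compatibility: bool  # True when running in the legacy compatibility mode

    @property
    def effective_major(self):
        # Assumption from the comment: a 5.x node in compatibility mode
        # is treated as a 4.x node, i.e. as the previous major version.
        if self.legacy_compatibility:
            return self.release_major - 1
        return self.release_major


def repair_blockers(nodes):
    """Return the nodes whose effective major differs from the first node's."""
    if not nodes:
        return []
    expected = nodes[0].effective_major
    return [n for n in nodes if n.effective_major != expected]


def describe(nodes):
    """Build a message that names both the release and the effective version."""
    blockers = repair_blockers(nodes)
    if not blockers:
        return "repair allowed"
    details = ", ".join(
        f"{n.name} (release {n.release_major}.x, behaves as {n.effective_major}.x)"
        for n in blockers
    )
    return "repair refused: " + details


if __name__ == "__main__":
    cluster = [Node("n1", 5, False), Node("n2", 5, True)]
    print(describe(cluster))
```

A message built this way names the compatibility mode as the cause, which is the kind of clarification the comment asks for when every node in the cluster reports the same release.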
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795406#comment-17795406 ] Branimir Lambov commented on CASSANDRA-19126: - In other words, you both feel that it is okay for {{BulkLoader}} to not work if it is not the corresponding version or is not configured exactly like the database is? Separately, that a node in e.g. {{UPGRADING}} mode should not be able to stream sstables to one in {{NONE}}? > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795089#comment-17795089 ] Branimir Lambov commented on CASSANDRA-19126: - > Precise fix for this would be to use the same compatibility mode for bulk > loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. 
[jira] [Comment Edited] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795089#comment-17795089 ] Branimir Lambov edited comment on CASSANDRA-19126 at 12/10/23 4:57 PM: --- bq. Precise fix for this would be to use the same compatibility mode for bulk loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). was (Author: blambov): > Precise fix for this would be to use the same compatibility mode for bulk > loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). 
> Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers
[ https://issues.apache.org/jira/browse/CASSANDRA-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19168: Fix Version/s: 5.0-rc > VectorUpdateDeleteTest fails with heap_buffers > -- > > Key: CASSANDRA-19168 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19168 > Project: Cassandra > Issue Type: Bug > Components: Feature/Vector Search >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} > fails with > {code} > junit.framework.AssertionFailedError: Result set does not contain a row with > pk = 0 > at > org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133) > at > org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers
Branimir Lambov created CASSANDRA-19168: --- Summary: VectorUpdateDeleteTest fails with heap_buffers Key: CASSANDRA-19168 URL: https://issues.apache.org/jira/browse/CASSANDRA-19168 Project: Cassandra Issue Type: Bug Components: Feature/Vector Search Reporter: Branimir Lambov When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} fails with {code} junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0 at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133) at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers
[ https://issues.apache.org/jira/browse/CASSANDRA-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19167: Fix Version/s: 5.0-rc > CQLVectorTest fails with heap_buffers > - > > Key: CASSANDRA-19167 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19167 > Project: Cassandra > Issue Type: Bug > Components: Feature/Vector Search >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} > test fails with > {code} > org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: > Invalid 32-bits integer value, expecting 4 bytes but got 6 > at > org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819) > at > org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135) > at > org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082) > at > org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180) > at > org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142) > at > org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277) > at > org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58) > at > org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605) > at > 
org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175) > at > org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162) > at > org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999) > at > org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108) > at > org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445) > at > org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597) > at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576) > at > org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers
Branimir Lambov created CASSANDRA-19167: --- Summary: CQLVectorTest fails with heap_buffers Key: CASSANDRA-19167 URL: https://issues.apache.org/jira/browse/CASSANDRA-19167 Project: Cassandra Issue Type: Bug Components: Feature/Vector Search Reporter: Branimir Lambov When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} test fails with {code} org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: Invalid 32-bits integer value, expecting 4 bytes but got 6 at org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695) at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842) at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819) at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135) at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83) at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141) at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082) at org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180) at org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142) at org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277) at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58) at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605) at org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175) at org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162) at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999) at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564) at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600) at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570) at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108) at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445) at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597) at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576) at org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19145) Python dtest TestWriteFailures.test_paxos is failing with Paxos V2
Branimir Lambov created CASSANDRA-19145: --- Summary: Python dtest TestWriteFailures.test_paxos is failing with Paxos V2 Key: CASSANDRA-19145 URL: https://issues.apache.org/jira/browse/CASSANDRA-19145 Project: Cassandra Issue Type: Bug Components: Feature/Lightweight Transactions Reporter: Branimir Lambov With configuration changed to engage Paxos V2 with repaired state purging, the dtest fails with: {code} test_paxos write_failures_test.TestWriteFailures self = def test_paxos(self): """ A light transaction receives a WriteFailure """ > exc = self._perform_cql_statement("INSERT INTO mytable (key, value) > VALUES ('key1', 'Value 1') IF NOT EXISTS") write_failures_test.py:202: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ write_failures_test.py:88: in _perform_cql_statement session.execute(statement) ../env3.7/src/cassandra-driver/cassandra/cluster.py:2618: in execute return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result() _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = def result(self): """ Return the final result or raise an Exception if errors were encountered. If the final result or error has not been set yet, this method will block until it is set, or the timeout set for the request expires. Timeout is specified in the Session request execution functions. If the timeout is exceeded, an :exc:`cassandra.OperationTimedOut` will be raised. This is a client-side timeout. For more information about server-side coordinator timeouts, see :class:`.policies.RetryPolicy`. Example usage:: >>> future = session.execute_async("SELECT * FROM mycf") >>> # do other stuff... >>> try: ... rows = future.result() ... for row in rows: ... ... # process results ... except Exception: ... 
log.exception("Operation failed:") """ self._event.wait() if self._final_result is not _NOT_SET: return ResultSet(self, self._final_result) else: > raise self._final_exception E cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="CAS operation timed out: received 1 of 2 required responses after 0 contention retries" info={'consistency': 'SERIAL', 'required_responses': 2, 'received_responses': 1, 'write_type': 'CAS'} ../env3.7/src/cassandra-driver/cassandra/cluster.py:4894: WriteTimeout {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19144) Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2
Branimir Lambov created CASSANDRA-19144: --- Summary: Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2 Key: CASSANDRA-19144 URL: https://issues.apache.org/jira/browse/CASSANDRA-19144 Project: Cassandra Issue Type: Bug Components: Consistency/Bootstrap and Decommission, Feature/Lightweight Transactions Reporter: Branimir Lambov Paxos repair is causing an unexpected failure: {code} test_replace_with_insufficient_replicas replace_address_test.TestReplaceAddress failed on teardown with "Failed: Unexpected error found in node logs (see stdout for full details). Errors: [[replacement] 'ERROR [main] 2023-11-29 10:23:08,752 CassandraDaemon.java:878 - Exception encountered during startup\njava.lang.UnsupportedOperationException: null\n\tat org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']" {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
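The trace above ends in {{removeAll}} being called on an immutable set, which throws {{UnsupportedOperationException}} with a null message. A minimal sketch of that failure mode and the usual remedy (copy into a mutable set first) follows. It assumes the JDK's {{Set.of}} as a stand-in for Cassandra's {{AbstractImmutableSet}}; the class and method names are hypothetical and are not taken from the Cassandra code base, and this is not necessarily how the ticket was fixed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: JDK immutable sets stand in for Cassandra's replica collections.
class ImmutableRemoveAllSketch {
    // Returns true if removeAll on an immutable set throws UnsupportedOperationException.
    static boolean removeAllThrows() {
        Set<String> replicas = Set.of("127.0.0.1", "127.0.0.2", "127.0.0.3");
        try {
            replicas.removeAll(List.of("127.0.0.2"));
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies the input into a mutable HashSet before removing, leaving the input untouched.
    static Set<String> without(Set<String> replicas, Set<String> excluded) {
        Set<String> copy = new HashSet<>(replicas);
        copy.removeAll(excluded);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("removeAll on immutable set throws: " + removeAllThrows());
        System.out.println("after copy: " + without(Set.of("a", "b", "c"), Set.of("b")));
    }
}
```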
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792095#comment-17792095 ] Branimir Lambov commented on CASSANDRA-19126: - Python dtest {{snapshot_test}} is also failing because of this sstableloader problem: {code:java} Exception: sstableloader command '/home/cassandra/cassandra/bin/sstableloader -d 127.0.0.1 /tmp/tmpidg_8u3c/0/ks/cf' failed; exit status: 1'; stdout: Established connection to initial hosts Opening sstables and calculating sections to stream Streaming relevant part of /tmp/tmpidg_8u3c/0/ks/cf/da-1-bti-Data.db to [/127.0.0.1:7000] progress: total: 100% 0.000B/s (avg: 0.000B/s) ; stderr: ERROR 10:16:01,391 [Stream #4bb85ff0-8ea0-11ee-94d3-3de6344de31d] Streaming error occurred on session with peer 127.0.0.1:7000 java.lang.ClassCastException: class org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible cannot be cast to class org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success (org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible and org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success are in unnamed module of loader 'app') {code} > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing).
Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different.
[jira] [Commented] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792090#comment-17792090 ] Branimir Lambov commented on CASSANDRA-19046: - Python dtest failure related to this: {{client_request_metrics_test.TestClientRequestMetrics}} {code:java} > self.cas_read_contention() client_request_metrics_test.py:103: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ client_request_metrics_test.py:355: in cas_read_contention consistency_level=CL.SERIAL)) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = metric_factory = functools.partial(, 'CASRead') statement = def cas_contention(self, metric_factory, statement): query_count = 20 cassandra_version = self.dtest_config.cassandra_version_from_build def sample(): baseline = metric_factory() baseline.validate(cassandra_version) execute_concurrent_with_args(self.session, statement, repeat([], query_count), raise_on_first_error=False) updated = metric_factory() updated.validate(cassandra_version) return updated.diff(baseline) for _ in range(10): diff = sample() if 'ContentionHistogram.Count' in diff: break assert diff['Latency.Count'] == query_count assert diff['TotalLatency.Count'] > 0 > assert 0 < diff['ContentionHistogram.Count'] <= query_count E KeyError: 'ContentionHistogram.Count' client_request_metrics_test.py:382: KeyError{code} > Paxos V2 does not update individual fields of readMetrics > - > > Key: CASSANDRA-19046 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19046 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Observability/Metrics >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with > {{paxos_variant: v2}}. 
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791958#comment-17791958 ] Branimir Lambov commented on CASSANDRA-19126: - I believe what Brandon means is that we also need upgrade tests where only some nodes have changed {{storage_compatibility_mode}}. [This line|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L259] is what appears to be preventing {{BulkLoader}} from working. I don't have enough knowledge in the area and have not dug deep enough to understand all implications. > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different.
[jira] [Updated] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19126: Description: In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} when the flag is turned on (found during CASSANDRA-18753 testing). Setting {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different. was: In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of `org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when the flag is turned on (found during CASSANDRA-18753 testing). Setting `storage_compatibility_mode: NONE` in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different. 
> Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different.
[jira] [Created] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
Branimir Lambov created CASSANDRA-19126: --- Summary: Streaming appears to be incompatible with different storage_compatibility_mode settings Key: CASSANDRA-19126 URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 Project: Cassandra Issue Type: Bug Components: Consistency/Streaming, Legacy/Streaming and Messaging, Messaging/Internode, Tool/bulk load Reporter: Branimir Lambov In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of `org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when the flag is turned on (found during CASSANDRA-18753 testing). Setting `storage_compatibility_mode: NONE` in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different.
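To make the concern in the description concrete, here is a toy model of the version check as the ticket describes it: one accepted streaming version per mode (12 in legacy, 13 in none), so two nodes with different settings never agree. The enum and method names are hypothetical, the two-value enum is a simplification of the real setting, and this is not Cassandra's actual negotiation code.

```java
// Illustrative only: models the behaviour described in the ticket, not the real implementation.
class StreamingVersionSketch {
    // Hypothetical stand-in for the two compatibility settings the description contrasts.
    enum CompatibilityMode { LEGACY, NONE }

    // Per the description: version 12 is accepted only in legacy mode, 13 only in none.
    static int acceptedStreamingVersion(CompatibilityMode mode) {
        return mode == CompatibilityMode.LEGACY ? 12 : 13;
    }

    // Two endpoints can stream only if they accept the same single version.
    static boolean canStream(CompatibilityMode a, CompatibilityMode b) {
        return acceptedStreamingVersion(a) == acceptedStreamingVersion(b);
    }

    public static void main(String[] args) {
        for (CompatibilityMode a : CompatibilityMode.values())
            for (CompatibilityMode b : CompatibilityMode.values())
                System.out.println(a + " <-> " + b + ": " + canStream(a, b));
    }
}
```

Under this model a tool such as sstableloader that always runs with one setting cannot stream to a node using the other, which matches the symptom reported above.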
[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19085: Fix Version/s: 5.0-rc > In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE > --- > > Key: CASSANDRA-19085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19085 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, > the test fails with an exception that appears to be a genuine problem: > {code:java} > junit.framework.AssertionFailedError: Exception found expected null, but > was: at > org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > > > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) > at >
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > org.apache.cassandra.distributed.shared.ShutdownException: Uncaught > exceptions were thrown during test > at > org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) > at > org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) > at > org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Suppressed: java.lang.IllegalStateException: complete already: > (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) 
> at > org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) > at > org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) > at > org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) > at > org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) > at > org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) > at > org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) > at > org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at >
[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18753: Fix Version/s: 5.0-rc (was: 5.0.x) > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 3h 20m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19085: Description: More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, the test fails with an exception that appears to be a genuine problem: {code:java} junit.framework.AssertionFailedError: Exception found expected null, but was: at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get
replies from all endpoints.) at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833){code} The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but fixing them by doing e.g. 
{code:java} Index: src/java/org/apache/cassandra/service/ActiveRepairService.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 === diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java --- a/src/java/org/apache/cassandra/service/ActiveRepairService.java (revision 04552046f74f596e69e2d98c3f3e522fb5888c99) +++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java (date 1700839874092) @@ -675,7 +675,7 @@ if (promise.isDone()) return; String errorMsg = "Did not get replies from all endpoints."; - if (promise.tryFailure(new RuntimeException(errorMsg))) + if (pending.getAndSet(-1) > 0 && promise.tryFailure(new RuntimeException(errorMsg))) participateFailed(parentRepairSession, errorMsg); }, timeoutMillis, MILLISECONDS); @@ -703,8 +703,8 @@ failedNodes.add(from.toString()); if (failureReason == RequestFailureReason.TIMEOUT) { -
[jira] [Created] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
Branimir Lambov created CASSANDRA-19085: --- Summary: In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE Key: CASSANDRA-19085 URL: https://issues.apache.org/jira/browse/CASSANDRA-19085 Project: Cassandra Issue Type: Bug Components: Consistency/Repair Reporter: Branimir Lambov More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, the test fails with an exception that appears to be a genuine problem: {code:java} junit.framework.AssertionFailedError: Exception found expected null, but was: at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833){code} The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but fixing them by doing e.g.
{code:java} Index: src/java/org/apache/cassandra/service/ActiveRepairService.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 === diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java --- a/src/java/org/apache/cassandra/service/ActiveRepairService.java (revision 04552046f74f596e69e2d98c3f3e522fb5888c99) +++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java (date 1700839874092) @@ -675,7 +675,7 @@ if (promise.isDone()) return; String errorMsg = "Did not get replies from all endpoints."; - if (promise.tryFailure(new RuntimeException(errorMsg))) + if (pending.getAndSet(-1) > 0 && promise.tryFailure(new RuntimeException(errorMsg))) participateFailed(parentRepairSession, errorMsg); }, timeoutMillis,
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789130#comment-17789130 ] Branimir Lambov commented on CASSANDRA-18757: - Tests look good, repeated test completed with no failures: [https://app.circleci.com/pipelines/github/blambov/cassandra?branch=CASSANDRA-18757] [~smiklosovic], do you give a second approval so that I can commit this? > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be.
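The bug class described in this ticket, a boolean meant for one option landing in a constructor slot that means something else, can be reproduced with a few toy classes. The sketch below is only an illustration: {{BaseTask}}, {{BuggyTask}} and {{FixedTask}} are hypothetical names, the three-argument constructor is an assumption made for the example, and the "fixed" variant is one plausible shape rather than the change that was actually committed.

```java
// Illustrative only: toy classes, not Cassandra's CompactionTask hierarchy.
class KeepOriginalsSketch {
    static class BaseTask {
        final boolean keepOriginals;
        final boolean ignoreOverlaps;

        // Two-argument form: the boolean here means keepOriginals.
        BaseTask(long gcBefore, boolean keepOriginals) {
            this(gcBefore, keepOriginals, false);
        }

        // Assumed three-argument form that carries both flags explicitly.
        BaseTask(long gcBefore, boolean keepOriginals, boolean ignoreOverlaps) {
            this.keepOriginals = keepOriginals;
            this.ignoreOverlaps = ignoreOverlaps;
        }
    }

    // Compiles cleanly, but the overlap flag silently becomes keepOriginals.
    static class BuggyTask extends BaseTask {
        BuggyTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, ignoreOverlaps);
        }
    }

    // Passes each flag to the parameter it was meant for.
    static class FixedTask extends BaseTask {
        FixedTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, false, ignoreOverlaps);
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy keepOriginals: " + new BuggyTask(0L, true).keepOriginals);
        System.out.println("fixed keepOriginals: " + new FixedTask(0L, true).keepOriginals);
    }
}
```

Because both parameters are plain booleans, the compiler cannot flag the mix-up, which is why a dedicated test for each flag combination is useful here.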
[jira] [Comment Edited] (CASSANDRA-18753) We should offer an option for optimized default configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789052#comment-17789052 ] Branimir Lambov edited comment on CASSANDRA-18753 at 11/23/23 10:20 AM: DTest support has been added. The python dtests require pull requests for [CCM|https://github.com/riptano/ccm/pull/760] and [cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be merged. It works by passing an argument to ccm to make it read the configuration from "cassandra_latest.yaml". The new configuration replaces {{dtest_offheap}}, as the offheap setting for memtables is also turned on in the latest configuration. I'm not happy at all with how the in-jvm dtests are configured at this point (directly including the settings in code), but I could not think of a quick way to get them to load a configuration file. The latest config is combined with vnodes to lighten the testing load. Test results to appear [here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e]. was (Author: blambov): DTest support has been added. The python dtests require pull requests for [CCM|https://github.com/riptano/ccm/pull/760] and [cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be merged. It works by passing an argument to ccm to make it read the configuration from "cassandra_latest.yaml". The new configuration replaces {{dtest_offheap}}, as the offheap setting for memtables is also turned on in the latest configuration. I'm not happy at all with how the in-jvm dtests are configured at this point (directly including the settings in code), but I could not think of a quick way to get them to load a configuration file. Test results to appear [here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e].
> We should offer an option for optimized default configuration > - > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0.x, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values.
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788730#comment-17788730 ] Branimir Lambov commented on CASSANDRA-18757: - How about splitting this into separate tests for the 4 cases? I.e. have the four calls in {{testIgnoreOverlaps}} run in separate {{@Test}}-annotated methods? > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19046: Summary: Paxos V2 does not update individual fields of readMetrics (was: Paxos V2 does not individual fields of readMetrics) > Paxos V2 does not update individual fields of readMetrics > - > > Key: CASSANDRA-19046 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19046 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Observability/Metrics >Reporter: Branimir Lambov >Priority: Normal > > As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with > {{paxos_variant: v2}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index
[ https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787323#comment-17787323 ]

Branimir Lambov commented on CASSANDRA-19034:

Yes, we have run the entire unit test suite (no dtests yet) with SAI as default, and these three are the only failures other than use cases that SAI cannot support (ByteOrderedPartitioner and blobs). With CASSANDRA-18753, we will have a test configuration run as part of the precommit tests that runs with SAI (plus tries, UCS, paxos v2...).

> SelectTest fails when run with SAI index > > > Key: CASSANDRA-19034 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 > Project: Cassandra > Issue Type: Bug > Components: Feature/SAI >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-beta > > > When run with SAI index, the following two tests error out: > {code} > [junit-timeout] Testcase: > testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: >FAILED > [junit-timeout] Got less rows than expected. Expected 1 but got 0 > [junit-timeout] junit.framework.AssertionFailedError: Got less rows than > expected. 
Expected 1 but got 0 > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > [junit-timeout] > [junit-timeout] > [junit-timeout] Testcase: > testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: > FAILED > [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected > <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 > column 0 (k1 of type int), expected <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) 
> [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > The latter seems to be giving the results in the wrong order, and the order > flips when the data is flushed. > Caught during preparation of _latest config that would switch default to SAI > (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index
[ https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787279#comment-17787279 ] Branimir Lambov commented on CASSANDRA-19034: - A further failure of this kind: {code} [junit-timeout] Testcase: testStaticIndexAndNonStaticIndex(org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest)-_jdk11: FAILED [junit-timeout] Got less rows than expected. Expected 1 but got 0 [junit-timeout] junit.framework.AssertionFailedError: Got less rows than expected. Expected 1 but got 0 [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) [junit-timeout] at org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest.testStaticIndexAndNonStaticIndex(SecondaryIndexOnStaticColumnTest.java:191) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit-timeout] [junit-timeout] [junit-timeout] Test org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest FAILED {code} > SelectTest fails when run with SAI index > > > Key: CASSANDRA-19034 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 > Project: Cassandra > Issue Type: Bug > Components: Feature/SAI >Reporter: Branimir Lambov >Priority: Normal > > When run with SAI index, the following two tests error out: > {code} > [junit-timeout] Testcase: > testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: >FAILED > [junit-timeout] Got less rows than expected. Expected 1 but got 0 > [junit-timeout] junit.framework.AssertionFailedError: Got less rows than > expected. 
Expected 1 but got 0 > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > [junit-timeout] > [junit-timeout] > [junit-timeout] Testcase: > testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: > FAILED > [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected > <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 > column 0 (k1 of type int), expected <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) 
> [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > The latter seems to be giving the results in the wrong order, and the order > flips when the data is flushed. > Caught during preparation of _latest config that would switch default to SAI > (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To
[jira] [Created] (CASSANDRA-19034) SelectTest fails when run with SAI index
Branimir Lambov created CASSANDRA-19034: --- Summary: SelectTest fails when run with SAI index Key: CASSANDRA-19034 URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 Project: Cassandra Issue Type: Bug Components: Feature/SAI Reporter: Branimir Lambov When run with SAI index, the following two tests error out: {code} [junit-timeout] Testcase: testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: FAILED [junit-timeout] Got less rows than expected. Expected 1 but got 0 [junit-timeout] junit.framework.AssertionFailedError: Got less rows than expected. Expected 1 but got 0 [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) [junit-timeout] at org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit-timeout] [junit-timeout] [junit-timeout] Testcase: testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: FAILED [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected <1> but got <0> [junit-timeout] Invalid value for row 1 column 2 (v of type set), expected <{4, 5, 6}> but got <{2, 3, 4}> [junit-timeout] [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 column 0 (k1 of type int), expected <1> but got <0> [junit-timeout] Invalid 
value for row 1 column 2 (v of type set), expected <{4, 5, 6}> but got <{2, 3, 4}> [junit-timeout] [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) [junit-timeout] at org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) {code} The latter seems to be giving the results in the wrong order, and the order flips when the data is flushed. Caught during preparation of _latest config that would switch default to SAI (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786290#comment-17786290 ] Branimir Lambov edited comment on CASSANDRA-18710 at 11/15/23 10:15 AM: {quote}So perhaps the expected value should be calculated as a moving average by updating it with subsequent table sizes. {quote} This makes sense. Sorting the sstable files by name should give them in the correct order, so we can easily calculate the moving average from them. Actually, that would solve the extra flush problem as well, wouldn't it? was (Author: blambov): {quote}So perhaps the expected value should be calculated as a moving average by updating it with subsequent table sizes. {quote} This makes sense. Sorting the sstable files by name should give them in the correct order, so we can easily calculate the moving average from them. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. 
> {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
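The moving-average suggestion in the comment above can be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical, the smoothing factor `alpha` is an assumed parameter, and the averaging that the real flush size metric applies may well differ. The sketch takes sstable sizes keyed by file name, visits them in name order, and folds each size into an exponential moving average.

```java
import java.util.Map;
import java.util.TreeMap;

public class FlushSizeAverageSketch {
    // Hypothetical helper: folds sstable sizes into an exponential moving
    // average, visiting the files in name order. "alpha" is an assumed
    // smoothing factor, not a value taken from Cassandra.
    static double expectedFlushSize(Map<String, Long> sizeByFileName, double alpha) {
        // A TreeMap iterates its keys in sorted (plain string) order.
        TreeMap<String, Long> sorted = new TreeMap<>(sizeByFileName);
        double average = Double.NaN;
        for (long size : sorted.values()) {
            // The first size seeds the average; later sizes update it.
            average = Double.isNaN(average) ? size : average + alpha * (size - average);
        }
        return average;
    }

    public static void main(String[] args) {
        // Illustrative file names and sizes, not taken from a real run.
        Map<String, Long> sizes = Map.of("nb-2-big-Data.db", 200L, "nb-1-big-Data.db", 100L);
        System.out.println(expectedFlushSize(sizes, 0.5)); // 150.0
    }
}
```

Plain string ordering is itself an assumption here: names whose numeric ids differ in length (for example `nb-10` and `nb-2`) would need a numeric comparison to come out in creation order.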
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786290#comment-17786290 ] Branimir Lambov commented on CASSANDRA-18710: - {quote}So perhaps the expected value should be calculated as a moving average by updating it with subsequent table sizes. {quote} This makes sense. Sorting the sstable files by name should give them in the correct order, so we can easily calculate the moving average from them. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. > {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786282#comment-17786282 ] Branimir Lambov commented on CASSANDRA-18757: - I think it is a leftover from a refactoring that (among other things) fixed CASSANDRA-18756 in DSE. Fix LGTM, but it's a shame that no test caught it. > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
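The kind of mistake described in this ticket can be illustrated with a small stand-alone sketch. None of the classes below are the Cassandra ones: `Task`, `MisroutedTask` and `SeparatedTask` are hypothetical stand-ins that model only the boolean in question, and the "separated" variant is merely one possible shape of a correction, not the committed fix.

```java
public class KeepOriginalsSketch {
    // Simplified stand-in for the base task; only the flag in question is modelled.
    static class Task {
        final boolean keepOriginals;

        Task(long gcBefore, boolean keepOriginals) {
            this.keepOriginals = keepOriginals;
        }
    }

    // Mirrors the reported call: an unrelated setting lands in the
    // keepOriginals slot of the base constructor.
    static class MisroutedTask extends Task {
        MisroutedTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, ignoreOverlaps);
        }
    }

    // One possible correction (an assumption): keep the two flags apart.
    static class SeparatedTask extends Task {
        final boolean ignoreOverlaps;

        SeparatedTask(long gcBefore, boolean ignoreOverlaps) {
            super(gcBefore, false);
            this.ignoreOverlaps = ignoreOverlaps;
        }
    }

    public static void main(String[] args) {
        System.out.println(new MisroutedTask(0L, true).keepOriginals); // true
        System.out.println(new SeparatedTask(0L, true).keepOriginals); // false
    }
}
```

Because both parameters are plain booleans, the compiler cannot flag the mix-up, which is consistent with the remark that no test caught it.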
[jira] [Comment Edited] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782692#comment-17782692 ]

Branimir Lambov edited comment on CASSANDRA-18945 at 11/3/23 6:15 PM:

{quote}
+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount

Review Comment: @ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass {{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which would fail count >= 0, but is acceptable and should translate to baseShardCount)" or something similar?

was (Author: blambov):

{quote}
+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount

Review Comment: @ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass {{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which would fail {{count >= 0,}}", but is acceptable and should translate to baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> ---
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Assignee: Ethan Brown
> Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, file_ucs_shenandoah_off_heap_memtable.html, file_ucs_shenandoah_on_heap_memtable_2.html, file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close to the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra starts to have performance problems when the number of sstables grows to the order of a thousand, and in particular that even 1 TiB of data with the default configuration is creating too many sstables for efficient processing. This matters even more for SAI, where the number of sstables in the system can have a proportional effect on the complexity of operations.
>
> It is quite easy to create a configuration option that allows sstables to take some part of the data growth by adding a multiplier to [the shard count calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] formula, replacing
> {{2 ^ round(log2(d / (t * b))) * b}}
> with
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}},
> where λ is a parameter whose value is between 0 and 1.
>
> With this, a λ of 0.5 would mean that shard count and sstable size grow in parallel at the square root of the data size growth. A λ of 0 would result in no sstable size growth, and a λ of 1 in always using the same number of shards.
>
> It may also be valuable to introduce a threshold for engaging the base shard count to avoid splitting lowest-level sstables into fragments that are too small.
>
> Once both of these are in place, we can set defaults that better suit all node densities, including 10 TiB and beyond, for example:
> - target size of 1 GiB
> - λ of 1/3
> - base shard count of 4
> - minimum size 100 MiB

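The point about NaN in the review discussion above can be checked with a few lines of plain Java. This is a minimal sketch with hypothetical names (`NaNAssertDemo`, `notNegative`, `atLeastZero`); it relies only on the language rule that every ordered comparison involving NaN evaluates to false.

```java
public class NaNAssertDemo {
    // True for positive values, zero and NaN; false only for negative values.
    static boolean notNegative(double count) {
        return !(count < 0);
    }

    // True for positive values and zero; false for negative values and NaN.
    static boolean atLeastZero(double count) {
        return count >= 0;
    }

    public static void main(String[] args) {
        System.out.println(notNegative(Double.NaN)); // true
        System.out.println(atLeastZero(Double.NaN)); // false
        System.out.println(notNegative(-1.0));       // false
        System.out.println(atLeastZero(0.0));        // true
    }
}
```

The two forms differ only for NaN, which is why swapping one for the other in an assertion would change behaviour for exactly the case the code comment calls out.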
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782692#comment-17782692 ]

Branimir Lambov commented on CASSANDRA-18945:

{quote}
+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount

Review Comment: @ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass {{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which would fail {{count >= 0}}, but is acceptable and should translate to baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> ---
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Assignee: Ethan Brown
> Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, file_ucs_shenandoah_off_heap_memtable.html, file_ucs_shenandoah_on_heap_memtable_2.html, file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close to the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra starts to have performance problems when the number of sstables grows to the order of a thousand, and in particular that even 1 TiB of data with the default configuration is creating too many sstables for efficient processing. This matters even more for SAI, where the number of sstables in the system can have a proportional effect on the complexity of operations.
>
> It is quite easy to create a configuration option that allows sstables to take some part of the data growth by adding a multiplier to [the shard count calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] formula, replacing
> {{2 ^ round(log2(d / (t * b))) * b}}
> with
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}},
> where λ is a parameter whose value is between 0 and 1.
>
> With this, a λ of 0.5 would mean that shard count and sstable size grow in parallel at the square root of the data size growth. A λ of 0 would result in no sstable size growth, and a λ of 1 in always using the same number of shards.
>
> It may also be valuable to introduce a threshold for engaging the base shard count to avoid splitting lowest-level sstables into fragments that are too small.
>
> Once both of these are in place, we can set defaults that better suit all node densities, including 10 TiB and beyond, for example:
> - target size of 1 GiB
> - λ of 1/3
> - base shard count of 4
> - minimum size 100 MiB

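To make the proposed shard count formula in the description above concrete, here is a small worked sketch. It assumes that `d` is the data size, `t` the target sstable size and `b` the base shard count (the description does not spell these out, so this reading is inferred from the list of defaults), and it names the growth parameter `lambda`. The class and method names are hypothetical, and clamping the exponent at zero is a simplification of this sketch rather than part of the quoted formula.

```java
public class ShardCountSketch {
    // Evaluates 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
    // d and t must use the same unit (GiB in the examples below).
    static long shardCount(double d, double t, int b, double lambda) {
        double log2 = Math.log(d / (t * b)) / Math.log(2);
        // Assumption of this sketch: never go below the base shard count.
        long exponent = Math.max(0, Math.round((1 - lambda) * log2));
        return (long) Math.pow(2, exponent) * b;
    }

    public static void main(String[] args) {
        // 1 TiB of data, 1 GiB target size, base shard count 4.
        System.out.println(shardCount(1024.0, 1.0, 4, 0.0)); // 1024
        System.out.println(shardCount(1024.0, 1.0, 4, 0.5)); // 64
        System.out.println(shardCount(1024.0, 1.0, 4, 1.0)); // 4
        // Four times the data with lambda = 0.5 doubles the shard count.
        System.out.println(shardCount(4096.0, 1.0, 4, 0.5)); // 128
    }
}
```

With a parameter of 0.5, going from 1 TiB to 4 TiB doubles the shard count (64 to 128) and doubles the per-shard size (16 GiB to 32 GiB), which matches the square-root behaviour the description mentions.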
[jira] [Commented] (CASSANDRA-18232) Write docs for CEP-26 Unified Compaction Strategy (UCS)
[ https://issues.apache.org/jira/browse/CASSANDRA-18232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782640#comment-17782640 ] Branimir Lambov commented on CASSANDRA-18232: - There are some additional options coming with CASSANDRA-18945. The details can be found in [the developer-side markdown doc|https://github.com/datastax/cassandra/blob/CASSANDRA-18945/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#full-sharding-scheme]. > Write docs for CEP-26 Unified Compaction Strategy (UCS) > --- > > Key: CASSANDRA-18232 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18232 > Project: Cassandra > Issue Type: New Feature > Components: Documentation >Reporter: Lorina Poland >Assignee: Lorina Poland >Priority: High > Fix For: 5.x > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782638#comment-17782638 ]

Branimir Lambov commented on CASSANDRA-18945:

We will handle the docs in the documentation ticket, CASSANDRA-18232. I will reach out to Lorina to make her aware of the changes.

> Unified Compaction Strategy is creating too many sstables
> ---
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Assignee: Ethan Brown
> Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, file_ucs_shenandoah_off_heap_memtable.html, file_ucs_shenandoah_on_heap_memtable_2.html, file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close to the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra starts to have performance problems when the number of sstables grows to the order of a thousand, and in particular that even 1 TiB of data with the default configuration is creating too many sstables for efficient processing. This matters even more for SAI, where the number of sstables in the system can have a proportional effect on the complexity of operations.
>
> It is quite easy to create a configuration option that allows sstables to take some part of the data growth by adding a multiplier to [the shard count calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] formula, replacing
> {{2 ^ round(log2(d / (t * b))) * b}}
> with
> {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}},
> where λ is a parameter whose value is between 0 and 1.
>
> With this, a λ of 0.5 would mean that shard count and sstable size grow in parallel at the square root of the data size growth. A λ of 0 would result in no sstable size growth, and a λ of 1 in always using the same number of shards.
>
> It may also be valuable to introduce a threshold for engaging the base shard count to avoid splitting lowest-level sstables into fragments that are too small.
>
> Once both of these are in place, we can set defaults that better suit all node densities, including 10 TiB and beyond, for example:
> - target size of 1 GiB
> - λ of 1/3
> - base shard count of 4
> - minimum size 100 MiB

[jira] [Created] (CASSANDRA-18997) Unified Compaction Strategy is missing documentation
Branimir Lambov created CASSANDRA-18997: --- Summary: Unified Compaction Strategy is missing documentation Key: CASSANDRA-18997 URL: https://issues.apache.org/jira/browse/CASSANDRA-18997 Project: Cassandra Issue Type: Task Components: Documentation Reporter: Branimir Lambov UCS is missing from [the CQL documentation for 5.0|https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/ddl.html#cql-compaction-options] and [the compaction page|https://cassandra.apache.org/doc/5.0/cassandra/managing/operating/compaction/index.html#compaction-options]. We need to create a documentation page for UCS and link it from both. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782610#comment-17782610 ] Branimir Lambov commented on CASSANDRA-18710: - Yes, this looks like a 4.1 regression that is affecting all tests that are sensitive to the number of sstables. Such tests usually run in a separate keyspace (using {{KEYSPACE_PER_TEST}}) to avoid the keyspace flush that dropping a table triggers, but this new commit log recycling is triggering another flush that is not restricted to the affected keyspace. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0-beta, 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. > {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code}
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782193#comment-17782193 ] Branimir Lambov commented on CASSANDRA-18533: - I would keep it simple and not add a common settings entry under options. If necessary, the user can copy the value to both. > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > This mainly concerns cassandra yaml settings: > - {{column_index_size}}, which should also be renamed to > {{row_index_granularity}} > - {{column_index_cache_size}} > - {{index_summary_capacity}} > - {{index_summary_resize_interval}} > and possibly > - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, > {{key_cache_migrate_during_compaction}} > - {{sstable_preemptive_open_interval}} > Existing settings should be deprecated but still picked up if defined. > At this point we will not consider table-level options that make better sense > as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, > {{crc_check_chance}} and possibly {{compression}}), because we do not yet > support per-table format selection/configuration.
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782184#comment-17782184 ] Branimir Lambov commented on CASSANDRA-18533: - 1. Yes, precisely. 2. The key cache is constructed in a completely separate portion of the code, isn't it? Ignore the key cache settings (except migration), I don't think changing this is something we can do at the moment. 3. Although it is not at the moment, the row index granularity in particular should be a table-level property -- there's no real reason to use one setting for all tables, and there's an advantage to be had by making it configurable. However, things like the key cache size or index summary capacity are something to be shared, not just between tables but also potentially between formats; I don't want to get into a complicated solution for this, I would either ignore any table-level modification for these (with a warning) or check that the value is the same among all tables. This, along with format variations (e.g. "bti-fast"), is also out of scope for this ticket. > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780350#comment-17780350 ] Branimir Lambov commented on CASSANDRA-18945: - Yes, I intend to commit it to 5.0. > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0, 5.x > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Fix Version/s: 5.0 > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0, 5.x > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Bug Category: Parent values: Degradation(12984)Level 1 values: Performance Bug/Regression(12997) Complexity: Normal Discovered By: Adhoc Test Reviewers: Branimir Lambov Severity: Normal Status: Open (was: Triage Needed) > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780240#comment-17780240 ] Branimir Lambov commented on CASSANDRA-18945: - [~smiklosovic], would you be willing to be the second reviewer? > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779468#comment-17779468 ] Branimir Lambov commented on CASSANDRA-18710: - So the {{KEYSPACE_PER_TEST}} fix for unexpected flushes no longer works after CASSANDRA-17071? All of the tests that use it will be having intermittent failures unless we find a way to block this. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779444#comment-17779444 ] Branimir Lambov commented on CASSANDRA-18945: - Attached [the result of a recent benchmark|https://issues.apache.org/jira/secure/attachment/13063855/key-value-oss.html] comparing the UCS default (green) to STCS (blue) and an option with larger SSTable size (orange). The default UCS has worse results in the throughput stage, but more importantly it is unable to serve the 110k ops/s during the 1:1 and read-only stages. I'm still investigating what causes these reads to be so slow, but switching to 10GiB target fully fixes the problem (the two other options the orange graph uses, 'base_shard_count': '1' and 'max_sstables_to_compact': '32', help but are not as significant on their own). Rather than ask users to choose a target size based on their expected data density, the database should be able to deal with this itself. Admitting some of the growth into the sstable size is a good way to achieve that. > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: key-value-oss.html > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Attachment: key-value-oss.html > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: key-value-oss.html > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1830#comment-1830 ] Branimir Lambov commented on CASSANDRA-18710: - It looks like the reason for the unexpected flush is the commit log: {code:java} [junit-timeout] INFO [OptionalTasks:1] 2023-10-12 21:55:11,095 ColumnFamilyStore.java:1017 - Enqueuing flush of cql_test_keyspace_alt.table_01, Reason: COMMITLOG_DIRTY, Usage: 74.752KiB (0%) on-heap, 3.777KiB (0%) off-heap [junit-timeout] INFO [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,103 Flushing.java:154 - Writing Memtable-table_01@1180822937(6.854KiB serialized bytes, 242 ops, 74.916KiB (0%) on-heap, 3.781KiB (0%) off-heap), flushed range = [null, null) [junit-timeout] INFO [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,128 Flushing.java:180 - Completed flushing /tmp/cassandra/build/test/cassandra/data/cql_test_keyspace_alt/table_01-03e61210694a11eeb4091bdb4ac3170b/nc-1-big-Data.db (6.839KiB) ... {code} which is flushing just 242 out of the 1000 ops that the test needs per table. We need to understand what causes these {{COMMITLOG_DIRTY}} flushes, because there are quite a few tests that will fail if a flush happens at the wrong time. Or maybe somehow disable commitlog-driven flushing for tests (e.g. by setting a really large commit log space limit). > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
[jira] [Created] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
Branimir Lambov created CASSANDRA-18945: --- Summary: Unified Compaction Strategy is creating too many sstables Key: CASSANDRA-18945 URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 Project: Cassandra Issue Type: Bug Components: Local/Compaction Reporter: Branimir Lambov The unified compaction strategy currently aims to create sstables with close to the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra starts to have performance problems when the number of sstables grows to the order of a thousand, and in particular that even 1 TiB of data with the default configuration is creating too many sstables for efficient processing. This matters even more for SAI, where the number of sstables in the system can have a proportional effect on the complexity of operations. It is quite easy to create a configuration option that allows sstables to take some part of the data growth by adding a multiplier to [the shard count calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] formula, replacing {{2 ^ round(log2(d / (t * b))) * b}} with {{2 ^ round((1 - λ) * log2(d / (t * b))) * b}}, where λ is a parameter whose value is between 0 and 1. With this, a λ of 0.5 would mean that shard count and sstable size grow in parallel at the square root of the data size growth. A λ of 0 would result in no sstable size growth, and a λ of 1 in always using the same number of shards. It may also be valuable to introduce a threshold for engaging the base shard count to avoid splitting lowest-level sstables into fragments that are too small. 
Once both of these are in place, we can set defaults that better suit all node densities, including 10 TiB and beyond, for example: - target size of 1 GiB - λ of 1/3 - base shard count of 4 - minimum size 100 MiB
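As an illustration of the shard count formula in the CASSANDRA-18945 description, the following Python sketch evaluates it for the example defaults proposed there (target size of 1 GiB, base shard count of 4). The function name `shard_count`, the reading of `d` as a data size in GiB, and the fallback to `b` shards when `d` does not exceed `t * b` are assumptions made for this sketch; it is not the Cassandra implementation, and it ignores the proposed minimum size threshold. Python's `round` also rounds exact halves to the nearest even integer, which may differ from the rounding used in the real code.

```python
import math


def shard_count(d, t, b, lam=0.0):
    """Evaluate 2 ^ round((1 - lam) * log2(d / (t * b))) * b.

    d   -- data size the formula is applied to (here in GiB)
    t   -- target sstable size, in the same unit as d
    b   -- base shard count
    lam -- the growth parameter from the ticket, between 0 and 1
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    if d <= t * b:
        # Assumption for this sketch: small data sets just use the base count.
        return b
    exponent = round((1 - lam) * math.log2(d / (t * b)))
    return 2 ** exponent * b


if __name__ == "__main__":
    target_gib, base = 1, 4
    for data_gib in (1024, 10 * 1024):  # 1 TiB and 10 TiB
        for lam in (0.0, 1 / 3, 0.5, 1.0):
            shards = shard_count(data_gib, target_gib, base, lam)
            print(f"d={data_gib} GiB lam={lam:.2f}: {shards} shards, "
                  f"~{data_gib / shards:.1f} GiB per shard")
```

Under these assumptions, 1 TiB of data with λ = 0 gives 1024 shards of about 1 GiB each, which matches the "order of a thousand" sstables the ticket describes, while λ = 1/3 gives 128 shards of about 8 GiB. At 10 TiB, λ = 1/3 gives 1024 shards of about 10 GiB.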
[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params
[ https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1333#comment-1333 ] Branimir Lambov commented on CASSANDRA-18872: - The patch looks good to me, the changes are not too invasive and can be easily replaced with format configuration in CASSANDRA-18534. Do we have a documentation ticket corresponding to this? AFAICS [the docs|https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html] only mention the compression-level setting, even for 4.1. This documentation change also needs to explain that the chance only applies to compressed sstables. > Remove deprecated crc_check_chance in compression params > > > Key: CASSANDRA-18872 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18872 > Project: Cassandra > Issue Type: Task > Components: Feature/Compression, Legacy/CQL >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > Time Spent: 20m > Remaining Estimate: 0h > > crc_check_chance was moved from compression parameters and it is a standalone > table parameter. This was done in times of 3.0 so it is now time to get rid > of that in 5.0.
[jira] [Updated] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18534: Fix Version/s: 5.0 (was: 5.x) > Make sstable format configurable per table > -- > > Key: CASSANDRA-18534 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18534 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Some SSTable format settings need to be configurable per table for better > efficiency. This includes: > - {{row_index_granularity}} > - {{bloom_filter_fp_chance}} > - {{crc_check_chance}} > - {{min/max_index_interval}} > Some of these are currently configurable using direct properties of tables. > Having them as format properties makes better sense and should also support > specifying useable combinations of settings, e.g. > {code:java} > CREATE TABLE ... WITH sstable_format = "bti-fast"; > CREATE TABLE ... WITH sstable_format = "bti-small"; > {code} > where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} > e.g. as > {code:java} > sstable.format.options: > - bti-fast: > row_index_granularity: 1kiB > bloom_filter_fp_chance: 0.01 > - bti-small: > row_index_granularity: 32kiB > bloom_filter_fp_chance: 0.1 > {code}
[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773696#comment-17773696 ]

Branimir Lambov commented on CASSANDRA-18534:
---------------------------------------------

bq. Also, do you think it is possible and useful to make sstable_format contain custom parameters?

_All_ of the parameters to the SSTable format are custom, i.e. format-specific. This is also the qualifying condition for something to be moved into the format config: if you can imagine an SSTable format that does not need that flag, then it belongs to the format. E.g. bloom-filter-less formats do not need {{bloom_filter_fp_chance}}, and (even though they are not a feature of writing an SSTable) only {{BIG}} requires key cache options. Unless we are certain that CRC is the only way a format could defend against bit rot, {{crc_check_chance}} is also a format-specific property.

> Make sstable format configurable per table
> ------------------------------------------
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, Local/SSTable
> Reporter: Branimir Lambov
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params
[ https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773438#comment-17773438 ] Branimir Lambov commented on CASSANDRA-18872: - Have you looked at CASSANDRA-18534? Now that we have multiple SSTable formats, it makes a lot of sense to move properties like this into the format configuration, which in turn would mean passing a format configuration (instead of compression one) to the file handle builder. > Remove deprecated crc_check_chance in compression params > > > Key: CASSANDRA-18872 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18872 > Project: Cassandra > Issue Type: Task > Components: Feature/Compression, Legacy/CQL >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771732#comment-17771732 ]

Branimir Lambov commented on CASSANDRA-18464:
---------------------------------------------

To make the review easier, could you fork the {{apache/cassandra}} repository on github, push a branch with the changes to your fork on top of {{cassandra-5.0}}, and open a pull request against {{apache/cassandra-5.0}}?

My comments so far are these:

On [Config.java 117|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R117]:
{quote}
Using booleans makes it very unclear which options are actually valid, and what the alternative means. Please change the configuration to an enum, e.g. {{commit_log_access_mode}} with values {{direct_jna}}, {{direct}}, and {{mmap}}.
{quote}
{quote}
Actually, there should be only one direct option, and whether it uses NIO or JNA is an implementation detail that the users needn't care about. The next question is whether or not non-direct should be supported at all, and I personally prefer to not support it as this adds configuration complexity for no expected benefit. This also means that it makes sense to simply switch all other commit log segment types to be written direct, and this is simple enough to do in this ticket (especially since we dropped Java 8 and can use NIO's {{DIRECT}} option).
{quote}

On [Config.java 517|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R517]:
{quote}
When would someone need to change this?
{quote}

> Enable Direct I/O For CommitLog Files
> -------------------------------------
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
> Issue Type: New Feature
> Components: Local/Commit Log
> Reporter: Josh McKenzie
> Assignee: Amit Pawar
> Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, EnableDirectIOForCommitLogUsingNativeAPI.patch, PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
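The review suggestion to replace the two boolean flags with a single enum-valued setting could look roughly like the following. This is a minimal sketch under assumptions: the enum name {{CommitLogAccessMode}}, its constants and the {{fromConfig}} helper are hypothetical, and it follows the second review remark (one direct option only) rather than any committed Cassandra API.

```java
import java.util.Locale;

// Hypothetical sketch of the enum suggested in the review; not the committed Cassandra API.
enum CommitLogAccessMode
{
    DIRECT, // write segments bypassing the page cache
    MMAP;   // memory-mapped segments

    // Parse the yaml value, e.g. "commit_log_access_mode: direct".
    static CommitLogAccessMode fromConfig(String value)
    {
        try
        {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
        catch (IllegalArgumentException e)
        {
            throw new IllegalArgumentException("commit_log_access_mode must be one of: direct, mmap; got: " + value, e);
        }
    }
}
```

With an enum, an invalid combination such as "JNA enabled but direct I/O disabled" cannot be expressed at all, which is the point the review makes about booleans.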
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771432#comment-17771432 ]

Branimir Lambov commented on CASSANDRA-18464:
---------------------------------------------

There was a typo in my response above, I am in favour of having the patch land in 5.0.

Just the 512 vs 4k difference is not something I would personally consider a good reason to include the JNA writing; the sync segments are usually much larger than that. I would rather go with the simpler NIO option.

I can't find my code comments with the link above any more. They are [here|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#r128716588].

> Enable Direct I/O For CommitLog Files
> -------------------------------------
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
> Issue Type: New Feature
> Components: Local/Commit Log
> Reporter: Josh McKenzie
> Assignee: Amit Pawar
> Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, EnableDirectIOForCommitLogUsingNativeAPI.patch, PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>
> I shared my investigation about Commitlog I/O issue on large core count
> system in my previous email dated July-22 and link to the thread is given
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
> # Multi-threaded syncing
> # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the
> first one
> # Direct I/O read/write throughput is very high compared to non-Direct I/O. Learnt through FIO benchmarking.
> # Reduces kernel file cache uses which in-turn reduces kernel I/O activity for Commitlog files only.
> # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 30% for Commitlog syncer thread with Direct I/O feature
> # Direct I/O implementation is easier compared to multi-threaded
> As per the community suggestion, less in code complex is good to have. Direct
> I/O enablement looked promising but there was one issue.
> Java version 8 does not have native support to enable Direct I/O. So, JNA
> library usage is must. The same implementation should also work across other
> versions of Java (like 11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch
> changes are given below.
> # This implementation is not using Java file channels and file is opened through JNA to use Direct I/O feature.
> # New Segment are defined named “DirectIOSegment” for Direct I/O and “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose only).
> # JNA write call is used to flush the changes.
> # New helper functions are defined in NativeLibrary.java and platform specific file. Currently tested on Linux only.
> # Patch allows user to configure optimum block size and alignment if default values are not OK for CommitLog disk.
> # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the required size)
> Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark
> was tested. It works with both Java 8 and 11 versions. Compressed and
> Encrypted based segments are not supported yet and it can be enabled later
> based on the Community feedback.
> Following improvement are seen with Direct I/O enablement.
> # 32 cores >= ~15%
> # 64 cores >= ~80%
> Also, another observation would like to share here. Reading Commitlog files
> with Direct I/O might help in reducing node bring-up time after the node
> crash.
> Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
> The attached patch enables Direct I/O feature for Commitlog files. Please
> check and share your feedback.
[jira] [Created] (CASSANDRA-18894) Drop commitlog chain marker updates
Branimir Lambov created CASSANDRA-18894: --- Summary: Drop commitlog chain marker updates Key: CASSANDRA-18894 URL: https://issues.apache.org/jira/browse/CASSANDRA-18894 Project: Cassandra Issue Type: Improvement Components: Local/Commit Log Reporter: Branimir Lambov CASSANDRA-13987 added a periodic update of the last commit log chain marker in order to allow for data in memory-mapped segments to be recovered even if it was not part of a synced segment. A much simpler way to do this is something in the vein of CASSANDRA-16482, i.e. ignoring an empty sync marker for the last entry in the commit log. We could do this by default if the commit log is uncompressed (and possibly only if using memory mapping after CASSANDRA-18464). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18464: Reviewers: Branimir Lambov Status: Review In Progress (was: Patch Available) > Enable Direct I/O For CommitLog Files > - > > Key: CASSANDRA-18464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18464 > Project: Cassandra > Issue Type: New Feature > Components: Local/Commit Log >Reporter: Josh McKenzie >Assignee: Amit Pawar >Priority: Normal > Fix For: 5.x > > Attachments: CommitLogStressTest.patch, > EnableDirectIOForCommitLogUsingNativeAPI.patch, > PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, > UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770415#comment-17770415 ]

Branimir Lambov commented on CASSANDRA-18464:
---------------------------------------------

This patch is very valuable, and I support it going into 5.0 as well as 5.1.

In separate tests we have often found a memory-mapped commit log to be a serious performance problem for a node with a lot of data. Even without DIRECT or JNA, not using `msync` is making a huge difference. Because of this most of the performance testing I personally do is done with compressed commit log.

I added comments to [the latest published branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with some suggested changes.

I am curious, if the NIO option is constructed correctly (with aligned direct buffers, possibly also issuing the writes to be page-aligned and containing whole pages), is it still copying to internal buffers?

> Enable Direct I/O For CommitLog Files
> -------------------------------------
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
> Issue Type: New Feature
> Components: Local/Commit Log
> Reporter: Josh McKenzie
> Assignee: Amit Pawar
> Priority: Normal
> Fix For: 5.x
>
> Attachments: CommitLogStressTest.patch, EnableDirectIOForCommitLogUsingNativeAPI.patch, PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
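For reference, a write through NIO with aligned direct buffers and whole-block sizes, as described in the question above, might be set up as in the sketch below. It uses the JDK's {{ExtendedOpenOption.DIRECT}}, {{ByteBuffer.alignedSlice}} and {{FileStore.getBlockSize}} (all available since Java 10); the class {{DirectWriteSketch}} and its methods are hypothetical names, the block size is assumed to be a power of two, and the sketch does not answer whether the JDK still copies into internal buffers in this setup.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the patch under review.
final class DirectWriteSketch
{
    // Round size up to the next multiple of blockSize (blockSize > 0).
    static int roundUp(int size, int blockSize)
    {
        return Math.toIntExact(((long) size + blockSize - 1) / blockSize * blockSize);
    }

    // Allocate a direct buffer whose start address and capacity are multiples of blockSize.
    // blockSize is assumed to be a power of two, as alignedSlice requires.
    static ByteBuffer allocateAligned(int size, int blockSize)
    {
        int capacity = roundUp(size, blockSize);
        // Over-allocate by one block so an aligned window of 'capacity' bytes always fits.
        ByteBuffer raw = ByteBuffer.allocateDirect(capacity + blockSize);
        ByteBuffer aligned = raw.alignedSlice(blockSize);
        aligned.limit(capacity);
        return aligned.slice();
    }

    // Write 'data', zero-padded to whole blocks, at the start of 'path' using O_DIRECT.
    static void writeDirect(Path path, byte[] data) throws IOException
    {
        int blockSize = Math.toIntExact(Files.getFileStore(path.toAbsolutePath().getParent()).getBlockSize());
        ByteBuffer buffer = allocateAligned(data.length, blockSize);
        buffer.put(data);
        buffer.position(0); // write the whole padded buffer; the limit is a block multiple
        try (FileChannel channel = FileChannel.open(path,
                                                    StandardOpenOption.CREATE,
                                                    StandardOpenOption.WRITE,
                                                    ExtendedOpenOption.DIRECT))
        {
            while (buffer.hasRemaining())
                channel.write(buffer, buffer.position());
        }
    }
}
```

Not every file system accepts direct I/O, so {{writeDirect}} may fail with an {{IOException}} on some mounts (tmpfs is a common example).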
[jira] [Commented] (CASSANDRA-18773) Compactions are slow
[ https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769063#comment-17769063 ]

Branimir Lambov commented on CASSANDRA-18773:
---------------------------------------------

There's some leftover code in the trunk version; apart from that, the newer versions look good.

> Compactions are slow
> --------------------
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Cameron Zemek
> Assignee: Cameron Zemek
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png, stress.yaml
>
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow
> (for example major compactions). I have attached a cassandra stress profile
> that can generate such a dataset under ccm. In my local test I have 2567
> sstables at 4Mb each.
> I added code to track wall clock time of various parts of the code. One
> problematic part is the ManyToOne constructor. Tracing through the code, a
> ManyToOne is created over all the sstable iterators for every partition.
> In my local test I get a measly 60Kb/sec read speed, bottlenecked on a
> single CPU core (since this code is single threaded), with 85% of the wall
> clock time spent in the ManyToOne constructor.
> As another datapoint to show it's the merge iterator part of the code, using
> the cfstats from [https://github.com/instaclustr/cassandra-sstable-tools/],
> which reads all the sstables but does no merging, gets 26Mb/sec read speed.
> Tracking back from the ManyToOne call I see this in
> UnfilteredPartitionIterators::merge
> {code:java}
> for (int i = 0; i < toMerge.size(); i++)
> {
>     if (toMerge.get(i) == null)
>     {
>         if (null == empty)
>             empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
>         toMerge.set(i, empty);
>     }
> }
> {code}
> Not sure what the purpose of creating these empty rows is. But on a whim I
> removed all these empty iterators before passing to ManyToOne and then all
> the wall clock time shifted to CompactionIterator::hasNext() and read speed
> increased to 1.5Mb/s.
> So there are further bottlenecks in this code path it seems, but the first is
> this ManyToOne and having to build it for every partition read.
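The experiment described in the report (dropping absent sources before building the merge, instead of substituting an empty iterator for each) can be illustrated in isolation with a plain k-way merge. This is a simplified sketch over sorted integer iterators, not Cassandra's {{MergeIterator}}; {{SparseMergeSketch}} and {{Head}} are hypothetical names, and whether the real merge can safely drop sources depends on whether anything relies on source positions, which the ticket text does not settle.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified stand-in for a many-to-one merge; not Cassandra code.
final class SparseMergeSketch
{
    // One heap entry: the current head of a source plus the rest of that source.
    private static final class Head
    {
        final int value;
        final Iterator<Integer> rest;

        Head(int value, Iterator<Integer> rest)
        {
            this.value = value;
            this.rest = rest;
        }
    }

    // Merge sorted sources; a null entry stands for an sstable that lacks the partition.
    static List<Integer> merge(List<Iterator<Integer>> sources)
    {
        PriorityQueue<Head> heap = new PriorityQueue<>((a, b) -> Integer.compare(a.value, b.value));
        for (Iterator<Integer> source : sources)
        {
            // Skip absent or exhausted sources up front, so the heap holds only live ones.
            if (source != null && source.hasNext())
                heap.add(new Head(source.next(), source));
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty())
        {
            Head head = heap.poll();
            out.add(head.value);
            if (head.rest.hasNext())
                heap.add(new Head(head.rest.next(), head.rest));
        }
        return out;
    }
}
```

With thousands of sstables and only a few containing a given partition, the setup cost here scales with the number of live sources rather than with the total number of sstables.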
[jira] [Commented] (CASSANDRA-18873) Fix broken JMH benchmarks
[ https://issues.apache.org/jira/browse/CASSANDRA-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768591#comment-17768591 ]

Branimir Lambov commented on CASSANDRA-18873:
---------------------------------------------

{quote}
* ReadSmallPartitionsBench (assertion error)
* ReadWidePartitionsBench (assertion error)
{quote}
These two tests need larger memtable size allocation to produce useable output.

One way to "fix" this is to replace {{INMEM}} with {{NO}} for the default {{flush}}, which will make it ignore the fact that part of the data is in an sstable; another is to reduce the default {{count}} by an order of magnitude. Both of these changes would make the test less suitable for what it is primarily meant to measure (access time with a non-trivial data size in a single memtable/sstable).

> Fix broken JMH benchmarks
> -------------------------
>
> Key: CASSANDRA-18873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18873
> Project: Cassandra
> Issue Type: Bug
> Components: Test/benchmark
> Reporter: Jacek Lewandowski
> Priority: Normal
> Attachments: BenchTimeTest.java, jmh-AtomicBtreePartitionUpdateBench.log, jmh-BloomFilterSerializerBench.log, jmh-KeyLookupBench.log, jmh-ReadSmallPartitionsBench.log, jmh-ReadWidePartitionsBench.log
>
>
> The following benchmarks are broken:
> * {{ZeroCopyStreamingBench}}
> * {{MutationBench}}
> * {{FastThreadLocalBench}}
> * {{AtomicBTreePartitionUpdateBench}} (OOM on Jenkins)
> * {{ReadSmallPartitionsBench}} (assertion error)
> * {{ReadWidePartitionsBench}} (assertion error)
> * {{BloomFilterSerializerBench}} (NPE)
> * {{KeyLookupBench}} (IAE)
> Additionally, those benchmarks take too much time to run:
> * {{BTreeUpdateBench}} ~ 58 hours
> * {{AtomicBTreePartitionUpdateBench}} ~ 5 hours
> * {{BTreeTransformBench}} ~ 2.5 hours
> Here the complete list of estimated benchmark times:
> {noformat}
> Estimated time for CacheLoaderBench: ~5 s
> Estimated time for LatencyTrackingBench: ~26 s
> Estimated time for SampleBench: ~30 s
> Estimated time for ReadWriteBench: ~30 s
> Estimated time for MutationBench: ~30 s
> Estimated time for CompactionBench: ~35 s
> Estimated time for DiagnosticEventPersistenceBench: ~40 s
> Estimated time for ZeroCopyStreamingBench: ~44 s
> Estimated time for BatchStatementBench: ~110 s
> Estimated time for DiagnosticEventServiceBench: ~120 s
> Estimated time for MessageOutBench: ~144 s
> Estimated time for BloomFilterSerializerBench: ~144 s
> Estimated time for FastThreadLocalBench: ~156 s
> Estimated time for HashingBench: ~156 s
> Estimated time for ChecksumBench: ~208 s
> Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
> Estimated time for PendingRangesBench: ~ 5 m
> Estimated time for DirectorySizerBench: ~ 5 m
> Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
> Estimated time for PreaggregatedByteBufsBench: ~ 7 m
> Estimated time for AutoBoxingBench: ~ 8 m
> Estimated time for OutputStreamBench: ~ 13 m
> Estimated time for BTreeBuildBench: ~ 13 m
> Estimated time for StringsEncodeBench: ~ 20 m
> Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
> Estimated time for btree.BTreeBuildBench: ~ 30 m
> Estimated time for BTreeSearchIteratorBench: ~ 31 m
> Estimated time for btree.BTreeTransformBench: ~ 138 m
> Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
> Estimated time for btree.BTreeUpdateBench: ~58 h
> Total estimated time: ~69 h
> {noformat}
> I'd like to add a test which estimates the benchmark times and fails if a
> single benchmark estimated run time is longer than xxx minutes.
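The kind of estimate the description asks for can be approximated from a benchmark's declared settings. The sketch below is an assumption about how such an estimate might be computed (forks times the sum of warmup and measurement time, multiplied by benchmark methods and parameter combinations); it is not the attached {{BenchTimeTest.java}}, and {{BenchTimeEstimate}} with its methods are hypothetical names.

```java
// Hypothetical estimator; the formula is an assumption, not taken from the ticket.
final class BenchTimeEstimate
{
    // Seconds for one benchmark method with one parameter combination.
    static long secondsPerRun(int forks, int warmupIterations, int warmupSeconds,
                              int measurementIterations, int measurementSeconds)
    {
        return (long) forks * ((long) warmupIterations * warmupSeconds
                               + (long) measurementIterations * measurementSeconds);
    }

    // Total seconds for a benchmark class: every method runs once per parameter combination.
    static long estimateSeconds(int methods, int parameterCombinations, long secondsPerRun)
    {
        return (long) methods * parameterCombinations * secondsPerRun;
    }

    // True when the estimate is above the allowed budget, i.e. the guarding test should fail.
    static boolean exceedsBudget(long estimatedSeconds, long budgetMinutes)
    {
        return estimatedSeconds > budgetMinutes * 60;
    }
}
```

For example, 2 forks with 3 warmup iterations of 1 s and 5 measurement iterations of 2 s give 26 s per run; with 2 methods and 3 parameter combinations that is 156 s, which exceeds a 2 minute budget but not a 3 minute one.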
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767594#comment-17767594 ]

Branimir Lambov commented on CASSANDRA-18533:
---------------------------------------------

Absolutely.

> Move format-specific sstable options into the format configuration
> ------------------------------------------------------------------
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Branimir Lambov
> Priority: Normal
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}},
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet
> support per-table format selection/configuration.
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Fix Version/s: 3.11.17 4.0.12 4.1.4 5.0-alpha2 5.1 Source Control Link: https://github.com/apache/cassandra/pull/2656 Resolution: Fixed Status: Resolved (was: Ready to Commit) Commited ([3.11|https://github.com/apache/cassandra/commit/87c2af85c1305c130af7d66f83dec03a1c4a8bb2] [4.0|https://github.com/apache/cassandra/commit/c6385ac3ddccabdc7cb650b090fa69c0523274e8] [4.1|https://github.com/apache/cassandra/commit/db6641fbb6fd0c439e14f94caecdeee999311c62] [5.0|https://github.com/apache/cassandra/commit/a23f4c0b15c684240ef0bcd55875610e8bd7179b] [trunk|https://github.com/apache/cassandra/commit/970ec2d1db5770c13a42e1f2862ea398317d0f15]) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 3.11.17, 4.0.12, 4.1.4, 5.0-alpha2, 5.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. 
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Review In Progress (was: Needs Committer) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Ready to Commit (was: Review In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Needs Committer (was: Patch Available) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Patch Available (was: Requires Testing) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Requires Testing (was: Review In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Reviewers: Branimir Lambov, Michael Semb Wever (was: Michael Semb Wever) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Reviewers: Branimir Lambov, Michael Semb Wever, Branimir Lambov (was: Branimir Lambov, Michael Semb Wever) Status: Review In Progress (was: Patch Available) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Test and Documentation Plan: CI Status: Patch Available (was: In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504 ] Branimir Lambov edited comment on CASSANDRA-18871 at 9/21/23 10:47 AM: --- Yes, the parameter passing works great for me now. Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine. was (Author: blambov): Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine. > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof asyc...}} in particular) whenever we > run {{microbench-with-profiler}} task. If no additional properties are > provided some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket. 
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767504#comment-17767504 ] Branimir Lambov commented on CASSANDRA-18871: - Async profiler also tested under WSL2/Ubuntu 20.04 -- works fine. > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof asyc...}} in particular) whenever we > run {{microbench-with-profiler}} task. If no additional properties are > provided some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18871) JMH benchmark improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767113#comment-17767113 ] Branimir Lambov commented on CASSANDRA-18871: - Can one specify a specific benchmark to run? Would it be too hard to also add other parameters? > JMH benchmark improvements > -- > > Key: CASSANDRA-18871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18871 > Project: Cassandra > Issue Type: Improvement > Components: Build, Legacy/Tools >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.1 > > > 1. CASSANDRA-12586 introduced {{build-jmh}} task which builds uber jar for > JMH benchmarks which is then not used with {{ant microbench}} task. It is > used though by the {{test/bin/jmh}} script. > In fact, I have no idea why we should use uber jar if JMH can perfectly run > with a regular classpath. Maybe that had something to do with older JMH > version which was used that time. Building uber jars takes time and is > annoying. Since it seems to be redundant anyway, I'm going to remove it and > fix {{test/bin/jmh}} to use a regular classpath. > 2. I'll add support for async profiler in benchmarks. That is, the > {{microbench}} target automatically fetches the async profiler binaries and > adds the necessary args for JMH ({{-prof asyc...}} in particular) whenever we > run {{microbench-with-profiler}} task. If no additional properties are > provided some default options will be applied (defined in the script, can be > negotiated). Otherwise, whatever is passed to the {{profiler.opts}} property > will be added as profiler options after library path and target directory > definition. > 3. If someone wants to see any additional improvements, please comment on the > ticket. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
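Item 2 of the ticket description says that {{profiler.opts}}, when given, is appended after the library path and target directory, and that defaults apply otherwise. A minimal sketch of that assembly rule follows. Everything named here is an assumption for illustration: the class {{ProfilerArgs}}, the default {{output=flamegraph}}, and the example paths are hypothetical, and the {{libPath}}/{{dir}} keys follow the option format of JMH's async profiler integration as best understood rather than anything confirmed by the ticket.

```java
import java.util.Arrays;
import java.util.List;

final class ProfilerArgs {
    // Assumed default; the ticket says the actual defaults live in the script and can be negotiated.
    static final String DEFAULT_OPTS = "output=flamegraph";

    // Builds the two JMH arguments: "-prof" and "async:<library path>;<target dir>;<options>".
    static List<String> build(String libPath, String targetDir, String profilerOpts) {
        boolean useDefault = profilerOpts == null || profilerOpts.trim().isEmpty();
        String extra = useDefault ? DEFAULT_OPTS : profilerOpts.trim();
        return Arrays.asList("-prof", "async:libPath=" + libPath + ";dir=" + targetDir + ";" + extra);
    }

    public static void main(String[] args) {
        // Hypothetical paths; profiler.opts is read from a system property for the demo.
        System.out.println(build("/opt/async-profiler/libasyncProfiler.so",
                                 "build/profiler",
                                 System.getProperty("profiler.opts")));
    }
}
```

Keeping the library path and output directory first means a user-supplied option string only ever extends the argument and cannot displace the parts the build needs to locate the profiler.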
[jira] [Commented] (CASSANDRA-15284) AssertionError while scrubbing sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765693#comment-17765693 ] Branimir Lambov commented on CASSANDRA-15284: - Thank you for the excellent investigation and write-up. It does look like there are many copies of the same position info in the writer, and that they aren't correctly handled. In addition to what you describe, {{padToPageBoundary}} isn't adjusting the compressed size, which can cause the same problem. I agree we need to get rid of the extras: {{lastFlushOffset}} and {{compressedSize}} should both be eliminated, their usages replaced with {{bufferOffset}}, and {{compressedSize}} should be taken from {{chunkOffset}}. Dealing with the broken full CRC is a bit more involved, but we could flag if {{resetAndTruncate}} needed to back through to a different chunk, and do a full pass over the file to recalculate it on completion if flagged. > AssertionError while scrubbing sstable > -- > > Key: CASSANDRA-15284 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15284 > Project: Cassandra > Issue Type: Bug > Components: Feature/Compression >Reporter: Gianluigi Tiesi >Priority: Normal > Fix For: 3.11.x, 4.0.x, 4.1.x > > Attachments: assert-comp-meta.diff > > > I've got a damaged data file but while trying to run scrub (online or > offline) I always get this > error: > > {code:java} > -- StackTrace -- > java.lang.AssertionError > at > org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:474) > at > org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:239) > at > org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:163) > at > org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73) > at > org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61) > at > org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) > at > 
org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362) > at > org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:331) > at > org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinal(BigTableWriter.java:336) > at > org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinalEarly(BigTableWriter.java:318) > at > org.apache.cassandra.io.sstable.SSTableRewriter.switchWriter(SSTableRewriter.java:322) > at > org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:370) > at > org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) > at > org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:184) > at > org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:357) > at > org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:291) > at > org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:1010) > at > org.apache.cassandra.db.compaction.CompactionManager.access$200(CompactionManager.java:83) > at > org.apache.cassandra.db.compaction.CompactionManager$3.execute(CompactionManager.java:391) > at > org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:312) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > at java.lang.Thread.run(Thread.java:748) > {code} > At the moment I've moved away the corrupted file. If you need more info feel > free to ask > > According to the source > [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java#L474] > 
looks like the requested chunk length is <= 0 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
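The "flag and recompute" idea for the full CRC in the comment above can be sketched with an in-memory writer. This is a simplified illustration under assumptions, not Cassandra's compressed writer: {{ChecksummedWriter}}, {{needsFullPass}} and {{finish}} are hypothetical names, chunks are fixed-size uncompressed slices, and the running CRC is only advanced when a chunk completes.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical, simplified writer; real chunks are compressed and written to a file.
final class ChecksummedWriter {
    private final int chunkLength;
    private final CRC32 runningCrc = new CRC32();
    private byte[] buffer = new byte[64];
    private int length;          // bytes currently in the "file"
    private int flushedLength;   // bytes already folded into runningCrc (whole chunks only)
    private boolean crcInvalid;  // set when a truncate reached back into a completed chunk

    ChecksummedWriter(int chunkLength) {
        this.chunkLength = chunkLength;
    }

    void write(byte[] bytes) {
        if (length + bytes.length > buffer.length)
            buffer = Arrays.copyOf(buffer, Math.max(buffer.length * 2, length + bytes.length));
        System.arraycopy(bytes, 0, buffer, length, bytes.length);
        length += bytes.length;
        // Fold every completed chunk into the running checksum.
        while (length - flushedLength >= chunkLength) {
            runningCrc.update(buffer, flushedLength, chunkLength);
            flushedLength += chunkLength;
        }
    }

    void resetAndTruncate(int mark) {
        if (mark < 0 || mark > length)
            throw new IllegalArgumentException("mark out of range: " + mark);
        if (mark < flushedLength) {
            // The running CRC already covers bytes being discarded; it cannot be rewound.
            crcInvalid = true;
            flushedLength = (mark / chunkLength) * chunkLength;
        }
        length = mark;
    }

    boolean needsFullPass() {
        return crcInvalid;
    }

    // Returns the checksum of the final content, doing a full pass only if flagged.
    long finish() {
        if (crcInvalid) {
            runningCrc.reset();
            runningCrc.update(buffer, 0, length);
        } else {
            runningCrc.update(buffer, flushedLength, length - flushedLength);
        }
        return runningCrc.getValue();
    }
}
```

A truncate that stays inside the current, not yet completed chunk costs nothing extra; only a truncate that crosses back into an earlier chunk pays for the full pass at completion.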
[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761375#comment-17761375 ] Branimir Lambov commented on CASSANDRA-18756: - There's a risk that making it work as intended will unexpectedly change behaviour for people that are already using the flag, and I would rather not do that. Especially if we change it in patch releases for all supported versions. If we are to enable that functionality, IMHO it should be under a different flag (and then only for a subset of versions). > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Open (was: Triage Needed) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761314#comment-17761314 ] Branimir Lambov commented on CASSANDRA-18756: - [~mck], could you do the second review of this small patch which corrects a problem with CASSANDRA-13418? Your name came up as a reviewer for that ticket and it would be great to get the opinion of someone who has some context on it. > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761307#comment-17761307 ] Branimir Lambov commented on CASSANDRA-18534: - We can't remove the existing option, only deprecate it (i.e. give a warning that it may be removed in a later version). We also have to honor the value if it is present. I agree that we should throw an exception if both versions are given in the DDL. The complication is what happens if the format-side option is given in the yaml: in this case I think we should let the table-side option override it even if it is given in the legacy way (with perhaps a deprecation warning). > Make sstable format configurable per table > -- > > Key: CASSANDRA-18534 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18534 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > Some SSTable format settings need to be configurable per table for better > efficiency. This includes: > - {{row_index_granularity}} > - {{bloom_filter_fp_chance}} > - {{crc_check_chance}} > - {{min/max_index_interval}} > Some of these are currently configurable using direct properties of tables. > Having them as format properties makes better sense and should also support > specifying useable combinations of settings, e.g. > {code:java} > CREATE TABLE ... WITH sstable_format = "bti-fast"; > CREATE TABLE ... WITH sstable_format = "bti-small"; > {code} > where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} > e.g. 
as > {code:java} > sstable.format.options: > - bti-fast: > row_index_granularity: 1kiB > bloom_filter_fp_chance: 0.01 > - bti-small: > row_index_granularity: 32kiB > bloom_filter_fp_chance: 0.1 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
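The precedence rules proposed in the comment above (reject the option when both forms appear in the DDL, honor the legacy table property with a deprecation warning, and let any table-side value override the yaml format option) can be written down as a small resolver. This is a sketch of the proposal only; {{FormatOptionResolver}} and its parameters are hypothetical names and do not correspond to existing Cassandra code.

```java
import java.util.ArrayList;
import java.util.List;

final class FormatOptionResolver {
    // Each argument is null when the option was not given in that place.
    static String resolve(String option,
                          String ddlFormatValue,   // given through the new format-side syntax in the DDL
                          String ddlLegacyValue,   // given as a legacy direct table property
                          String yamlFormatValue,  // given in the format configuration in cassandra.yaml
                          List<String> warnings) {
        if (ddlFormatValue != null && ddlLegacyValue != null)
            throw new IllegalArgumentException(option + " is given both as a table property and as a format option");
        if (ddlFormatValue != null)
            return ddlFormatValue;
        if (ddlLegacyValue != null) {
            // Still honored, but flagged so users can migrate before it is removed.
            warnings.add(option + " as a table property is deprecated and may be removed in a later version");
            return ddlLegacyValue;
        }
        return yamlFormatValue; // may be null, meaning the built-in default applies
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<String>();
        // Legacy table property wins over the yaml value and records one warning.
        System.out.println(resolve("bloom_filter_fp_chance", null, "0.01", "0.1", warnings)); // prints 0.01
        System.out.println(warnings.size()); // prints 1
    }
}
```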
[jira] [Commented] (CASSANDRA-16333) Document provide_overlapping_tombstones compaction option
[ https://issues.apache.org/jira/browse/CASSANDRA-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760978#comment-17760978 ] Branimir Lambov commented on CASSANDRA-16333: - I don't think this describes it correctly. When compacting data, it will check if a row is deleted by some tombstone in a newer table that does not participate in the compaction. If it is, it will drop the row from the result. If this manages to result in the partition completely being removed from the result of the compaction, this will make the tombstone that deletes the row purgeable. To be honest, I had hopes that this can help solve the tombstones problem when I wrote the code, but now I'm not very confident that it is worth using. When faced with accumulation of tombstones, performing a major compaction or switching to LCS or levelled UCS is a better option. Ultimately the problem should be solved by something that removes the tombstone factor from queries, and we have something in mind as the long-term solution. > Document provide_overlapping_tombstones compaction option > - > > Key: CASSANDRA-16333 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16333 > Project: Cassandra > Issue Type: Improvement > Components: Documentation >Reporter: Paulo Motta >Assignee: Sumanth Pasupuleti >Priority: Normal > > This option was added on CASSANDRA-7019 but it's not documented. We should > add it to > https://cassandra.apache.org/doc/latest/operating/compaction/index.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
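The row-dropping behaviour described in the comment above can be illustrated with a toy model. The sketch assumes a single partition, rows keyed by a string with one write timestamp each, and tombstones that come from an sstable not participating in the compaction; {{OverlappingTombstoneFilter}} is a hypothetical name, and the tie rule (a deletion at the same timestamp shadows the row) is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OverlappingTombstoneFilter {
    // rowWriteTimes: row key -> write timestamp, for rows in the sstables being compacted.
    // outsideTombstones: row key -> deletion timestamp, from sstables outside the compaction.
    static Map<String, Long> compact(Map<String, Long> rowWriteTimes, Map<String, Long> outsideTombstones) {
        Map<String, Long> kept = new LinkedHashMap<String, Long>();
        for (Map.Entry<String, Long> row : rowWriteTimes.entrySet()) {
            Long deletedAt = outsideTombstones.get(row.getKey());
            boolean shadowed = deletedAt != null && deletedAt >= row.getValue();
            if (!shadowed)
                kept.put(row.getKey(), row.getValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Long> rows = new LinkedHashMap<String, Long>();
        rows.put("r1", 10L); // written before the tombstone, so it is dropped
        rows.put("r2", 30L); // written after the tombstone, so it survives
        Map<String, Long> tombstones = new LinkedHashMap<String, Long>();
        tombstones.put("r1", 20L);
        tombstones.put("r2", 25L);
        System.out.println(compact(rows, tombstones)); // prints {r2=30}
    }
}
```

If the returned map is empty, the partition disappears from the compaction result, which is the case the comment identifies as making the shadowing tombstone purgeable.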
[jira] [Commented] (CASSANDRA-18773) Compactions are slow
[ https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759864#comment-17759864 ]

Branimir Lambov commented on CASSANDRA-18773:
---------------------------------------------

{quote}This does copying of the rows into memory to pass across to the writer
{quote}
Partitions can have millions of rows; we cannot afford to copy something this big to memory. The current materialization point is the row and even that is a problem for some use cases.

Processing the read in another thread adds a lot of synchronization overhead, changes the balance between request serving and compaction, and in a busy node can actually result in compaction failing to make progress.

We have instead been focusing on making compaction more parallelizable (see e.g. UCS/CEP-26/CASSANDRA-18397). UCS can already parallelize major compactions when there is no overlap, and we can do even better if we provide the split points to the compaction process in advance (see CASSANDRA-18802).

> Compactions are slow
> --------------------
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Cameron Zemek
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png, stress.yaml
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow
> (for example major compactions). I have attached a cassandra stress profile
> that can generate such a dataset under ccm. In my local test I have 2567
> sstables at 4Mb each.
> I added code to track wall clock time of various parts of the code. One
> problematic part is the ManyToOne constructor. Tracing through the code shows
> that a ManyToOne is created for every partition, over all the sstable
> iterators. In my local test I get a measly 60Kb/sec read speed, bottlenecked
> on a single CPU core (since this code is single threaded), with 85% of the
> wall clock time spent in the ManyToOne constructor.
> As another datapoint to show it's the merge iterator part of the code: using
> the cfstats from [https://github.com/instaclustr/cassandra-sstable-tools/],
> which reads all the sstables but does no merging, gets 26Mb/sec read speed.
> Tracking back from the ManyToOne call I see this in
> UnfilteredPartitionIterators::merge
> {code:java}
> for (int i = 0; i < toMerge.size(); i++)
> {
>     if (toMerge.get(i) == null)
>     {
>         if (null == empty)
>             empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
>         toMerge.set(i, empty);
>     }
> }
> {code}
> Not sure what the purpose of creating these empty rows is. But on a whim I
> removed all these empty iterators before passing to ManyToOne and then all
> the wall clock time shifted to CompactionIterator::hasNext() and read speed
> increased to 1.5Mb/s.
> So there are further bottlenecks in this code path it seems, but the first is
> this ManyToOne and having to build it for every partition read.
[jira] [Created] (CASSANDRA-18802) Extend compaction interfaces to provide split points at operation start
Branimir Lambov created CASSANDRA-18802:
----------------------------------------

Summary: Extend compaction interfaces to provide split points at operation start
Key: CASSANDRA-18802
URL: https://issues.apache.org/jira/browse/CASSANDRA-18802
Project: Cassandra
Issue Type: Improvement
Components: Local/Compaction
Reporter: Branimir Lambov

The current compaction interfaces allow a compaction strategy to split at arbitrary points while it is writing output. In some cases (e.g. UCS) we know in advance where we want to split. Giving this information before the operation starts allows it to operate on multiple segments of the output in parallel, i.e. parallelize within an operation rather than between operations, which can reduce individual operations' duration and significantly improve the DB's chances of keeping up with load, especially on L0.
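To illustrate the idea in CASSANDRA-18802, here is a minimal sketch of what knowing the split points in advance makes possible: the key range is cut into segments up front and each segment is handed to its own worker. This is not Cassandra code. The class and method names are hypothetical, plain long values stand in for tokens, and counting keys stands in for the real work of writing a segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: none of these names are Cassandra APIs.
class SplitPointSketch
{
    // Cuts sorted keys into segments at the given ascending split points.
    // Segment i holds every key k with splitPoints[i - 1] <= k < splitPoints[i].
    static List<List<Long>> segment(List<Long> sortedKeys, long[] splitPoints)
    {
        List<List<Long>> segments = new ArrayList<>();
        for (int i = 0; i <= splitPoints.length; i++)
            segments.add(new ArrayList<>());

        int current = 0;
        for (long key : sortedKeys)
        {
            // Advance to the segment whose range contains this key.
            while (current < splitPoints.length && key >= splitPoints[current])
                current++;
            segments.get(current).add(key);
        }
        return segments;
    }

    // Processes every segment on a thread pool; the "work" here is just counting keys.
    static List<Integer> processInParallel(List<List<Long>> segments, int threads)
    {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try
        {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<Long> segment : segments)
            {
                Callable<Integer> task = segment::size;
                futures.add(pool.submit(task));
            }
            List<Integer> sizes = new ArrayList<>();
            for (Future<Integer> future : futures)
                sizes.add(future.get());
            return sizes;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
        finally
        {
            pool.shutdown();
        }
    }
}
```

Because the segments are fixed before any output is written, no worker needs to coordinate with another about where one output ends and the next begins, which is the property the ticket relies on.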
[jira] [Comment Edited] (CASSANDRA-18773) Compactions are slow
[ https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759595#comment-17759595 ]

Branimir Lambov edited comment on CASSANDRA-18773 at 8/28/23 2:11 PM:
----------------------------------------------------------------------

The crux of the issue here is likely in this comment in {{UnfilteredPartitionIterators.merge}}:
{code:java}
// Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
// Non-present iterator will thus be set to empty in getReduced.
{code}
The need for this is, indeed, a serious problem in merges of many sstables in key-value tables (i.e. ones containing only one row per partition) that we have not yet tried to address.

I expect that simply changing the code of {{merge}} to only present partitions selected by the partition merger, for example by doing
{code:java}
diff --git a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java
--- a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (revision 052a26474108febad545d6528bb203ecf19b22e5)
+++ b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (date 1693230551402)
@@ -113,16 +113,11 @@
     private final List<UnfilteredRowIterator> toMerge = new ArrayList<>(iterators.size());

     private DecoratedKey partitionKey;
-    private boolean isReverseOrder;

     public void reduce(int idx, UnfilteredRowIterator current)
     {
         partitionKey = current.partitionKey();
-        isReverseOrder = current.isReverseOrder();
-
-        // Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
-        // Non-present iterator will thus be set to empty in getReduced.
-        toMerge.set(idx, current);
+        toMerge.add(current);
     }

     @SuppressWarnings("resource")
@@ -132,20 +127,6 @@
             ? null
             : listener.getRowMergeListener(partitionKey, toMerge);

-        // Make a single empty iterator object to merge, we don't need toMerge.size() copiess
-        UnfilteredRowIterator empty = null;
-
-        // Replace nulls by empty iterators
-        for (int i = 0; i < toMerge.size(); i++)
-        {
-            if (toMerge.get(i) == null)
-            {
-                if (null == empty)
-                    empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
-                toMerge.set(i, empty);
-            }
-        }
-
         return UnfilteredRowIterators.merge(toMerge, rowListener);
     }

{code}
will give a higher performance boost.

The problem is that doing this causes the merge listener (which can be e.g. a secondary index implementation) to lose information about the sources of a merged value. At some point this was crucial for secondary indexes, but I'm not sure it still is, and I don't think anyone has invested the time to understand whether it is still necessary in Cassandra 4. It's even less likely to matter for Cassandra 5, whose storage-attached indexes don't need merge listeners.

Nevertheless, unless we have done this investigation, this behaviour needs to be preserved, but I believe we can still get an improvement for the cases where there is no index. To get a more complete solution to the problem, how about changing the {{merge}} code to get the {{rowListener}} on the first call to {{reduce}}, and switch between the two methods of constructing the {{toMerge}} list based on whether or not it is {{null}}?
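The suggestion in the comment above (obtain the row listener on the first call to reduce, then pick how toMerge is built depending on whether the listener is null) could look roughly like the following. This is a simplified sketch under stated assumptions, not the actual Cassandra reducer: strings stand in for row iterators, a Function that may return null stands in for the listener lookup, and every name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the reducer discussed in CASSANDRA-18773.
class ToMergeSketch
{
    static final String EMPTY = "<empty>";

    private final int sourceCount;
    private final Function<String, Object> listenerLookup; // returns null when nothing listens
    private final List<String> toMerge = new ArrayList<>();
    private Object rowListener;
    private boolean started;

    ToMergeSketch(int sourceCount, Function<String, Object> listenerLookup)
    {
        this.sourceCount = sourceCount;
        this.listenerLookup = listenerLookup;
    }

    void reduce(int idx, String partitionKey, String current)
    {
        if (!started)
        {
            // First call for this partition: decide which construction method to use.
            started = true;
            rowListener = listenerLookup.apply(partitionKey);
            if (rowListener != null)
                for (int i = 0; i < sourceCount; i++)
                    toMerge.add(null); // reserve one slot per source so indexes are preserved
        }
        if (rowListener != null)
            toMerge.set(idx, current); // listener present: keep the source index
        else
            toMerge.add(current);      // no listener: only the sources that have the partition
    }

    List<String> getReduced()
    {
        // Only the index-preserving path needs placeholders for absent sources.
        if (rowListener != null)
            for (int i = 0; i < toMerge.size(); i++)
                if (toMerge.get(i) == null)
                    toMerge.set(i, EMPTY);

        List<String> result = new ArrayList<>(toMerge);
        toMerge.clear();
        rowListener = null;
        started = false;
        return result;
    }
}
```

With no listener, a partition present in 2 of 2567 sources yields a 2-element list instead of a 2567-element list padded with empty iterators, which is where the saving for key-value tables would come from.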
[jira] [Commented] (CASSANDRA-18773) Compactions are slow
[ https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759595#comment-17759595 ]

Branimir Lambov commented on CASSANDRA-18773:
---------------------------------------------

The crux of the issue here is likely in this comment in {{UnfilteredPartitionIterators.merge}}:
{code:java}
// Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
// Non-present iterator will thus be set to empty in getReduced.
{code}
The need for this is, indeed, a serious problem in merges of many sstables in key-value tables (i.e. ones containing only one row per partition) that we have not yet tried to address.

I expect that simply changing the code of {{merge}} to only present partitions selected by the partition merger, for example by doing
{code:java}
diff --git a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java
--- a/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (revision 052a26474108febad545d6528bb203ecf19b22e5)
+++ b/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java (date 1693230551402)
@@ -113,16 +113,11 @@
     private final List<UnfilteredRowIterator> toMerge = new ArrayList<>(iterators.size());

     private DecoratedKey partitionKey;
-    private boolean isReverseOrder;

     public void reduce(int idx, UnfilteredRowIterator current)
     {
         partitionKey = current.partitionKey();
-        isReverseOrder = current.isReverseOrder();
-
-        // Note that because the MergeListener cares about it, we want to preserve the index of the iterator.
-        // Non-present iterator will thus be set to empty in getReduced.
-        toMerge.set(idx, current);
+        toMerge.add(current);
     }

     @SuppressWarnings("resource")
@@ -132,20 +127,6 @@
             ? null
             : listener.getRowMergeListener(partitionKey, toMerge);

-        // Make a single empty iterator object to merge, we don't need toMerge.size() copiess
-        UnfilteredRowIterator empty = null;
-
-        // Replace nulls by empty iterators
-        for (int i = 0; i < toMerge.size(); i++)
-        {
-            if (toMerge.get(i) == null)
-            {
-                if (null == empty)
-                    empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
-                toMerge.set(i, empty);
-            }
-        }
-
         return UnfilteredRowIterators.merge(toMerge, rowListener);
     }

{code}
will give a higher performance boost.

The problem is that doing this causes the merge listener (which can be e.g. a secondary index implementation) to lose information about the sources of a merged value. At some point this was crucial for secondary indexes, but I'm not sure it still is, and I don't think anyone has invested the time to understand whether it is still necessary in Cassandra 4. It's even less likely to matter for Cassandra 5, whose storage-attached indexes don't need merge listeners.

To get a more complete solution to the problem, how about changing the {{merge}} code to get the {{rowListener}} on the first call to {{reduce}}, and switch between the two methods of constructing the {{toMerge}} list based on whether or not it is {{null}}?

> Compactions are slow
> --------------------
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Cameron Zemek
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: compact-poc.patch, flamegraph.png, stress.yaml
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow
> (for example major compactions). I have attached a cassandra stress profile
> that can generate such a dataset under ccm. In my local test I have 2567
> sstables at 4Mb each.
> I added code to track wall clock time of various parts of the code. One
> problematic part is the ManyToOne constructor. Tracing through the code shows
> that a ManyToOne is created for every partition, over all the sstable
> iterators. In my local test I get a measly 60Kb/sec read speed, bottlenecked
> on a single CPU core (since this code is single threaded), with 85% of the
> wall clock time spent in the ManyToOne constructor.
> As
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759002#comment-17759002 ] Branimir Lambov commented on CASSANDRA-18710: - This kind of flakiness usually comes from truncation of a table from a different concurrently completed test causing a flush of the whole keyspace. The {{KEYSPACE_PER_TEST}} changes fix this, and I must have done a repeated run on CircleCI to verify (I don't remember if I really did at the time). > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.x > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. > {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758137#comment-17758137 ]

Branimir Lambov commented on CASSANDRA-18534:
---------------------------------------------

bq. Is there a reason we're relying primarily on the .yaml rather than a system_distributed table accessible via vtables for aliased groups of sstable configuration params?

Because it is a good idea to permit the configuration to change between nodes. E.g. if we want to do a gradual rollout of something, or test it out on just one node, or compare performance, or because we have a heterogeneous cluster and we want to have a different value for some parameter on some nodes.

Granted, these could be given as overrides, but we are getting into the territory of a very complex solution for a simple problem, especially considering the dance that has to be performed to initialize a storage format from the data in a table whose storage format needs to be initialized too.

> Make sstable format configurable per table
> ------------------------------------------
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, Local/SSTable
> Reporter: Branimir Lambov
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
>
> Some SSTable format settings need to be configurable per table for better
> efficiency. This includes:
> - {{row_index_granularity}}
> - {{bloom_filter_fp_chance}}
> - {{crc_check_chance}}
> - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables.
> Having them as format properties makes better sense and should also support
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}}
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}
[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755595#comment-17755595 ]

Branimir Lambov commented on CASSANDRA-18534:
---------------------------------------------

No such validation is needed. When we receive an {{ALTER/CREATE}} statement we validate that it starts with a known format name, that the specific variation is specified in the yaml, and that the definition in the yaml is understood/validated by the format.

> Make sstable format configurable per table
> ------------------------------------------
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, Local/SSTable
> Reporter: Branimir Lambov
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
>
> Some SSTable format settings need to be configurable per table for better
> efficiency. This includes:
> - {{row_index_granularity}}
> - {{bloom_filter_fp_chance}}
> - {{crc_check_chance}}
> - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables.
> Having them as format properties makes better sense and should also support
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}}
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}
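The three checks described in the comment above (known format name prefix, variation present in the yaml, yaml definition understood by the format) can be sketched as follows. This is only an illustration of the order of checks. The class, the method signature and the two maps are hypothetical, and the treatment of a bare format name such as "bti" is an assumption, not something stated in the ticket.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; class, method and parameter names are not Cassandra APIs.
class SSTableFormatNameCheck
{
    /**
     * Returns null when the requested name is acceptable, otherwise a reason for rejecting it.
     *
     * @param requested         the value given in the statement, e.g. "bti-fast"
     * @param optionsUnderstood known format names mapped to the option keys each format understands
     * @param yamlVariations    variation names from the yaml mapped to their option maps
     */
    static String validate(String requested,
                           Map<String, Set<String>> optionsUnderstood,
                           Map<String, Map<String, String>> yamlVariations)
    {
        // 1. The name must start with a known format name; prefer the longest match.
        String format = null;
        for (String candidate : optionsUnderstood.keySet())
        {
            boolean matches = requested.equals(candidate) || requested.startsWith(candidate + "-");
            if (matches && (format == null || candidate.length() > format.length()))
                format = candidate;
        }
        if (format == null)
            return "unknown sstable format: " + requested;

        // Assumption: a bare format name selects the format's defaults and needs no yaml entry.
        if (requested.equals(format))
            return null;

        // 2. The specific variation must be defined in the yaml.
        Map<String, String> options = yamlVariations.get(requested);
        if (options == null)
            return "variation not defined in yaml: " + requested;

        // 3. Every option in the yaml definition must be understood by the format.
        for (String key : options.keySet())
            if (!optionsUnderstood.get(format).contains(key))
                return "option not understood by " + format + ": " + key;

        return null;
    }
}
```

Since the yaml is read per node, a check like this runs against the local node's definitions, which fits the later comment on this ticket about letting the configuration differ between nodes.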
[jira] [Commented] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754645#comment-17754645 ]

Branimir Lambov commented on CASSANDRA-18756:
---------------------------------------------

Well, things are a bit more complex. Because the iterator is indeed created, the {{ignoreOverlaps}} flag does not have the intended effect of not taking into account older sstables that may have data shadowed by a tombstone. This behaviour is helping make the option much safer than it would otherwise be, and cannot be changed without risking unexpected data resurrection.

Because of this a fix for this issue should remove the overlap iterator part of the intended meaning of {{unsafe_aggressive_sstable_expiration}} rather than make it work correctly.

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
> ----------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Priority: Normal
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not
> create or maintain an iterator of overlapping sstables. However, because
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and
> likely _all_ other SSTables on the node and thus delays the deletion of
> obsolete ones by hours or even days.
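The ordering problem described in CASSANDRA-18756 is a general Java pitfall: a base-class constructor runs before the subclass has assigned its own fields, so anything the base constructor decides from a subclass flag sees the default value. The sketch below reproduces that effect with hypothetical stand-in classes. They are not the real CompactionController or TimeWindowCompactionController, and the boolean field only models "an overlap iterator was built".

```java
// Hypothetical stand-ins; these are not the real Cassandra classes.
class BaseControllerSketch
{
    // Models whether the base class built an overlap iterator during construction.
    final boolean overlapIteratorBuilt;

    BaseControllerSketch()
    {
        // Runs before any subclass constructor body has assigned its fields.
        overlapIteratorBuilt = !ignoreOverlaps();
    }

    boolean ignoreOverlaps()
    {
        return false;
    }
}

class LateFlagControllerSketch extends BaseControllerSketch
{
    private final boolean ignoreOverlaps;

    LateFlagControllerSketch(boolean ignoreOverlaps)
    {
        super();                              // base constructor sees the default value, false
        this.ignoreOverlaps = ignoreOverlaps; // assigned too late to affect the decision above
    }

    @Override
    boolean ignoreOverlaps()
    {
        return ignoreOverlaps;
    }
}
```

Constructing LateFlagControllerSketch with true still leaves overlapIteratorBuilt set to true, even though ignoreOverlaps() returns true afterwards. That mirrors the reported behaviour: the iterator is made once and then never updated.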
[jira] [Updated] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Branimir Lambov updated CASSANDRA-18757:
----------------------------------------
Bug Category: Parent values: Degradation(12984)Level 1 values: Resource Management(12995)
Complexity: Low Hanging Fruit
Discovered By: Code Inspection
Since Version: 5.0-alpha1

> UnifiedCompactionTask is incorrectly setting keepOriginals
> ----------------------------------------------------------
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Branimir Lambov
> Priority: Normal
>
> {code:java}
> super(cfs, txn, gcBefore, strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
> public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.