[jira] [Updated] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment
[ https://issues.apache.org/jira/browse/CASSANDRA-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil O Conduin updated CASSANDRA-15274:
---
    Resolution: Fixed
        Status: Resolved  (was: Triage Needed)

Our datafile corruption issues were caused by the OS wrongly treating a block that belonged to a C* data file as no longer used, and therefore as a free block that could be reused later.

For example: C* deletes a file after compaction, and the OS collects all the blocks that are now free and sends a TRIM command to the SSD. From time to time, however, the SSD trims the wrong block, not the one reported by the OS. The trimmed block then shows up as zeroed data in the datafile and is later reused for a different file.

So the symptom is that we suddenly see 4096 zeroes in the datafile, which means the SSD has just trimmed the block. After some time we see other data written to that block, which means it is now used by another file, and that gives us a corrupt file.

We turned off the scheduled TRIM function on the OS and we are no longer getting corruptions.

This was very difficult to pinpoint.

> Multiple Corrupt datafiles across entire environment
> ---
>
>                 Key: CASSANDRA-15274
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15274
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Phil O Conduin
>            Priority: Normal
>              Labels: impact-high
>
> Cassandra Version: 2.2.13
> PRE-PROD environment.
> * 2 datacenters.
> * 9 physical servers in each datacenter - (_Cisco UCS C220 M4 SFF_)
> * 4 Cassandra instances on each server (cass_a, cass_b, cass_c, cass_d)
> * 72 Cassandra instances across the 2 data centres, 36 in site A, 36 in site B.
> We also have 2 Reaper Nodes we use for repair. One reaper node in each
> datacenter each running with its own Cassandra back end in a cluster together.
> OS Details [Red Hat Linux]
> cass_a@x 0 10:53:01 ~ $ uname -a
> Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
> cass_a@x 0 10:57:31 ~ $ cat /etc/*release
> NAME="Red Hat Enterprise Linux Server"
> VERSION="7.6 (Maipo)"
> ID="rhel"
> Storage Layout
> cass_a@xx 0 10:46:28 ~ $ df -h
> Filesystem                Size  Used Avail Use% Mounted on
> /dev/mapper/vg01-lv_root   20G  2.2G   18G  11% /
> devtmpfs                   63G     0   63G   0% /dev
> tmpfs                      63G     0   63G   0% /dev/shm
> tmpfs                      63G  4.1G   59G   7% /run
> tmpfs                      63G     0   63G   0% /sys/fs/cgroup
> >> 4 cassandra instances
> /dev/sdd                  1.5T  802G  688G  54% /data/ssd4
> /dev/sda                  1.5T  798G  692G  54% /data/ssd1
> /dev/sdb                  1.5T  681G  810G  46% /data/ssd2
> /dev/sdc                  1.5T  558G  932G  38% /data/ssd3
> Cassandra load is about 200GB and the rest of the space is snapshots
> CPU
> cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
> CPU(s):                64
> Thread(s) per core:    2
> Core(s) per socket:    16
> Socket(s):             2
> *Description of problem:*
> During repair of the cluster, we are seeing multiple corruptions in the log
> files on a lot of instances. There seems to be no pattern to the corruption.
> It seems that the repair job is finding all the corrupted files for us. The
> repair will hang on the node where the corrupted file is found. To fix this
> we remove/rename the datafile and bounce the Cassandra instance. Our
> hardware/OS team have stated there is no problem on their side. I do not
> believe it is the repair causing the corruption.
>
> So let me give you an example of a corrupted file and maybe someone might be
> able to work through it with me?
> When this corrupted file was reported in the log it looks like it was the
> repair that found it.
> $ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00" --until "2019-08-07 22:45:00"
> Aug 07 22:30:33 cassandra[34611]: INFO 21:30:33 Writing Memtable-compactions_in_progress@830377457(0.008KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit)
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle tree for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on KeyspaceMetadata/x, (-1476350953672479093,-1474461
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread Thread[ValidationExecutor:825,1,main]
> Aug 07 22:30:33 cassandra[34611]: org.apache.cassandra.io.FSReadError: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /x/ssd2/data/KeyspaceMetadata/x-1e453cb0
> Aug 07 22:30:33 cassandra[34611]: at
>
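The symptom reported in the resolution of CASSANDRA-15274 (a run of 4096 zeroes appearing inside a datafile) can be looked for with a small scan. The following is only a minimal sketch, not a tool used in the ticket: the function name `find_zero_blocks` and the 4096-byte block alignment are assumptions for illustration, and an all-zero block is a hint rather than proof, since a file can legitimately contain zeroed regions.

```python
BLOCK_SIZE = 4096  # assumed block size, taken from the "4096 zeroes" symptom


def find_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return the byte offsets of every aligned block that is entirely zero.

    Hypothetical helper: it only reports candidates for manual inspection.
    A trailing partial block (shorter than block_size) is ignored.
    """
    zero_block = bytes(block_size)
    offsets = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if len(chunk) < block_size:
                break  # end of file, or a trailing partial block
            if chunk == zero_block:
                offsets.append(offset)
            offset += block_size
    return offsets
```

Calling `find_zero_blocks` on a datafile path returns a list of offsets; an empty list means no fully zeroed aligned block was found at the time of the scan.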
[jira] [Commented] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment
[ https://issues.apache.org/jira/browse/CASSANDRA-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961436#comment-16961436 ]

Phil O Conduin commented on CASSANDRA-15274:
---
Hi Benedict,

Sorry, I forgot to come back and update this jira.

Our datafile corruption issues were caused by the OS wrongly treating a block that belonged to a C* data file as no longer used, and therefore as a free block that could be reused later.

For example: C* deletes a file after compaction, and the OS collects all the blocks that are now free and sends a TRIM command to the SSD. From time to time, however, the SSD trims the wrong block, not the one reported by the OS. The trimmed block then shows up as zeroed data in the datafile and is later reused for a different file.

So the symptom is that we suddenly see 4096 zeroes in the datafile, which means the SSD has just trimmed the block. After some time we see other data written to that block, which means it is now used by another file, and that gives us a corrupt file.

We turned off the scheduled TRIM function on the OS and we are no longer getting corruptions.

This was very difficult to pinpoint.
[jira] [Updated] (CASSANDRA-15382) Fix flaky unit test - testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay Chella updated CASSANDRA-15382:
---
    Resolution: Duplicate
        Status: Resolved  (was: Triage Needed)

> Fix flaky unit test - testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
>                 Key: CASSANDRA-15382
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15382
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Vinay Chella
>            Priority: Normal
>
> As part of Apache Cassandra 4.0-alpha2, it was found that
> "testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest" is flaky.
>
> +Failing test:+
> *testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest*
> {code:java}
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>
> +Sample failed runs:+
> * [https://circleci.com/gh/vinaykumarchella/cassandra/527#tests/containers/37]
> * [https://circleci.com/gh/vinaykumarchella/cassandra/535#tests/containers/37]
> * [https://circleci.com/gh/vinaykumarchella/cassandra/489#tests/containers/7]
> * [https://circleci.com/gh/vinaykumarchella/cassandra/512#tests/containers/37]
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15382) Fix flaky unit test - testIdleDisconnect::org.apache.cassandra.transport.IdleDisconnectTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961365#comment-16961365 ]

Andrew Prudhomme commented on CASSANDRA-15382:
---
I think this is a duplicate of https://issues.apache.org/jira/browse/CASSANDRA-15310
[jira] [Updated] (CASSANDRA-15381) Failing test - testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15381:
---
    Resolution: Fixed
        Status: Resolved  (was: Triage Needed)

> Failing test - testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest
> ---
>
>                 Key: CASSANDRA-15381
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15381
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Vinay Chella
>            Priority: Normal
>
> As part of Apache Cassandra 4.0-alpha2 voting, the following test is failing
> across different test suites and runs.
> CircleCI Run:
> [https://circleci.com/gh/vinaykumarchella/cassandra/487#tests/containers/37]
> *testDatabaseDescriptorRef-compression - org.apache.cassandra.config.DatabaseDescriptorRefTest*
> {code:java}
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:293)
>   at org.apache.cassandra.config.DatabaseDescriptorRefTest.testDatabaseDescriptorRef(DatabaseDescriptorRefTest.java:277){code}
[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15371:
---
          Fix Version/s: 3.11.x
                         3.0.x
                         2.2.x
                         4.0
          Since Version: 2.2.x
    Source Control Link: https://github.com/apache/cassandra/commit/c009a7b9d4e3c32e88238a7ba8f67b14d287c5d3
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

Thanks for the patch.

> Incorrect messaging service version breaks in-JVM upgrade tests on trunk
> ---
>
>                 Key: CASSANDRA-15371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> The in-JVM upgrade tests on trunk currently fail because the messaging
> version for internode messaging is selected as
> {{MessagingService.current_version}},
> a regression from the implementation in CASSANDRA-15078.
[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15371:
---
    Status: Ready to Commit  (was: Review In Progress)

+1
[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15371:
---
    Status: Patch Available  (was: In Progress)
[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15371:
---
    Reviewers: Dinesh Joshi, Dinesh Joshi  (was: Dinesh Joshi)
       Status: Review In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-15371) Incorrect messaging service version breaks in-JVM upgrade tests on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15371:
---
    Status: In Progress  (was: Patch Available)
[cassandra] branch cassandra-3.11 updated (b697af8 -> 9c1925a)
This is an automated email from the ASF dual-hosted git repository.

djoshi pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.

    from b697af8  Merge branch 'cassandra-3.0' into cassandra-3.11
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0
     add 9c1925a  Merge branch 'cassandra-3.0' into cassandra-3.11

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                              |  4
 .../org/apache/cassandra/distributed/impl/Instance.java  | 13 +++--
 2 files changed, 15 insertions(+), 2 deletions(-)
[cassandra] branch trunk updated (b97fc30 -> c009a7b)
This is an automated email from the ASF dual-hosted git repository.

djoshi pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.

    from b97fc30  Remove StageManager
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0
     add 9c1925a  Merge branch 'cassandra-3.0' into cassandra-3.11
     add c009a7b  Merge branch 'cassandra-3.11' into trunk

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                                 |  4
 .../org/apache/cassandra/distributed/impl/Instance.java     | 14 +-
 .../apache/cassandra/config/DatabaseDescriptorRefTest.java  |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)
[cassandra] branch cassandra-2.2 updated (4ee4cee -> 02cc5df)
This is an automated email from the ASF dual-hosted git repository.

djoshi pushed a change to branch cassandra-2.2
in repository https://gitbox.apache.org/repos/asf/cassandra.git.

    from 4ee4cee  Fix 2.2 deb Build-Depends JDK version to match build.xml source.version
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                              |  4
 .../org/apache/cassandra/distributed/impl/Instance.java  | 12 +++-
 2 files changed, 15 insertions(+), 1 deletion(-)
[cassandra] branch cassandra-3.0 updated (a81bfd6 -> bfa7998)
This is an automated email from the ASF dual-hosted git repository.

djoshi pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.

    from a81bfd6  Use `rm -f` in rpm spec to prevent failure on missing file
     add 02cc5df  In-JVM DTest: Set correct internode message version for upgrade test
     add bfa7998  Merge branch 'cassandra-2.2' into cassandra-3.0

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt                                              |  5 +
 .../org/apache/cassandra/distributed/impl/Instance.java  | 12 +++-
 2 files changed, 16 insertions(+), 1 deletion(-)
[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961319#comment-16961319 ]

Blake Eggleston commented on CASSANDRA-15373:
---
Fixed; I also made a small adjustment to unbreak a test for CASSANDRA-14912 (https://circleci.com/gh/bdeggleston/cassandra/2537#tests/containers/94)

> validate value sizes in LegacyLayout
> ---
>
>                 Key: CASSANDRA-15373
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15373
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 3.0.19, 3.11.5, 4.0
>
>
> In 2.1, all values are serialized as variable length blobs, with a length
> prefix followed by the actual value, even for fixed width types like int32.
> The 3.0 storage engine, on the other hand, omits the length prefix for fixed
> width types. Since the length of fixed width types is not validated on the
> 3.0 write path, writing data for a fixed width type from an incorrectly sized
> byte buffer will overflow or underflow the space allocated for it, corrupting
> the remainder of that partition or indexed region so that it cannot be read.
> This is not discovered until we attempt to read the corrupted value. This
> patch updates LegacyLayout to throw a marshal exception if it encounters an
> unexpected value size for fixed size columns.
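To illustrate the failure mode described in CASSANDRA-15373, here is a minimal sketch in Python rather than Cassandra's Java. It is not the LegacyLayout code: all function names are hypothetical, the 4-byte length prefix is an assumption for the example, and `ValueError` merely stands in for the marshal exception mentioned in the ticket. It shows how a length-prefixed layout tolerates a wrongly sized value, while a prefix-free fixed-width layout shifts every later value unless sizes are validated on write.

```python
import struct

INT32_SIZE = 4  # width of the fixed-width type used in this sketch


def write_length_prefixed(values):
    """Length-prefixed layout: each value carries its own size (4 bytes here)."""
    out = bytearray()
    for v in values:
        out += struct.pack(">i", len(v)) + v
    return bytes(out)


def read_length_prefixed(buf, count):
    values, pos = [], 0
    for _ in range(count):
        (length,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        values.append(buf[pos:pos + length])
        pos += length
    return values


def write_fixed_int32(values, validate):
    """Prefix-free layout: every value is assumed to be exactly 4 bytes."""
    out = bytearray()
    for v in values:
        if validate and len(v) != INT32_SIZE:
            # Stand-in for the marshal exception described in the ticket.
            raise ValueError(f"expected {INT32_SIZE} bytes, got {len(v)}")
        out += v
    return bytes(out)


def read_fixed_int32(buf, count):
    return [struct.unpack_from(">i", buf, i * INT32_SIZE)[0] for i in range(count)]


# The second value is 5 bytes long instead of 4.
values = [struct.pack(">i", 7), b"\x00\x00\x00\x00\x09", struct.pack(">i", 11)]

# Length-prefixed: the odd size round-trips untouched.
assert read_length_prefixed(write_length_prefixed(values), 3) == values

# Fixed-width without validation: the extra byte shifts the third value.
print(read_fixed_int32(write_fixed_int32(values, validate=False), 3))
```

With validation enabled, `write_fixed_int32(values, validate=True)` raises on the 5-byte value at write time instead of leaving the problem to be discovered on a later read.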
[jira] [Commented] (CASSANDRA-15277) Make it possible to resize concurrent read / write thread pools at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961249#comment-16961249 ]

Jon Meredith commented on CASSANDRA-15277:
---
And CASSANDRA-15227 landed, so reverted to the original version with review feedback applied and rebased against the current trunk.

[Branch|https://github.com/jonmeredith/cassandra/tree/CASSANDRA-15277-v4]
[GitHub PR|https://github.com/apache/cassandra/pull/371]
[CircleCI run|https://circleci.com/workflow-run/37c049c2-0a7b-4780-9d35-20493bf7a4e5]

> Make it possible to resize concurrent read / write thread pools at runtime
> ---
>
>                 Key: CASSANDRA-15277
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15277
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Other
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> To better mitigate cluster overload the executor services for various stages
> should be configurable at runtime (probably as a JMX hot property).
> Related to CASSANDRA-5044, this would add the capability to resize to
> multiThreadedLowSignalStage pools based on SEPExecutor.
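The idea in CASSANDRA-15277 of changing a stage's concurrency while it is running can be sketched generically. The example below is an illustration in Python only; it is not SEPExecutor and not the patch on the linked branch, and the class name `ResizableLimiter` and its methods are hypothetical. It models a concurrency cap that an operator could raise or lower at runtime, with waiting callers woken when the cap grows.

```python
import threading


class ResizableLimiter:
    """Caps how many tasks may run at once; the cap can change at runtime."""

    def __init__(self, max_concurrent):
        self._cond = threading.Condition()
        self._max = max_concurrent
        self._active = 0

    def set_max(self, max_concurrent):
        """Change the cap; tasks already running are never interrupted."""
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        with self._cond:
            self._max = max_concurrent
            self._cond.notify_all()  # wake waiters in case the cap grew

    def acquire(self, timeout=None):
        """Wait for a free slot; return False if none appears within timeout."""
        with self._cond:
            got_slot = self._cond.wait_for(
                lambda: self._active < self._max, timeout
            )
            if got_slot:
                self._active += 1
            return got_slot

    def release(self):
        """Give a slot back after a task finishes."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()
```

Lowering the cap in this sketch takes effect gradually: running tasks keep their slots, and new tasks wait until the number of active tasks drops below the new limit.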
[jira] [Updated] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-15365:
---
    Status: Ready to Commit  (was: Changes Suggested)

> Add primary key liveness info when skipping illegal cells
> ---
>
>                 Key: CASSANDRA-15365
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15365
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/SSTable
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x
>
>
> In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy
> cells. The problem is that if the row contains only illegal cells, we return
> a totally empty row, which breaks stats collection:
> https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70
> If the row only has these invalid cells, we should add a primary key liveness
> info to it to match the 2.1 behaviour.
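The behaviour requested in CASSANDRA-15365 can be pictured with a toy model. This is a sketch in Python with hypothetical types (`Cell`, `Row`, `skip_illegal_cells`), not the Cassandra row classes, and the choice of timestamp for the added liveness info (the newest skipped cell) is purely an assumption made for the example; the ticket does not say which timestamp the patch uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    column: str
    value: bytes
    timestamp: int
    legal: bool = True  # hypothetical flag marking an illegal legacy cell


@dataclass
class Row:
    cells: List[Cell]
    pk_liveness_ts: Optional[int] = None  # None means no primary key liveness info


def skip_illegal_cells(row: Row) -> Row:
    """Drop illegal cells, but never hand back a completely empty row."""
    kept = [c for c in row.cells if c.legal]
    liveness = row.pk_liveness_ts
    if not kept and liveness is None and row.cells:
        # Assumption for this sketch: reuse the newest skipped timestamp so
        # that the row still carries some liveness information.
        liveness = max(c.timestamp for c in row.cells)
    return Row(kept, liveness)
```

A row that keeps at least one legal cell passes through with its liveness info unchanged; only a row that would otherwise come back empty gains one.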
[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961245#comment-16961245 ]

Sam Tunnicliffe commented on CASSANDRA-15365:
---
+1
[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961237#comment-16961237 ]

Sam Tunnicliffe commented on CASSANDRA-15373:
---
+1 from me too - you just need to fix the newest tests on the 3.11 branch (e.g. {{s/new Clustering/Clustering.make/}})
[jira] [Updated] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict Elliott Smith updated CASSANDRA-15373:
---
    Status: Ready to Commit  (was: Review In Progress)

+1 from me
[jira] [Updated] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15373: --- Reviewers: Benedict Elliott Smith, Sam Tunnicliffe, Benedict Elliott Smith (was: Benedict Elliott Smith, Sam Tunnicliffe) Status: Review In Progress (was: Patch Available) > validate value sizes in LegacyLayout > > > Key: CASSANDRA-15373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15373 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 3.0.19, 3.11.5, 4.0 > > > In 2.1, all values are serialized as variable length blobs, with a length > prefix, followed by the actual value, even with fixed width types like int32. > The 3.0 storage engine, on the other hand, omits the length prefix for fixed > width types. Since the length of fixed width types are not validated on the > 3.0 write path, writing data for a fixed width type from an incorrectly sized > byte buffer will over or underflow the space allocated for it, corrupting the > remainder of that partition or indexed region from being read. This is not > discovered until we attempt to read the corrupted value. This patch updates > LegacyLayout to throw a marshal exception if it encounters an unexpected > value size for fixed size columns. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961190#comment-16961190 ] Blake Eggleston commented on CASSANDRA-15373: - good catch, fixed > validate value sizes in LegacyLayout > > > Key: CASSANDRA-15373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15373 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 3.0.19, 3.11.5, 4.0 > > > In 2.1, all values are serialized as variable length blobs, with a length > prefix, followed by the actual value, even with fixed width types like int32. > The 3.0 storage engine, on the other hand, omits the length prefix for fixed > width types. Since the length of fixed width types are not validated on the > 3.0 write path, writing data for a fixed width type from an incorrectly sized > byte buffer will over or underflow the space allocated for it, corrupting the > remainder of that partition or indexed region from being read. This is not > discovered until we attempt to read the corrupted value. This patch updates > LegacyLayout to throw a marshal exception if it encounters an unexpected > value size for fixed size columns. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961189#comment-16961189 ] Benedict Elliott Smith commented on CASSANDRA-15365: WFM > Add primary key liveness info when skipping illegal cells > - > > Key: CASSANDRA-15365 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15365 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy > cells, problem is that if the row only contains illegal cells, we return a > totally empty row which breaks stats collection: > https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70 > If the row only has these invalid cells, we should add a primary key liveness > info to it to match the 2.1 behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15364) Avoid over scanning data directories in LogFile.verify()
[ https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961188#comment-16961188 ] David Capwell commented on CASSANDRA-15364: --- Assert msg looks good to me; loop assumes the above asserts ran so can safely loop over the files. +1 > Avoid over scanning data directories in LogFile.verify() > > > Key: CASSANDRA-15364 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15364 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > > We currently list the data directory for every {{REMOVE}} record in the file > in {{LogFile.verify()}} - this can get very expensive during startup when we > call {{LogTransaction.removeUnfinishedLeftovers()}}. In > {{LogRecord.getExistingFiles(Set absoluteFilePaths)}} we also fully > parse the file name of the sstables found, here we only need to prefix match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
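The idea in the description above (list each data directory once, then match files by name prefix instead of fully parsing every file name) can be sketched as follows. This is only an illustration under stated assumptions, not the code in `LogRecord` or `LogFile`: the class `PrefixMatchListing`, the method `existingFiles` and the example prefix format are all hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PrefixMatchListing {
    /**
     * For each absolute path prefix (for example "/data/ks/tbl/ma-1-big-"), returns the
     * existing files whose names start with the file-name part of that prefix.
     * Each parent directory is listed exactly once, however many prefixes point into it.
     */
    static Map<String, List<File>> existingFiles(Set<String> absolutePrefixes) {
        // Group the prefixes by parent directory so that a directory is listed only once.
        Map<File, List<String>> prefixesByDirectory = new HashMap<>();
        Map<String, List<File>> result = new HashMap<>();
        for (String prefix : absolutePrefixes) {
            result.put(prefix, new ArrayList<>());
            File parent = new File(prefix).getParentFile();
            if (parent != null)
                prefixesByDirectory.computeIfAbsent(parent, d -> new ArrayList<>()).add(prefix);
        }

        for (Map.Entry<File, List<String>> entry : prefixesByDirectory.entrySet()) {
            File[] listing = entry.getKey().listFiles(); // the only directory scan for this directory
            if (listing == null)
                continue; // directory missing or unreadable
            for (File file : listing)
                for (String prefix : entry.getValue())
                    // Plain prefix match on the name; no parsing of the sstable file name.
                    if (file.getName().startsWith(new File(prefix).getName()))
                        result.get(prefix).add(file);
        }
        return result;
    }
}
```

With many records pointing into the same directory, the number of directory scans is bounded by the number of distinct directories rather than by the number of records.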
[jira] [Commented] (CASSANDRA-14629) Abstract Virtual Table for very large result sets
[ https://issues.apache.org/jira/browse/CASSANDRA-14629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961186#comment-16961186 ] Aleksey Yeschenko commented on CASSANDRA-14629: --- Looks ok to me overall, but I do have two questions (two variations of the same question, really): 1. {{select(DecoratedKey partitionKey, ClusteringIndexFilter clusteringFilter, ColumnFilter columnFilter)}} involves invoking {{hasKey(DecoratedKey partitionKey)}} and then {{getRows(DecoratedKey key, ClusteringIndexFilter clusteringFilter, ColumnFilter columnFilter)}}. Depending on the underlying implementation, this could mean a lot of extra work - up to doubling the amount of work needed. As an illustration, a common {{Map}} usage antipattern comes to mind: doing a {{contains()}} followed by {{get()}} instead of just doing get and checking for {{null}}. I know that one of the use cases you have in mind for this code is exposing the raw content of sstables, and I can see this overhead being relatively significant there potentially. I would suggest getting rid of {{hasKey()}} entirely and of the related check. 2. Similarly, {{select(DataRange dataRange, ColumnFilter columnFilter)}} and {{getPartitionKeys(DataRange dataRange)}} invocation could maybe also be remodelled to permit a single underlying iterator? It's possible that I'm missing something here, so these aren't demands for changes - just a conversation starter. P.S. 
You can suppress the redundant suppression warnings themselves like this: {code:java} @SuppressWarnings({"resource", "RedundantSuppression"}) {code} > Abstract Virtual Table for very large result sets > - > > Key: CASSANDRA-14629 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14629 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/CQL, Legacy/Observability >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Low > Labels: pull-request-available, virtual-tables > Time Spent: 4h > Remaining Estimate: 0h > > For virtual tables that are very large we cannot use existing > abstractvirtualtable since it would OOM the node possibly. An example would > be a table to view the internal cache contents or to view contents of > sstables. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
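The {{Map}} antipattern mentioned in the comment above can be made concrete with a short sketch. Everything here is illustrative and hypothetical (`SingleLookupSketch`, `CountingMap`, `checkThenGet`, `getOnly`); note that the real method on `java.util.Map` is `containsKey()`, and that the single-lookup form only works when the map never stores `null` values.

```java
import java.util.HashMap;
import java.util.Map;

class SingleLookupSketch {
    /** A HashMap that counts how many key lookups are performed against it. */
    static class CountingMap<K, V> extends HashMap<K, V> {
        int lookups = 0;

        @Override
        public boolean containsKey(Object key) {
            lookups++;
            return super.containsKey(key);
        }

        @Override
        public V get(Object key) {
            lookups++;
            return super.get(key);
        }
    }

    /** Antipattern: two lookups for every key that is present. */
    static <K, V> V checkThenGet(Map<K, V> map, K key) {
        if (!map.containsKey(key))
            return null;
        return map.get(key);
    }

    /** One lookup; a null result means "absent", assuming null values are never stored. */
    static <K, V> V getOnly(Map<K, V> map, K key) {
        return map.get(key);
    }
}
```

For an in-memory map the second lookup is cheap, but when the lookup is expensive (such as the sstable-backed case raised in the comment), the same shape can double the work per key.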
[jira] [Commented] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961184#comment-16961184 ] Marcus Eriksson commented on CASSANDRA-15365: - so, in 2.1 these invalid cells act as a kind of permanent row markers, for example: * {{create table %s (pk int, c1 text, c2 text, v1 text, primary key (pk, c1, c2))}} * insert an invalid cell, select * now returns: {{3, a, aa, null}} * do an insert with a ttl for the whole row: {{insert into %s (pk, c1, c2, v1) values (3, 'a', 'aa', 'vaaalue') using ttl 2}} * select * before ttl: {{3, a, aa, vaaalue}} * select * after ttl expires: {{3, a, aa, null}} If the invalid cell didn't exist, the select would have returned nothing Translating this to 3.0 including this patch, we would translate the whole-row-insert row marker to the PKLI and ignore the invalid cell, and after ttl expires we would return nothing. We can't really fully translate the 2.1 behaviour to 3.0 using only the PKLI - for example if we in the example above set the PKLI to the timestamp of the invalid cell and later overwrote that row with a ttl row, we would fully purge it after ttl expires, while in the 2.1 case we would still keep the invalid cell I vote we keep the current patch behaviour, wdyt [~samt] & [~benedict]? 
> Add primary key liveness info when skipping illegal cells > - > > Key: CASSANDRA-15365 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15365 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy > cells, problem is that if the row only contains illegal cells, we return a > totally empty row which breaks stats collection: > https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70 > If the row only has these invalid cells, we should add a primary key liveness > info to it to match the 2.1 behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15381) Failing test - testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961160#comment-16961160 ] Jon Meredith commented on CASSANDRA-15381: -- Fix included as part of CASSANDRA-15371 > Failing test - > testDatabaseDescriptorRef::org.apache.cassandra.config.DatabaseDescriptorRefTest > --- > > Key: CASSANDRA-15381 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15381 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Vinay Chella >Priority: Normal > > As part of Apache Cassandra 4.0-alpha2 voting, the following test is failing > across different test suites and runs. > CircleCI Run: > [https://circleci.com/gh/vinaykumarchella/cassandra/487#tests/containers/37] > *testDatabaseDescriptorRef-compression - > org.apache.cassandra.config.DatabaseDescriptorRefTest* > {code:java} > junit.framework.AssertionFailedError > at > org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:293) > at > org.apache.cassandra.config.DatabaseDescriptorRefTest.testDatabaseDescriptorRef(DatabaseDescriptorRefTest.java:277){code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15373) validate value sizes in LegacyLayout
[ https://issues.apache.org/jira/browse/CASSANDRA-15373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961057#comment-16961057 ] Benedict Elliott Smith commented on CASSANDRA-15373: I think there's one remaining missing check: in {{decodeCellName}} we need to verify that the {{collectionElement}} is valid, as it can be a fixed width type. > validate value sizes in LegacyLayout > > > Key: CASSANDRA-15373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15373 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 3.0.19, 3.11.5, 4.0 > > > In 2.1, all values are serialized as variable length blobs, with a length > prefix, followed by the actual value, even with fixed width types like int32. > The 3.0 storage engine, on the other hand, omits the length prefix for fixed > width types. Since the length of fixed width types are not validated on the > 3.0 write path, writing data for a fixed width type from an incorrectly sized > byte buffer will over or underflow the space allocated for it, corrupting the > remainder of that partition or indexed region from being read. This is not > discovered until we attempt to read the corrupted value. This patch updates > LegacyLayout to throw a marshal exception if it encounters an unexpected > value size for fixed size columns. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15227) Remove StageManager
[ https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15227: --- Source Control Link: [b97fc302b10d0ec5303421b3b185675872672c46|https://github.com/apache/cassandra/commit/b97fc302b10d0ec5303421b3b185675872672c46] Resolution: Fixed Status: Resolved (was: Ready to Commit) Sorry for letting this one atrophy - I've made some minor cosmetic changes and committed. > Remove StageManager > --- > > Key: CASSANDRA-15227 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15227 > Project: Cassandra > Issue Type: Task > Components: Local/Other >Reporter: Benedict Elliott Smith >Assignee: Venkata Harikrishna Nukala >Priority: Normal > Fix For: 4.0 > > > This is a minor cleanup; this class should not exist, but should be embedded > in the Stage enum. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15227) Remove StageManager
[ https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15227: --- Status: Ready to Commit (was: Review In Progress) > Remove StageManager > --- > > Key: CASSANDRA-15227 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15227 > Project: Cassandra > Issue Type: Task > Components: Local/Other >Reporter: Benedict Elliott Smith >Assignee: Venkata Harikrishna Nukala >Priority: Normal > Fix For: 4.0 > > > This is a minor cleanup; this class should not exist, but should be embedded > in the Stage enum. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15227) Remove StageManager
[ https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15227: --- Reviewers: Benedict Elliott Smith, Benedict Elliott Smith (was: Benedict Elliott Smith) Status: Review In Progress (was: Patch Available) > Remove StageManager > --- > > Key: CASSANDRA-15227 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15227 > Project: Cassandra > Issue Type: Task > Components: Local/Other >Reporter: Benedict Elliott Smith >Assignee: Venkata Harikrishna Nukala >Priority: Normal > Fix For: 4.0 > > > This is a minor cleanup; this class should not exist, but should be embedded > in the Stage enum. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated (3522b54 -> b97fc30)
This is an automated email from the ASF dual-hosted git repository.

benedict pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git.

    from 3522b54 Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)
     add b97fc30 Remove StageManager

No new revisions were added by this update.

Summary of changes:
 .../concurrent/JMXEnabledSingleThreadExecutor.java | 77 +
 .../concurrent/JMXEnabledThreadPoolExecutor.java | 5 -
 .../org/apache/cassandra/concurrent/Stage.java | 178 -
 .../apache/cassandra/concurrent/StageManager.java | 155 --
 .../org/apache/cassandra/db/ColumnFamilyStore.java | 8 +-
 src/java/org/apache/cassandra/db/Keyspace.java | 3 +-
 .../cassandra/db/SinglePartitionReadCommand.java | 6 +-
 .../cassandra/db/commitlog/CommitLogReplayer.java | 3 +-
 src/java/org/apache/cassandra/gms/Gossiper.java | 20 +--
 .../cassandra/index/SecondaryIndexManager.java | 4 +-
 .../cassandra/net/InboundMessageHandler.java | 3 +-
 .../org/apache/cassandra/net/MessagingService.java | 3 +-
 .../org/apache/cassandra/net/RequestCallbacks.java | 3 +-
 .../org/apache/cassandra/repair/Validator.java | 5 +-
 .../apache/cassandra/schema/MigrationManager.java | 9 +-
 .../cassandra/schema/SchemaPushVerbHandler.java | 3 +-
 .../org/apache/cassandra/service/CacheService.java | 5 +-
 .../org/apache/cassandra/service/StorageProxy.java | 17 +-
 .../apache/cassandra/service/StorageService.java | 11 +-
 .../service/reads/AbstractReadExecutor.java | 4 +-
 .../reads/ShortReadPartitionsProtection.java | 3 +-
 .../service/reads/repair/AbstractReadRepair.java | 4 +-
 .../apache/cassandra/tracing/TraceStateImpl.java | 4 +-
 .../org/apache/cassandra/tracing/TracingImpl.java | 3 +-
 .../cassandra/distributed/impl/Instance.java | 6 +-
 .../org/apache/cassandra/cql3/ViewLongTest.java | 5 +-
 .../org/apache/cassandra/cql3/ViewComplexTest.java | 5 +-
 .../apache/cassandra/cql3/ViewFilteringTest.java | 5 +-
 .../org/apache/cassandra/cql3/ViewSchemaTest.java | 5 +-
 test/unit/org/apache/cassandra/cql3/ViewTest.java | 5 +-
 .../org/apache/cassandra/net/MatcherResponse.java | 3 +-
 31 files changed, 275 insertions(+), 295 deletions(-)
 create mode 100644 src/java/org/apache/cassandra/concurrent/JMXEnabledSingleThreadExecutor.java
 delete mode 100644 src/java/org/apache/cassandra/concurrent/StageManager.java

- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-11928: --- Since Version: 4.0-alpha > dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test > > > Key: CASSANDRA-11928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11928 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Craig Kodman >Priority: Normal > Labels: dtest, flaky > > example failure: > http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test > Failed on CassCI build cassandra-3.0_dtest #727 > Is it a problem that the tracing message with the query is missing? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960886#comment-16960886 ] Michael Semb Wever commented on CASSANDRA-11928: Seems to be a regression in trunk of CASSANDRA-11465, the tracing doesn't use the same consistency level as the request. trunk: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/ 3.11: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/ My understanding is that tracing data was intended to be eventually consistent, and making it strictly consistent (via `-Dcassandra.wait_for_tracing_events_timeout_secs=xx`) was only for the purpose of testing. If that's true, a simple fix is just to reduce ccm nodes for that test, ie https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23 [~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where the regression came from? Or removing the `wait_for_tracing_events_timeout_secs` flag? > dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test > > > Key: CASSANDRA-11928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11928 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Craig Kodman >Priority: Normal > Labels: dtest, flaky > > example failure: > http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test > Failed on CassCI build cassandra-3.0_dtest #727 > Is it a problem that the tracing message with the query is missing? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960886#comment-16960886 ] Michael Semb Wever edited comment on CASSANDRA-11928 at 10/28/19 9:33 AM: -- Seems to be a regression in trunk of CASSANDRA-11465, the tracing doesn't use the same consistency level as the request. trunk: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/ 3.11: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/ My understanding is that tracing data was intended to be eventually consistent, and making it strictly consistent (via {{`-Dcassandra.wait_for_tracing_events_timeout_secs=xx`}}) was only for the purpose of testing. If that's true, a simple fix is just to reduce ccm nodes for that test, ie https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23 [~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where the regression came from? Or removing the {{wait_for_tracing_events_timeout_secs}} flag? was (Author: michaelsembwever): Seems to be a regression in trunk of CASSANDRA-11465, the tracing doesn't use the same consistency level as the request. trunk: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/856/testReport/junit/cql_tracing_test/TestCqlTracing/ 3.11: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-dtest/466/testReport/cql_tracing_test/TestCqlTracing/ My understanding is that tracing data was intended to be eventually consistent, and making it strictly consistent (via `-Dcassandra.wait_for_tracing_events_timeout_secs=xx`) was only for the purpose of testing. 
If that's true, a simple fix is just to reduce ccm nodes for that test, ie https://github.com/thelastpickle/cassandra-dtest/commit/f22f89fdb3080ac48f4310ee1a5aeb219ac2f093#diff-b866cba7cf982d53e6406cca014e659eR23 [~pauloricardomg], [~jkni], [~mambocab], thoughts? Is it worth bisecting where the regression came from? Or removing the `wait_for_tracing_events_timeout_secs` flag? > dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test > > > Key: CASSANDRA-11928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11928 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Craig Kodman >Priority: Normal > Labels: dtest, flaky > > example failure: > http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test > Failed on CassCI build cassandra-3.0_dtest #727 > Is it a problem that the tracing message with the query is missing? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15364) Avoid over scanning data directories in LogFile.verify()
[ https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960882#comment-16960882 ] Marcus Eriksson commented on CASSANDRA-15364: - pushed a commit which adds an explanation to the assert and changes the second loop to iterate over the values of the map instead > Avoid over scanning data directories in LogFile.verify() > > > Key: CASSANDRA-15364 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15364 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > > We currently list the data directory for every {{REMOVE}} record in the file > in {{LogFile.verify()}} - this can get very expensive during startup when we > call {{LogTransaction.removeUnfinishedLeftovers()}}. In > {{LogRecord.getExistingFiles(Set absoluteFilePaths)}} we also fully > parse the file name of the sstables found, here we only need to prefix match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 3522b54 Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)
3522b54 is described below

commit 3522b54f2d7f34c3dc8234c8981a4629ebcf9a50
Author: Mick Semb Wever
AuthorDate: Sat Oct 26 22:23:47 2019 +0200

    Make ConnectionBurnTest a proper unit test (fixes `ant test-burn`)

    patch by Mick Semb Wever; reviewed by Benedict Elliott Smith

    https://the-asf.slack.com/archives/CK23JSY2K/p1572206490030800
---
 test/burn/org/apache/cassandra/net/ConnectionBurnTest.java | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java b/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
index 81b6402..57eb726 100644
--- a/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
+++ b/test/burn/org/apache/cassandra/net/ConnectionBurnTest.java
@@ -622,7 +622,7 @@ public class ConnectionBurnTest
         }
     }
 
-    public static void test(GlobalInboundSettings inbound, OutboundConnectionSettings outbound) throws ExecutionException, InterruptedException, NoSuchFieldException, IllegalAccessException, TimeoutException
+    private void test(GlobalInboundSettings inbound, OutboundConnectionSettings outbound) throws ExecutionException, InterruptedException, NoSuchFieldException, IllegalAccessException, TimeoutException
     {
         MessageGenerator small = new UniformPayloadGenerator(0, 1, (1 << 15));
         MessageGenerator large = new UniformPayloadGenerator(0, 1, (1 << 16) + (1 << 15));
@@ -635,12 +635,19 @@ public class ConnectionBurnTest
                .endpoints(4)
                .inbound(inbound)
                .outbound(outbound)
-               .time(2L, TimeUnit.DAYS)
+               // change the following for a longer burn
+               .time(2L, TimeUnit.MINUTES)
                .build().run();
     }
 
     public static void main(String[] args) throws ExecutionException, InterruptedException, NoSuchFieldException, IllegalAccessException, TimeoutException
     {
+        new ConnectionBurnTest().test();
+    }
+
+    @org.junit.Test
+    public void test() throws ExecutionException, InterruptedException, NoSuchFieldException, IllegalAccessException, TimeoutException
+    {
         GlobalInboundSettings inboundSettings = new GlobalInboundSettings()
                                                 .withQueueCapacity(1 << 18)
                                                 .withEndpointReserveLimit(1 << 20)

- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org