[jira] [Commented] (CASSANDRA-16850) Add client warnings and abort to tombstone and coordinator reads which go past a low/high watermark
[ https://issues.apache.org/jira/browse/CASSANDRA-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404790#comment-17404790 ]

David Capwell commented on CASSANDRA-16850:
-------------------------------------------

bq. The only remaining thing is that DefaultTrackWarnings doesn't need the enabled flag, since it's implicitly enabled

Fixed.

> Add client warnings and abort to tombstone and coordinator reads which go
> past a low/high watermark
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16850
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Observability/Logging
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 4.1
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> We currently abort queries if we hit too many tombstones, but it's common
> that we would also want to warn clients (client warnings) before we reach
> that point; it's also common for different logic to want to warn/abort on
> client options (such as reading a large partition). To allow this we should
> add a concept of low/high watermarks (warn/abort) to tombstones and
> coordinator reads.
> Another issue is that current aborts look the same as a random failure, so
> from an SLA point of view it would be good to differentiate between user
> behavior being rejected and an unexplained issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
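The low/high watermark behavior the ticket describes can be sketched as a simple two-threshold check. The class and enum names below are illustrative, not the actual Cassandra API from the patch:

```java
// Sketch of a warn/abort watermark check as described in CASSANDRA-16850.
// TombstoneWatermarks and Outcome are hypothetical names, not Cassandra's API.
public class TombstoneWatermarks
{
    public enum Outcome { OK, WARN, ABORT }

    private final long warnThreshold;   // low watermark: emit a client warning
    private final long abortThreshold;  // high watermark: reject the read

    public TombstoneWatermarks(long warnThreshold, long abortThreshold)
    {
        this.warnThreshold = warnThreshold;
        this.abortThreshold = abortThreshold;
    }

    public Outcome check(long tombstonesScanned)
    {
        if (tombstonesScanned >= abortThreshold)
            return Outcome.ABORT; // surfaced as a distinct rejection, not a generic failure
        if (tombstonesScanned >= warnThreshold)
            return Outcome.WARN;  // sent back as a native-protocol client warning
        return Outcome.OK;
    }

    public static void main(String[] args)
    {
        TombstoneWatermarks w = new TombstoneWatermarks(1000, 100000);
        System.out.println(w.check(50));      // OK
        System.out.println(w.check(5000));    // WARN
        System.out.println(w.check(200000));  // ABORT
    }
}
```

The same shape applies to coordinator-read sizes: one knob warns, a higher knob aborts, and the abort is distinguishable from an internal failure.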
[jira] [Commented] (CASSANDRA-16666) Make SSLContext creation pluggable/extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404786#comment-17404786 ]

Maulin Vasavada commented on CASSANDRA-16666:
---------------------------------------------

And I am still going to address two comments by Jon (I've not forgotten them :) ).

> Make SSLContext creation pluggable/extensible
> ---------------------------------------------
>
>                 Key: CASSANDRA-16666
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16666
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Maulin Vasavada
>            Assignee: Maulin Vasavada
>            Priority: Normal
>             Fix For: 4.x
>
> Currently Cassandra creates the SSLContext via SSLFactory.java. SSLFactory is
> a final class with static methods and cannot be overridden. SSLFactory loads
> the keys and certs from file-based artifacts. While this works for many, in
> industries where security is stricter and contextual, this approach falls
> short. Many large organizations need the flexibility to load SSL artifacts
> from a custom resource (a custom Key Management Solution, HashiCorp Vault,
> Amazon KMS, etc.). While the JSSE SecurityProvider architecture gives us the
> flexibility to build custom mechanisms to validate and process security
> artifacts, often all we need is to build on the extensibility that Java's
> Trust/Key Manager interfaces already provide to load keystores from various
> resources, absent any customized requirements on key/certificate formats.
> My proposal here is to make SSLContext creation pluggable/extensible and
> have the current SSLFactory.java implement an extensible interface.
> I contributed a similar change that is live now in Apache Kafka (2.6.0) -
> https://issues.apache.org/jira/browse/KAFKA-8890
> I can spare some time writing the pluggable interface and run it by the
> required reviewers.
>
> Created [CEP-9: Make SSLContext creation pluggable|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-9%3A+Make+SSLContext+creation+pluggable]
>
> cc: [~dcapwell] [~djoshi]
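The pluggable creation the ticket proposes can be sketched as a small factory interface with a default file-style implementation. The interface and method names below are hypothetical, not the final CEP-9 API:

```java
// Sketch of a pluggable SSLContext factory, in the spirit of CASSANDRA-16666.
// SslContextFactory / createSslContext are illustrative names, not the real API.
import javax.net.ssl.SSLContext;
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;

public class PluggableSsl
{
    /** Implementations could load keys/certs from files, HashiCorp Vault, a KMS, etc. */
    public interface SslContextFactory
    {
        SSLContext createSslContext() throws NoSuchAlgorithmException, KeyManagementException;
    }

    /** Default implementation mirroring today's file-based behavior (greatly simplified). */
    public static class DefaultSslContextFactory implements SslContextFactory
    {
        @Override
        public SSLContext createSslContext() throws NoSuchAlgorithmException, KeyManagementException
        {
            SSLContext ctx = SSLContext.getInstance("TLS");
            // Real code would init with KeyManagers/TrustManagers built from keystores;
            // nulls here fall back to the JRE defaults.
            ctx.init(null, null, null);
            return ctx;
        }
    }

    public static void main(String[] args) throws Exception
    {
        SslContextFactory factory = new DefaultSslContextFactory();
        System.out.println(factory.createSslContext().getProtocol()); // TLS
    }
}
```

A Vault- or KMS-backed deployment would supply its own `SslContextFactory` implementation, leaving the rest of the server unchanged.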
[jira] [Commented] (CASSANDRA-16666) Make SSLContext creation pluggable/extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404783#comment-17404783 ]

Maulin Vasavada commented on CASSANDRA-16666:
---------------------------------------------

Hi [~jmeredithco] and [~stefan.miklosovic], I've addressed the comments on the example. [~mck], for the documentation I feel I can start with doc/source/operating/security.rst and get the review done before adapting it to the new documentation format. That would work better for me given that I am new to asciidoc etc. Please let me know your thoughts.
[jira] [Assigned] (CASSANDRA-16175) Avoid removing batch when it's not created during view replication
[ https://issues.apache.org/jira/browse/CASSANDRA-16175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ekaterina Dimitrova reassigned CASSANDRA-16175:
-----------------------------------------------

    Assignee: Ekaterina Dimitrova

> Avoid removing batch when it's not created during view replication
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-16175
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16175
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Materialized Views
>            Reporter: Zhao Yang
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.x
>
> When the base replica is also a view replica we don't write a local batchlog,
> but the batch is unnecessarily removed when the view write succeeds, which
> creates (and persists) a tombstone in the system.batches table.
[jira] [Updated] (CASSANDRA-16850) Add client warnings and abort to tombstone and coordinator reads which go past a low/high watermark
[ https://issues.apache.org/jira/browse/CASSANDRA-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-16850:
----------------------------------------

    Status: Ready to Commit  (was: Review In Progress)

Looks good. The only remaining thing is that {{DefaultTrackWarnings}} doesn't need the {{enabled}} flag, since it's implicitly enabled, but that can be addressed on commit. +1
[jira] [Updated] (CASSANDRA-16850) Add client warnings and abort to tombstone and coordinator reads which go past a low/high watermark
[ https://issues.apache.org/jira/browse/CASSANDRA-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-16850:
----------------------------------------

    Status: Review In Progress  (was: Patch Available)
[jira] [Commented] (CASSANDRA-14557) Consider adding default and required keyspace replication options
[ https://issues.apache.org/jira/browse/CASSANDRA-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404742#comment-17404742 ]

Aleksei Zotov commented on CASSANDRA-14557:
-------------------------------------------

Thanks [~sumanth.pasupuleti]! I replied to your comments on the same commit: [https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5]. Please take a look and let me know your thoughts.

> Consider adding default and required keyspace replication options
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-14557
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14557
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Sumanth Pasupuleti
>            Assignee: Sumanth Pasupuleti
>            Priority: Low
>              Labels: 4.0-feature-freeze-review-requested
>             Fix For: 4.x
>
>         Attachments: 14557-4.0.txt, 14557-trunk.patch
>
> Ending up with a keyspace of RF=1 is unfortunately pretty easy in C* right
> now - the system_auth keyspace, for example, is created with RF=1 (to
> accommodate single-node setups, afaict from CASSANDRA-5112), and a user can
> further create a keyspace with RF=1, posing availability and streaming risks
> (e.g. rebuild).
> I propose we add two configuration options in cassandra.yaml:
> # {{default_keyspace_rf}} (default: 1) - If a replication factor is not
> specified, use this number.
> # {{required_minimum_keyspace_rf}} (default: unset) - Prevent users from
> creating a keyspace with an RF less than what is configured.
> These settings could further be re-used to:
> * Provide defaults for new keyspaces created with SimpleStrategy or
> NetworkTopologyStrategy (CASSANDRA-14303)
> * Make the automatic token [allocation
> algorithm|https://issues.apache.org/jira/browse/CASSANDRA-13701?focusedCommentId=16095662&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16095662]
> interface more intuitive, allowing easy use of the new token allocation
> algorithm.
> At the end of the day, if someone really wants to allow RF=1, they simply
> don't set {{required_minimum_keyspace_rf}}. For backwards compatibility the
> default remains 1, so C* would continue to create keyspaces with RF=1 and
> allow any RF, matching current behavior.
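The two proposed settings compose into a small validation rule: fall back to the default when no RF is given, and reject RFs below the required minimum when one is configured. The {{default_keyspace_rf}} / {{required_minimum_keyspace_rf}} names come from the ticket; the validation class itself is illustrative:

```java
// Sketch of the default/required RF checks proposed in CASSANDRA-14557.
// KeyspaceRfPolicy is a hypothetical name; only the two setting names are from the ticket.
public class KeyspaceRfPolicy
{
    private final int defaultKeyspaceRf;             // used when no RF is specified (ticket default: 1)
    private final Integer requiredMinimumKeyspaceRf; // null = unset, any RF allowed

    public KeyspaceRfPolicy(int defaultKeyspaceRf, Integer requiredMinimumKeyspaceRf)
    {
        this.defaultKeyspaceRf = defaultKeyspaceRf;
        this.requiredMinimumKeyspaceRf = requiredMinimumKeyspaceRf;
    }

    /** Returns the RF to use, or throws if the requested RF is below the required minimum. */
    public int resolve(Integer requestedRf)
    {
        int rf = requestedRf == null ? defaultKeyspaceRf : requestedRf;
        if (requiredMinimumKeyspaceRf != null && rf < requiredMinimumKeyspaceRf)
            throw new IllegalArgumentException("replication_factor " + rf +
                " is below required_minimum_keyspace_rf " + requiredMinimumKeyspaceRf);
        return rf;
    }
}
```

With both settings unset/default (`new KeyspaceRfPolicy(1, null)`) every request passes, which matches the backwards-compatible behavior the ticket describes.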
[jira] [Updated] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-16879:
----------------------------------------

    Status: Review In Progress  (was: Patch Available)

> Verify correct ownership of attached locations on disk at C* startup
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-16879
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16879
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Startup and Shutdown
>            Reporter: Josh McKenzie
>            Assignee: Josh McKenzie
>            Priority: Normal
>             Fix For: 4.x
>
> There are two primary things related to startup and disk ownership we should
> mitigate.
> First, an instance can come up with an incorrectly mounted volume attached as
> its configured data directory. This causes the wrong system tables to be
> read. If the instance that was previously using the volume is also down, its
> token could be taken over by the instance coming up.
> Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate
> volume from the system tables. In this scenario, we need to ensure that all
> directories belong to the same instance, and that as the instance starts up
> it can access all the directories it expects to (including the data, commit
> log, hints, and saved cache directories).
[jira] [Commented] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404737#comment-17404737 ]

Caleb Rackliffe commented on CASSANDRA-16879:
---------------------------------------------

Very minor note... I think we have typically included all authors WRT the {{Co-authored-by}} tag. In this case, I think that would mean both you and Sam are tagged. (I can see how you could go that way with it, of course, if the main author tag already pointed to Sam. Not a huge deal either way I guess, unless we had some reporting/analytics that assumed "primary" authors are also technically co-authors...)
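One way to implement the ownership verification described in this ticket is to stamp every configured directory (data, commit log, hints, saved caches) with the node's identity on first start and fail fast on later starts if any stamp is missing or mismatched. The file name and layout below are illustrative, not what CASSANDRA-16879 actually ships:

```java
// Sketch of a startup disk-ownership check in the spirit of CASSANDRA-16879.
// TOKEN_FILE and the stamp/verify methods are hypothetical, not the real patch.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class DiskOwnershipCheck
{
    static final String TOKEN_FILE = "node_ownership_token"; // hypothetical file name

    /** On first start, stamp each attached directory with this node's id. */
    public static void stamp(List<Path> dirs, String nodeId) throws IOException
    {
        for (Path dir : dirs)
            Files.writeString(dir.resolve(TOKEN_FILE), nodeId);
    }

    /** On later starts, fail fast if any dir is missing or stamped with another node's id. */
    public static void verify(List<Path> dirs, String nodeId) throws IOException
    {
        for (Path dir : dirs)
        {
            Path token = dir.resolve(TOKEN_FILE);
            if (!Files.exists(token))
                throw new IllegalStateException("No ownership token in " + dir + "; wrong volume mounted?");
            String owner = Files.readString(token).trim();
            if (!owner.equals(nodeId))
                throw new IllegalStateException(dir + " belongs to node " + owner + ", not " + nodeId);
        }
    }
}
```

Failing before any system table is read prevents both scenarios from the ticket: booting off a foreign volume, and a JBOD setup where only some of the directories belong to this instance.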
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: Review In Progress  (was: Patch Available)

> Bump zstd-jni version to 1.5.0-4
> --------------------------------
>
>                 Key: CASSANDRA-16884
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16884
>             Project: Cassandra
>          Issue Type: Task
>          Components: Build
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: High
>
> The current zstd-jni version (1.3.8-5) was released on 04/12/2019. There has
> been a lot of development on the zstd library and the JNI binding during the
> intervening 2.5 years, including a fuzzer that detected a handful of
> corruption bugs, as well as performance improvements.
> The zstd-jni version tracks that of the native library. The current native
> lib version is 1.5.0, hence 1.5.0-4 for zstd-jni.
> I am proposing bumping the zstd-jni version to the current release.
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: Patch Available  (was: In Progress)
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: Ready to Commit  (was: Review In Progress)

+1
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: Patch Available  (was: In Progress)
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-16884:
-------------------------------------

    Status: In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-16842) Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
[ https://issues.apache.org/jira/browse/CASSANDRA-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-16842:
----------------------------------------

          Fix Version/s: (was: 4.x)
                         4.1
    Source Control Link: https://github.com/apache/cassandra/commit/f9b7c1e6984f5b81aae1e3a2191d4e9599db15ae
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

Committed to trunk as https://github.com/apache/cassandra/commit/f9b7c1e6984f5b81aae1e3a2191d4e9599db15ae

> Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-16842
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16842
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Commit Log
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.1
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> CommitLog sync markers are written in two phases. In the first, zeroes are
> written for the position of the next sync marker and the sync marker CRC
> value. In the second, when the next sync marker is written, the actual
> position and CRC values are written. If the process shuts down in a
> disorderly fashion, it is entirely possible for a valid next-marker position
> to be written to our memory-mapped file but not the final CRC value. Later,
> when we attempt to replay the segment, we will fail without recovering any of
> the perfectly valid mutations it contains. (This assumes we're confining
> ourselves to the case where there is no compression or encryption.)
> {noformat}
> ERROR 2020-11-18T10:55:23,888 [main] org.apache.cassandra.utils.JVMStabilityInspector:102 - Exiting due to error while processing commit log during initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Encountered bad header at position 23091775 of commit log …/CommitLog-6-1605699607608.log, with invalid CRC. The end of segment marker should be zero.
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:731)
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.readSyncMarker(CommitLogReplayer.java:274)
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:436)
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:189)
> 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:170)
> 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:151)
> 	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:332)
> 	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:656)
> 	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:808)
> {noformat}
> It may be useful to provide an option that overrides the default strict
> behavior here and skips the CRC check when a non-zero end position is
> present, allowing valid mutations to be recovered and startup to proceed.
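The two-phase write described above means a crash can leave a valid next-marker position paired with a never-finalized (zero) CRC. The lenient handling the ticket proposes can be sketched as follows; the names here are illustrative, and the real opt-in flag is the system property added by the committed patch:

```java
// Sketch of the lenient sync-marker decision from CASSANDRA-16842:
// with the operator opt-in, a non-zero next-marker position with a zero
// (never-written) CRC is tolerated instead of aborting replay.
// SyncMarkerCheck / Result are hypothetical names, not the real patch.
public class SyncMarkerCheck
{
    public enum Result { VALID, SKIP_CRC, CORRUPT }

    public static Result check(boolean allowSkipSyncMarkerCrc,
                               int nextMarkerPosition, int storedCrc, int computedCrc)
    {
        if (storedCrc == computedCrc)
            return Result.VALID;
        // A crash between writing the position and the CRC leaves the CRC as zero.
        if (allowSkipSyncMarkerCrc && nextMarkerPosition != 0 && storedCrc == 0)
            return Result.SKIP_CRC; // recover the valid mutations instead of failing startup
        return Result.CORRUPT;
    }
}
```

Keeping the strict path as the default preserves today's corruption detection; the skip only applies when the stored CRC is exactly the never-finalized zero value and the operator has opted in.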
[cassandra] branch trunk updated: Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
This is an automated email from the ASF dual-hosted git repository.

maedhroz pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new f9b7c1e  Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
f9b7c1e is described below

commit f9b7c1e6984f5b81aae1e3a2191d4e9599db15ae
Author: Marcus Eriksson
AuthorDate: Mon Jan 11 10:55:44 2021 +0100

    Allow CommitLogSegmentReader to optionally skip sync marker CRC checks

    patch by Caleb Rackliffe; reviewed by Josh McKenzie for CASSANDRA-16842

    Co-authored-by: Jordan West
    Co-authored-by: Caleb Rackliffe
    Co-authored-by: Marcus Eriksson
---
 CHANGES.txt                                        |   1 +
 .../db/commitlog/CommitLogSegmentReader.java       |  29 +++++
 .../cassandra/db/commitlog/CommitLogTest.java      | 128 ++++++++++++++
 3 files changed, 137 insertions(+), 21 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index a9c8ebd..be3ea40 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * Allow CommitLogSegmentReader to optionally skip sync marker CRC checks (CASSANDRA-16842)
  * allow blocking IPs from updating metrics about traffic (CASSANDRA-16859)
  * Request-Based Native Transport Rate-Limiting (CASSANDRA-16663)
  * Implement nodetool getauditlog command (CASSANDRA-16725)

diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentReader.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentReader.java
index e23a915..33e70c1 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentReader.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentReader.java
@@ -26,6 +26,10 @@ import javax.crypto.Cipher;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.AbstractIterator;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream.ChunkProvider;
 import org.apache.cassandra.db.commitlog.CommitLogReadHandler.*;
 import org.apache.cassandra.io.FSReadError;
@@ -46,6 +50,11 @@ import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
  */
 public class CommitLogSegmentReader implements Iterable
 {
+    public static final String ALLOW_IGNORE_SYNC_CRC = Config.PROPERTY_PREFIX + "commitlog.allow_ignore_sync_crc";
+    private static volatile boolean allowSkipSyncMarkerCrc = Boolean.getBoolean(ALLOW_IGNORE_SYNC_CRC);
+
+    private static final Logger logger = LoggerFactory.getLogger(CommitLogSegmentReader.class);
+
     private final CommitLogReadHandler handler;
     private final CommitLogDescriptor descriptor;
     private final RandomAccessReader reader;
@@ -75,6 +84,11 @@ public class CommitLogSegmentReader implements Iterable
     public Iterator iterator()
     {
@@ -151,8 +165,23 @@ public class CommitLogSegmentReader implements Iterable
-    public static Collection generateData()
+    public static Collection generateData() throws Exception
+    {
+        return Arrays.asList(new Object[][]
+        {
+            { null, EncryptionContextGenerator.createDisabledContext()}, // No compression, no encryption
+            { null, newEncryptionContext() }, // Encryption
+            { new ParameterizedClass(LZ4Compressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext() },
+            { new ParameterizedClass(SnappyCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()},
+            { new ParameterizedClass(DeflateCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()},
+            { new ParameterizedClass(ZstdCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()}
+        });
+    }
+
+    private static EncryptionContext newEncryptionContext() throws Exception
     {
-        return Arrays.asList(new Object[][]{
-        {null, EncryptionContextGenerator.createDisabledContext()}, // No compression, no encryption
-        {null, EncryptionContextGenerator.createContext(true)}, // Encryption
-        {new ParameterizedClass(LZ4Compressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()},
-        {new ParameterizedClass(SnappyCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()},
-        {new ParameterizedClass(DeflateCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()},
-        {new ParameterizedClass(ZstdCompressor.class.getName(), Collections.emptyMap()), EncryptionContextGenerator.createDisabledContext()}});
+        EncryptionContext context = EncryptionContextGenerator.createContext(true);
+
[jira] [Updated] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16879: Reviewers: Caleb Rackliffe > Verify correct ownership of attached locations on disk at C* startup > > > Key: CASSANDRA-16879 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16879 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.x > > > There's two primary things related to startup and disk ownership we should > mitigate. > First, an instance can come up with an incorrectly mounted volume attached as > its configured data directory. This causes the wrong system tables to be > read. If the instance which was previously using the volume is also down, its > token could be taken over by the instance coming up. > Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate > volume to the system tables. In this scenario, we need to ensure that all > directories belong to the same instance, and that as the instance starts up > it can access all the directories it expects to be able to. (including data, > commit log, hints and saved cache dirs) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
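One common way to implement the kind of startup verification this ticket describes is a marker file in each attached location recording the owning node's id. The sketch below is a hypothetical illustration of that approach only; the marker name, helpers, and method names are invented, not the actual patch:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.UUID;

public class StartupDiskOwnershipCheck
{
    // Hypothetical marker file name; the real patch may use a different scheme entirely
    static final String MARKER = ".node_ownership";

    /** Stamps the directory with this node's id if no marker exists yet. */
    static void stampDirectory(Path dir, UUID nodeId)
    {
        try
        {
            Path marker = dir.resolve(MARKER);
            if (!Files.exists(marker))
                Files.write(marker, nodeId.toString().getBytes(StandardCharsets.UTF_8));
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * True only if every attached location (data, commit log, hints, saved caches)
     * carries this node's id - catching both a wrongly mounted data volume and a
     * JBOD disk that belongs to a different instance.
     */
    static boolean ownsAll(List<Path> dirs, UUID nodeId)
    {
        try
        {
            for (Path dir : dirs)
            {
                Path marker = dir.resolve(MARKER);
                if (!Files.exists(marker))
                    return false; // never stamped: refuse to start until an operator intervenes
                String stored = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
                if (!stored.equals(nodeId.toString()))
                    return false; // stamped by another instance: wrong volume attached
            }
            return true;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    /** Small helper so callers (and tests) avoid checked IO exceptions. */
    static Path tempDir(String prefix)
    {
        try { return Files.createTempDirectory(prefix); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }
}
```

A node failing this check at startup would avoid both failure modes in the description: reading another instance's system tables, and taking over a down instance's token.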
[jira] [Comment Edited] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404621#comment-17404621 ] Josh McKenzie edited comment on CASSANDRA-16879 at 8/25/21, 7:23 PM: - A few failures I'm confident are unrelated to the ticket. * JDK11: testUnloggedPartitionsPerBatch which passes locally. Think this was a circle config + env issue w/timeout. * replaceAliveHost which is generally failing at the moment * JDK8: incompletePropose which OOM'ed - seen on a couple of other branches and unrelated to this ticket. Passing fine locally and on JDK11. ||Item|Link| |JDK8 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/ff19f043-dc27-4d83-baf4-0510614a9c0c]| |JDK11 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/6580b51c-8254-478b-a1f0-7cd6b6392c31]| |Branch|[Link|https://github.com/apache/cassandra/compare/cassandra-4.0...josh-mckenzie:CASSANDRA-16879?expand=1]| was (Author: jmckenzie): A few failures I'm confident are unrelated to the ticket. * JDK11: testUnloggedPartitionsPerBatch which passes locally. Think this was a circle config + env issue w/timeout. * replaceAliveHost which is failing in generaly atm * JDK8: incompletePropose which OOM'ed - see this on a couple other branches and unrelated to this ticket. Passing fine locally and on JDK11. 
||Item|Link| |JDK8 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/ff19f043-dc27-4d83-baf4-0510614a9c0c]| |JDK11 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/6580b51c-8254-478b-a1f0-7cd6b6392c31]| |Branch|[Link|https://github.com/apache/cassandra/compare/cassandra-4.0...josh-mckenzie:CASSANDRA-16879?expand=1]| > Verify correct ownership of attached locations on disk at C* startup > > > Key: CASSANDRA-16879 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16879 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.x > > > There's two primary things related to startup and disk ownership we should > mitigate. > First, an instance can come up with an incorrectly mounted volume attached as > its configured data directory. This causes the wrong system tables to be > read. If the instance which was previously using the volume is also down, its > token could be taken over by the instance coming up. > Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate > volume to the system tables. In this scenario, we need to ensure that all > directories belong to the same instance, and that as the instance starts up > it can access all the directories it expects to be able to. (including data, > commit log, hints and saved cache dirs) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16884: -- Change Category: Quality Assurance Complexity: Low Hanging Fruit Reviewers: Dinesh Joshi Priority: High (was: Normal) Status: Open (was: Triage Needed) PR: [https://github.com/apache/cassandra/pull/1170] CI: [https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-16884%2F4.0] This is a simple one-liner patch just to bump the version of zstd-jni. > Bump zstd-jni version to 1.5.0-4 > > > Key: CASSANDRA-16884 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16884 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: High > > The current zstd-jni version (1.3.8-5) was released in 04/12/2019. There has > been a lot of development on the zstd library and the jni binding during the > 2.5 years, including a fuzzer which detected a handful of corruption bugs and > performance improvements. > The version of zstd-jni maps with the one of the native library. The current > native lib version is 1.5.0, hence 1.5.0-4 for zstd-jni. > I am proposing bumping the zstd-jni version to the current. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
[ https://issues.apache.org/jira/browse/CASSANDRA-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16884: -- Test and Documentation Plan: ci Status: Patch Available (was: Open) > Bump zstd-jni version to 1.5.0-4 > > > Key: CASSANDRA-16884 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16884 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: High > > The current zstd-jni version (1.3.8-5) was released in 04/12/2019. There has > been a lot of development on the zstd library and the jni binding during the > 2.5 years, including a fuzzer which detected a handful of corruption bugs and > performance improvements. > The version of zstd-jni maps with the one of the native library. The current > native lib version is 1.5.0, hence 1.5.0-4 for zstd-jni. > I am proposing bumping the zstd-jni version to the current. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16884) Bump zstd-jni version to 1.5.0-4
Yifan Cai created CASSANDRA-16884: - Summary: Bump zstd-jni version to 1.5.0-4 Key: CASSANDRA-16884 URL: https://issues.apache.org/jira/browse/CASSANDRA-16884 Project: Cassandra Issue Type: Task Components: Build Reporter: Yifan Cai Assignee: Yifan Cai The current zstd-jni version (1.3.8-5) was released on 04/12/2019. There has been a lot of development on the zstd library and the jni binding in the 2.5 years since, including a fuzzer that detected a handful of corruption bugs, as well as performance improvements. The zstd-jni version maps to that of the native library. The current native lib version is 1.5.0, hence 1.5.0-4 for zstd-jni. I am proposing bumping the zstd-jni version to the current release. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16721) Repaired data tracking on a read coordinator is susceptible to races between local and remote requests
[ https://issues.apache.org/jira/browse/CASSANDRA-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404655#comment-17404655 ] Sam Tunnicliffe commented on CASSANDRA-16721: - Looks good to me, modulo a [few|https://github.com/apache/cassandra/pull/1160/files#r695998655], [trivial|https://github.com/apache/cassandra/pull/1160/files#r696006742], [nits|https://github.com/apache/cassandra/pull/1160/files#r696008838] (sorry, I forgot it was a PR not just a branch). I'm also a fan of Alex's suggestion to replace {{TEST_FORCE_ASYNC_LOCAL_READS}} with some ByteBuddy manipulation in the test. All of the above can be fixed (or not) on commit, so +1 from me too & thanks! > Repaired data tracking on a read coordinator is susceptible to races between > local and remote requests > -- > > Key: CASSANDRA-16721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16721 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination >Reporter: Sam Tunnicliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 1.5h > Remaining Estimate: 0h > > At read time on a coordinator which is also a replica, the local and remote > reads can race such that the remote responses are received while the local > read is executing. If the remote responses are mismatching, triggering a > {{DigestMismatchException}} and subsequent round of full data reads and read > repair, the local runnable may find the {{isTrackingRepairedStatus}} flag > flipped mid-execution. If this happens after a certain point in execution, > it would mean > that the RepairedDataInfo instance in use is the singleton null object > {{RepairedDataInfo.NULL_REPAIRED_DATA_INFO}}. If this happens, it can lead to > an NPE when calling {{RepairedDataInfo::extend}} when the local results are > iterated. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
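The race described above - the {{isTrackingRepairedStatus}} flag flipped by remote responses while the local read is mid-execution, leaving the null-object {{RepairedDataInfo}} in play - is the classic read-twice hazard: state consulted at two points can disagree. The usual remedy is to capture the flag, and everything derived from it, exactly once. The sketch below uses hypothetical names and simulates the concurrent flip deterministically with an injected supplier; it is not the actual Cassandra code:

```java
import java.util.function.BooleanSupplier;

public class CaptureOnce
{
    // Stand-ins for RepairedDataInfo.NULL_REPAIRED_DATA_INFO and a real tracking instance
    static final String NULL_INFO = "NULL_INFO";
    static final String REAL_INFO = "REAL_INFO";

    /** Racy shape: the flag is consulted twice, so a flip between the two reads
     *  can pair "tracking enabled" with the null-object info (the NPE scenario). */
    static String racy(BooleanSupplier isTracking)
    {
        String info = isTracking.getAsBoolean() ? REAL_INFO : NULL_INFO; // first read
        // ... local read executes; meanwhile a digest mismatch flips the flag ...
        if (isTracking.getAsBoolean())                                   // second read
            return "extend:" + info; // may be extending the null object
        return "skip";
    }

    /** Fixed shape: read the flag once and derive everything from that snapshot. */
    static String safe(BooleanSupplier isTracking)
    {
        boolean tracking = isTracking.getAsBoolean(); // single read
        String info = tracking ? REAL_INFO : NULL_INFO;
        return tracking ? "extend:" + info : "skip";
    }
}
```

Driving both methods with a supplier that returns false on the first call and true afterwards shows the difference: the racy version ends up "extending" the null object, while the snapshot version consistently skips.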
[jira] [Updated] (CASSANDRA-16721) Repaired data tracking on a read coordinator is susceptible to races between local and remote requests
[ https://issues.apache.org/jira/browse/CASSANDRA-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16721: Reviewers: Alex Petrov, Caleb Rackliffe, Sam Tunnicliffe, Sam Tunnicliffe (was: Alex Petrov, Caleb Rackliffe, Sam Tunnicliffe) Alex Petrov, Caleb Rackliffe, Sam Tunnicliffe, Sam Tunnicliffe (was: Caleb Rackliffe, Sam Tunnicliffe) Status: Review In Progress (was: Patch Available) > Repaired data tracking on a read coordinator is susceptible to races between > local and remote requests > -- > > Key: CASSANDRA-16721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16721 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination >Reporter: Sam Tunnicliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 1.5h > Remaining Estimate: 0h > > At read time on a coordinator which is also a replica, the local and remote > reads can race such that the remote responses are received while the local > read is executing. If the remote responses are mismatching, triggering a > {{DigestMismatchException}} and subsequent round of full data reads and read > repair, the local runnable may find the {{isTrackingRepairedStatus}} flag > flipped mid-execution. If this happens after a certain point in execution, > it would mean > that the RepairedDataInfo instance in use is the singleton null object > {{RepairedDataInfo.NULL_REPAIRED_DATA_INFO}}. If this happens, it can lead to > an NPE when calling {{RepairedDataInfo::extend}} when the local results are > iterated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16883) Weak visibility guarantees of Accumulator can lead to failure to recognize digest mismatches
Caleb Rackliffe created CASSANDRA-16883: --- Summary: Weak visibility guarantees of Accumulator can lead to failure to recognize digest mismatches Key: CASSANDRA-16883 URL: https://issues.apache.org/jira/browse/CASSANDRA-16883 Project: Cassandra Issue Type: Bug Components: Consistency/Coordination Reporter: Caleb Rackliffe Assignee: Caleb Rackliffe The context for this problem is largely the same as CASSANDRA-16807. The difference is that for 4.0+, CASSANDRA-16097 added an assertion to {{DigestResolver#responseMatch()}} that ensures the responses snapshot has at least one visible element (although of course a single element trivially cannot generate a mismatch and short-circuits immediately). In 3.0 and 3.11, this assertion does not exist, and when the underlying problem occurs (i.e. zero responses are visible on {{Accumulator}} when there should be 2), we can silently avoid the digest matching entirely. This seems like it would both make it impossible to do a potentially necessary full data read to resolve the correct response and prevent repair. The fix here should be similar to the one in CASSANDRA-16807, although there might be some test infrastructure that needs porting in order to make that work. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
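The visibility problem described here - responses added by one thread not yet visible to the thread snapshotting the {{Accumulator}} - is avoided when elements are published through volatile writes, e.g. via {{AtomicReferenceArray}}. The following is a deliberately simplified, hypothetical sketch of that idea, not Cassandra's actual Accumulator:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Simplified accumulator: each add claims a slot, then publishes the element
 *  with a volatile write, so a snapshot taken on another thread is guaranteed
 *  to see every fully completed add (illustrative names only). */
public class VisibleAccumulator<T>
{
    private final AtomicReferenceArray<T> values; // volatile-semantics element access
    private final AtomicInteger nextIndex = new AtomicInteger();

    public VisibleAccumulator(int capacity)
    {
        values = new AtomicReferenceArray<>(capacity);
    }

    public void add(T value)
    {
        int i = nextIndex.getAndIncrement(); // claim a slot
        values.set(i, value);                // volatile write publishes the element
    }

    /** Collects every published element; a slot still null was simply not yet
     *  fully added, never a stale view of a completed add. */
    public List<T> snapshot()
    {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < values.length(); i++)
        {
            T v = values.get(i); // volatile read
            if (v != null)
                out.add(v);
        }
        return out;
    }
}
```

With a plain {{T[]}} and a plain {{int}} counter, the Java memory model permits the snapshotting thread to observe the counter update without the element writes - exactly the "zero visible responses when there should be 2" symptom in the description.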
[jira] [Updated] (CASSANDRA-16883) Weak visibility guarantees of Accumulator can lead to failure to recognize digest mismatches
[ https://issues.apache.org/jira/browse/CASSANDRA-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16883: Bug Category: Parent values: Correctness(12982)Level 1 values: Consistency(12989) Complexity: Normal Discovered By: Fuzz Test Fix Version/s: 3.11.x 3.0.x Severity: Critical Status: Open (was: Triage Needed) > Weak visibility guarantees of Accumulator can lead to failure to recognize > digest mismatches > > > Key: CASSANDRA-16883 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16883 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > The context for this problem is largely the same as CASSANDRA-16807. The > difference is that for 4.0+, CASSANDRA-16097 added an assertion to > {{DigestResolver#responseMatch()}} that ensures the responses snapshot has at > least one visible element (although of course only one element trivially > cannot generate a mismatch and short-circuits immediately). In 3.0 and 3.11, > this assertion does not exist, and when the underlying problem occurs (i.e. > zero responses are visible on {{Accumulator}} when there should be 2), we can > silently avoid the digest matching entirely. This seems like it would make it > both impossible to do a potentially necessary full data read to resolve the > correct response and prevent repair. > The fix here should be similar to the one in CASSANDRA-16807, although there > might be some test infrastructure that needs porting in order to make that > work. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404638#comment-17404638 ] Brandon Williams commented on CASSANDRA-16873: -- Alright - I think we've said enough on this ticket - let's follow up on the ML. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip > state never transitioning. > Rather than implicitly requiring operators to bounce the node by throwing an > exception, we should instead suppress the exception when checking if a node > is replacing the same host address and ID if we get an UnknownHostException. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404637#comment-17404637 ] Benedict Elliott Smith commented on CASSANDRA-16873: bq. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). I'm not sure this works out as simply as you suppose. The review burden for each patch increases the further the branches drift from each other. This was the very reason I wanted to backport the simulator stability work, so as to reduce the review burden for work that _does_ need to be backported (of which there will be a lot). Limiting ourselves to the least-possible backports potentially makes each backport costlier, reducing our review bandwidth for trunk. In reality this cost starts being accounted for as contributors shy away from back porting necessary work because of the additional burden. bq. I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. Again, as far as governance goes there is no need to seek a waiver for anything that isn't a feature - however that is defined. We need to vote on new project governance documents if we want to impose any stronger restrictions that require a waiver. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. 
This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip > state never transitioning. > Rather than implicitly requiring operators to bounce the node by throwing an > exception, we should instead suppress the exception when checking if a node > is replacing the same host address and ID if we get an UnknownHostException. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
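The change proposed in this ticket - suppressing the {{UnknownHostException}} in the replacing-same-host-address check rather than letting it abort the gossip state transition - can be sketched as below. The resolver indirection and all names are hypothetical, introduced only so the DNS step is substitutable; this is not the actual patch:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReplacementCheck
{
    /** Resolver indirection so the DNS lookup can be substituted in tests. */
    interface Resolver
    {
        InetAddress resolve(String hostname) throws UnknownHostException;
    }

    /**
     * Returns true only when the hostname resolves to the given address.
     * A missing DNS entry (a transient state during host replacement) is
     * treated as "not the same host" instead of propagating and leaving
     * the node's gossip state stuck in JOINING on some peers.
     */
    static boolean replacingSameHostAddress(Resolver resolver, String hostname, InetAddress expected)
    {
        try
        {
            return resolver.resolve(hostname).equals(expected);
        }
        catch (UnknownHostException e)
        {
            // Previously an exception here implicitly required an operator bounce;
            // suppressing it lets the replacement complete on its own.
            return false;
        }
    }
}
```

The key behavioral point is the catch block: the lookup failure is downgraded from a fatal error to a negative answer for this one check.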
[jira] [Comment Edited] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404624#comment-17404624 ] Michael Semb Wever edited comment on CASSANDRA-16873 at 8/25/21, 5:30 PM: -- My two cents, this patch falls into that "small improvement that doesn't do much else but fix something" category, and should be included in 4.0.x bq. Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix so it can be included in a release. I agree that it is really important that we avoid any encouragement of gaming what a bug is. We do have some precedence for distinguishing "small improvements ok to go into patch versions" versus normal improvements, and we have been seeing more of it on the ML recently. But we have no guidelines in place how to make the distinction (it's being worked on). IMHO this is going to bite us now that we commit to annual releases (and are still ironing out what "stable trunk" means for us and how to achieve it) with folk encouraged to see the severity or need of an improvement as a reason to re-classify it. Including an improvement into a Patch version should be a decision based on what the patch touches and does. Until we clear up what that actually means, I am against improvements going into a Patch version by default, without first some discussion and consensus (on the ticket) to apply the waiver. My reasoning for voting on taking a more limited approach is… if we are to grow the community, and build momentum, it is going to be much harder for us to ensure quality (stable branches) through reviews. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). 
I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. was (Author: michaelsembwever): My two cents, this patch falls into that "small improvement that doesn't do much else but fix something" category, and should be included in 4.0.x bq. Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix so it can be included in a release. I agree that it is really important that we avoid any encouragement of gaming what a bug is. We do have some precedence for distinguishing "small improvements ok to go into patch versions" versus normal improvements, and we have been seeing more of it on the ML recently. But we have no guidelines in place how to make the distinction (it's being worked on). IMHO this is going to bite us now that we commit to annual releases (and are still ironing out what "stable trunk" means for us and how to achieve it) with folk encouraged to see the severity or need of an improvement as a reason to re-classify it. Including an improvement into a Patch version should be a decision based on what the patch touches and does. Until we clear up what that actually means, I am against improvements going into a Patch version by default, without first some discussion and consensus (on the ticket) to apply the waiver. My reasoning for voting on taking a more limited approach is… if we are to grow the community, and build momentum, it is going to be much harder for us to ensure quality (stable branches) through reviews only. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). 
I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception
[jira] [Comment Edited] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404624#comment-17404624 ] Michael Semb Wever edited comment on CASSANDRA-16873 at 8/25/21, 5:26 PM: -- My two cents, this patch falls into that "small improvement that doesn't do much else but fix something" category, and should be included in 4.0.x bq. Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix so it can be included in a release. I agree that it is really important that we avoid any encouragement of gaming what a bug is. We do have some precedence for distinguishing "small improvements ok to go into patch versions" versus normal improvements, and we have been seeing more of it on the ML recently. But we have no guidelines in place how to make the distinction (it's being worked on). IMHO this is going to bite us now that we commit to annual releases (and are still ironing out what "stable trunk" means for us and how to achieve it) with folk encouraged to see the severity or need of an improvement as a reason to re-classify it. Including an improvement into a Patch version should be a decision based on what the patch touches and does. Until we clear up what that actually means, I am against improvements going into a Patch version by default, without first some discussion and consensus (on the ticket) to apply the waiver. My reasoning for voting on taking a more limited approach is… if we are to grow the community, and build momentum, it is going to be much harder for us to ensure quality (stable branches) through reviews only. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). 
I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. was (Author: michaelsembwever): My two cents, this patch falls into that "small improvement that doesn't do much else but fix something" category. bq. Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix so it can be included in a release. I agree that it is really important that we avoid any encouragement of gaming what a bug is. We do have some precedence for distinguishing "small improvements ok to go into patch versions" versus normal improvements, and we have been seeing more of it on the ML recently. But we have no guidelines in place how to make the distinction (it's being worked on). IMHO this is going to bite us now that we commit to annual releases (and are still ironing out what "stable trunk" means for us and how to achieve it) with folk encouraged to see the severity or need of an improvement as a reason to re-classify it. Including an improvement into a Patch version should be a decision based on what the patch touches and does. Until we clear up what that actually means, I am against improvements going into a Patch version by default, without first some discussion and consensus (on the ticket) to apply the waiver. My reasoning for voting on taking a more limited approach is… if we are to grow the community, and build momentum, it is going to be much harder for us to ensure quality (stable branches) through reviews only. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). 
I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip >
[jira] [Commented] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404624#comment-17404624 ] Michael Semb Wever commented on CASSANDRA-16873: My two cents, this patch falls into that "small improvement that doesn't do much else but fix something" category. bq. Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix so it can be included in a release. I agree that it is really important that we avoid any encouragement of gaming what a bug is. We do have some precedence for distinguishing "small improvements ok to go into patch versions" versus normal improvements, and we have been seeing more of it on the ML recently. But we have no guidelines in place how to make the distinction (it's being worked on). IMHO this is going to bite us now that we commit to annual releases (and are still ironing out what "stable trunk" means for us and how to achieve it) with folk encouraged to see the severity or need of an improvement as a reason to re-classify it. Including an improvement into a Patch version should be a decision based on what the patch touches and does. Until we clear up what that actually means, I am against improvements going into a Patch version by default, without first some discussion and consensus (on the ticket) to apply the waiver. My reasoning for voting on taking a more limited approach is… if we are to grow the community, and build momentum, it is going to be much harder for us to ensure quality (stable branches) through reviews only. And so it is for our sanity to focus development on trunk as much as possible. That is, the stability of the code remains dependent on reviews, where our ability to review is limited, and reviewing patches for multiple branches does costs more (in both attention and in time). 
I can also see that by encouraging the discussions to establish the waivers we can more organically grow the guideline documentation around it. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip > state never transitioning. > Rather than implicitly requiring operators to bounce the node by throwing an > exception, we should instead suppress the exception when checking if a node > is replacing the same host address and ID if we get an UnknownHostException. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-16879: -- Test and Documentation Plan: New testing. Will need to document this on the official docs page based on javadocs in classes. Status: Patch Available (was: Open) > Verify correct ownership of attached locations on disk at C* startup > > > Key: CASSANDRA-16879 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16879 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > > There are two primary things related to startup and disk ownership we should > mitigate. > First, an instance can come up with an incorrectly mounted volume attached as > its configured data directory. This causes the wrong system tables to be > read. If the instance which was previously using the volume is also down, its > token could be taken over by the instance coming up. > Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate > volume to the system tables. In this scenario, we need to ensure that all > directories belong to the same instance, and that as the instance starts up > it can access all the directories it expects to be able to. (including data, > commit log, hints and saved cache dirs) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-16879: -- Fix Version/s: 4.x > Verify correct ownership of attached locations on disk at C* startup > > > Key: CASSANDRA-16879 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16879 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.x > > > There are two primary things related to startup and disk ownership we should > mitigate. > First, an instance can come up with an incorrectly mounted volume attached as > its configured data directory. This causes the wrong system tables to be > read. If the instance which was previously using the volume is also down, its > token could be taken over by the instance coming up. > Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate > volume to the system tables. In this scenario, we need to ensure that all > directories belong to the same instance, and that as the instance starts up > it can access all the directories it expects to be able to. (including data, > commit log, hints and saved cache dirs) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16879) Verify correct ownership of attached locations on disk at C* startup
[ https://issues.apache.org/jira/browse/CASSANDRA-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404621#comment-17404621 ] Josh McKenzie commented on CASSANDRA-16879: --- A few failures I'm confident are unrelated to the ticket. * JDK11: testUnloggedPartitionsPerBatch which passes locally. Think this was a circle config + env issue w/timeout. * replaceAliveHost which is failing in general atm * JDK8: incompletePropose which OOM'ed - see this on a couple other branches and unrelated to this ticket. Passing fine locally and on JDK11. ||Item|Link| |JDK8 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/ff19f043-dc27-4d83-baf4-0510614a9c0c]| |JDK11 tests|[Link|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/59/workflows/6580b51c-8254-478b-a1f0-7cd6b6392c31]| |Branch|[Link|https://github.com/apache/cassandra/compare/cassandra-4.0...josh-mckenzie:CASSANDRA-16879?expand=1]| > Verify correct ownership of attached locations on disk at C* startup > > > Key: CASSANDRA-16879 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16879 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > > There are two primary things related to startup and disk ownership we should > mitigate. > First, an instance can come up with an incorrectly mounted volume attached as > its configured data directory. This causes the wrong system tables to be > read. If the instance which was previously using the volume is also down, its > token could be taken over by the instance coming up. > Secondly, in a JBOD setup, the non-system keyspaces may reside on a separate > volume to the system tables. In this scenario, we need to ensure that all > directories belong to the same instance, and that as the instance starts up > it can access all the directories it expects to be able to. 
(including data, > commit log, hints and saved cache dirs) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
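The two-part ownership check described in the ticket (stamp each attached location with the owning instance, then verify every configured directory at startup) can be sketched roughly as follows. This is an illustrative sketch only — the marker file name, layout, and class are invented for this note and are not the actual patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch: each attached location carries a small marker file
// recording which instance owns it; startup fails fast if any configured
// directory (data, commit log, hints, saved caches) is inaccessible or
// stamped by a different instance.
public class DiskOwnershipCheck
{
    static final String MARKER = "ownership.token"; // invented file name

    // Stamp a directory on first use so later startups can verify it.
    static void stamp(Path dir, String instanceId) throws IOException
    {
        Path marker = dir.resolve(MARKER);
        if (!Files.exists(marker))
            Files.writeString(marker, instanceId);
    }

    // Returns true iff every directory is readable and stamped with our id.
    static boolean verify(List<Path> dirs, String instanceId) throws IOException
    {
        for (Path dir : dirs)
        {
            Path marker = dir.resolve(MARKER);
            if (!Files.isReadable(dir) || !Files.exists(marker))
                return false; // inaccessible or never stamped
            if (!Files.readString(marker).equals(instanceId))
                return false; // volume belongs to another instance
        }
        return true;
    }
}
```

A startup sequence would stamp all locations on first boot and refuse to proceed on any later boot where `verify` returns false, which covers both the wrongly-mounted-volume case and the mixed-ownership JBOD case.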
[jira] [Updated] (CASSANDRA-16877) High priority internode messages which exceed the large message threshold are dropped
[ https://issues.apache.org/jira/browse/CASSANDRA-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16877: Fix Version/s: (was: 4.0.x) 4.0.1 Since Version: 4.0-alpha1 Source Control Link: https://github.com/apache/cassandra/commit/b8242730918c2e8edec83aeafeeae8255378125d Resolution: Fixed Status: Resolved (was: Ready to Commit) Thanks. Agreed about the tests, so committed (with one nit addressed and one swerved) to 4.0 in {{b8242730918c2e8edec83aeafeeae8255378125d}} and merged up to trunk. > High priority internode messages which exceed the large message threshold are > dropped > - > > Key: CASSANDRA-16877 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16877 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0.1 > > > Currently, there is an assumption that internode messages whose verb has > priority P0 will always fit within a single messaging frame. While this is > usually the case, on occasion it is possible that this assumption does not > hold. One example is gossip messages during the startup shadow round, where > in very large clusters the digest ack can contain all states for every peer. > In this scenario, the respondent fails to send the ack which may lead to the > shadow round and, ultimately, the startup failing. > > We could tweak the shadow round acks to minimise the message size, but a more > robust solution would be to permit high priority messages to be sent on the > large messages connection when necessary. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16871) Add resource flags to CircleCi config generation script
[ https://issues.apache.org/jira/browse/CASSANDRA-16871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-16871: -- Fix Version/s: 4.x 4.0.x 3.11.x 3.0.x > Add resource flags to CircleCi config generation script > --- > > Key: CASSANDRA-16871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16871 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Low > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x > > > Currently we have three versions of the CircleCI config file using different > resources. Changing the resources configuration is as easy as copying the > desired template file, for example: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > {code} > If we want to make changes to the file, for example to set a specific dtest > repo or running the test multiplexer, we can run the provided generation > script, copy the template file and probably exclude the additional changes: > {code} > # edit config-2_1.yml > .circleci/generate.sh > cp .circleci/config.yml.MIDRES .circleci/config.yml > # undo the changes in config.yml.LOWRES, config.yml.MIDRES and > config.yml.HIGHRES > {code} > A very common alternative to this is just editing the environment variables > in the automatically generated {{config.yml}} file, which are repeated some > 19 times across the file: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > # edit config.yml, where env vars are repeated > {code} > I think we could do this slightly easier by adding a set of flags to the > generation script to apply the resources patch directly to {{config.yml}}, > without changing the templates: > {code} > # edit config-2_1.yml > .circleci/generate.sh -m > {code} > This has the advantage of not requiring manually editing the automatically > generated file and also providing some validation. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16871) Add resource flags to CircleCi config generation script
[ https://issues.apache.org/jira/browse/CASSANDRA-16871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-16871: -- Test and Documentation Plan: Testing should be done manually by running the modified script. The patch includes changes in the documentation for CircleCI. Status: Patch Available (was: In Progress) > Add resource flags to CircleCi config generation script > --- > > Key: CASSANDRA-16871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16871 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Low > > Currently we have three versions of the CircleCI config file using different > resources. Changing the resources configuration is as easy as copying the > desired template file, for example: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > {code} > If we want to make changes to the file, for example to set a specific dtest > repo or running the test multiplexer, we can run the provided generation > script, copy the template file and probably exclude the additional changes: > {code} > # edit config-2_1.yml > .circleci/generate.sh > cp .circleci/config.yml.MIDRES .circleci/config.yml > # undo the changes in config.yml.LOWRES, config.yml.MIDRES and > config.yml.HIGHRES > {code} > A very common alternative to this is just editing the environment variables > in the automatically generated {{config.yml}} file, which are repeated some > 19 times across the file: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > # edit config.yml, where env vars are repeated > {code} > I think we could do this slightly easier by adding a set of flags to the > generation script to apply the resources patch directly to {{config.yml}}, > without changing the templates: > {code} > # edit config-2_1.yml > .circleci/generate.sh -m > {code} > This has the advantage of not requiring manually editing the automatically > 
generated file and also providing some validation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16877) High priority internode messages which exceed the large message threshold are dropped
[ https://issues.apache.org/jira/browse/CASSANDRA-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16877: Status: Ready to Commit (was: Review In Progress) > High priority internode messages which exceed the large message threshold are > dropped > - > > Key: CASSANDRA-16877 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16877 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0.x > > > Currently, there is an assumption that internode messages whose verb has > priority P0 will always fit within a single messaging frame. While this is > usually the case, on occasion it is possible that this assumption does not > hold. One example is gossip messages during the startup shadow round, where > in very large clusters the digest ack can contain all states for every peer. > In this scenario, the respondent fails to send the ack which may lead to the > shadow round and, ultimately, the startup failing. > > We could tweak the shadow round acks to minimise the message size, but a more > robust solution would be to permit high priority messages to be sent on the > large messages connection when necessary. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated (5dd472e -> 39efc83)
This is an automated email from the ASF dual-hosted git repository. samt pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git. from 5dd472e Merge branch 'cassandra-4.0' into trunk new b824273 Remove assumption that all urgent messages are small new 39efc83 Merge branch 'cassandra-4.0' into trunk The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.txt| 1 + .../apache/cassandra/net/OutboundConnections.java | 25 +++-- .../cassandra/net/OutboundConnectionsTest.java | 60 +++--- 3 files changed, 62 insertions(+), 24 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 39efc8307acb62f8f5e9459269d193dc6f319037
Merge: 5dd472e b824273
Author: Sam Tunnicliffe
AuthorDate: Wed Aug 25 17:42:50 2021 +0100

    Merge branch 'cassandra-4.0' into trunk

 CHANGES.txt                                        |  1 +
 .../apache/cassandra/net/OutboundConnections.java  | 25 +++--
 .../cassandra/net/OutboundConnectionsTest.java     | 60 +++---
 3 files changed, 62 insertions(+), 24 deletions(-)

diff --cc CHANGES.txt
index a60b4e0,9ed3cec..a9c8ebd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,21 -1,5 +1,22 @@@
-4.0.1
+4.1
+ * allow blocking IPs from updating metrics about traffic (CASSANDRA-16859)
+ * Request-Based Native Transport Rate-Limiting (CASSANDRA-16663)
+ * Implement nodetool getauditlog command (CASSANDRA-16725)
+ * Clean up repair code (CASSANDRA-13720)
+ * Background schedule to clean up orphaned hints files (CASSANDRA-16815)
+ * Modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built (CASSANDRA-16776)
+ * Batch the token metadata update to improve the speed (CASSANDRA-15291)
+ * Reduce the log level on "expected" repair exceptions (CASSANDRA-16775)
+ * Make JMXTimer expose attributes using consistent time unit (CASSANDRA-16760)
+ * Remove check on gossip status from DynamicEndpointSnitch::updateScores (CASSANDRA-11671)
+ * Fix AbstractReadQuery::toCQLString not returning valid CQL (CASSANDRA-16510)
+ * Log when compacting many tombstones (CASSANDRA-16780)
+ * Display bytes per level in tablestats for LCS tables (CASSANDRA-16799)
+ * Add isolated flush timer to CommitLogMetrics and ensure writes correspond to single WaitingOnCommit data points (CASSANDRA-16701)
+ * Add a system property to set hostId if not yet initialized (CASSANDRA-14582)
+ * GossiperTest.testHasVersion3Nodes didn't take into account trunk version changes, fixed to rely on latest version (CASSANDRA-16651)
+Merged from 4.0:
+ * Remove assumption that all urgent messages are small (CASSANDRA-16877)
 * ArrayClustering.unsharedHeapSize does not include the data so undercounts the heap size (CASSANDRA-16845)
 * Improve help, doc and error messages about sstabledump -k and -x arguments (CASSANDRA-16818)
 * Add repaired/unrepaired bytes back to nodetool (CASSANDRA-15282)

- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch cassandra-4.0 updated: Remove assumption that all urgent messages are small
This is an automated email from the ASF dual-hosted git repository.

samt pushed a commit to branch cassandra-4.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/cassandra-4.0 by this push:
     new b824273  Remove assumption that all urgent messages are small
b824273 is described below

commit b8242730918c2e8edec83aeafeeae8255378125d
Author: Sam Tunnicliffe
AuthorDate: Thu Aug 12 10:47:54 2021 +0100

    Remove assumption that all urgent messages are small

    Patch by Sam Tunnicliffe; reviewed by Caleb Rackliffe for CASSANDRA-16877
---
 CHANGES.txt                                        |  1 +
 .../apache/cassandra/net/OutboundConnections.java  | 25 +++--
 .../cassandra/net/OutboundConnectionsTest.java     | 60 +++---
 3 files changed, 62 insertions(+), 24 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index ecd4409..9ed3cec 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0.1
+ * Remove assumption that all urgent messages are small (CASSANDRA-16877)
 * ArrayClustering.unsharedHeapSize does not include the data so undercounts the heap size (CASSANDRA-16845)
 * Improve help, doc and error messages about sstabledump -k and -x arguments (CASSANDRA-16818)
 * Add repaired/unrepaired bytes back to nodetool (CASSANDRA-15282)
diff --git a/src/java/org/apache/cassandra/net/OutboundConnections.java b/src/java/org/apache/cassandra/net/OutboundConnections.java
index f1e1276..3f607d1 100644
--- a/src/java/org/apache/cassandra/net/OutboundConnections.java
+++ b/src/java/org/apache/cassandra/net/OutboundConnections.java
@@ -36,6 +36,7 @@ import org.apache.cassandra.config.Config;
 import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.locator.InetAddressAndPort;
 import org.apache.cassandra.metrics.InternodeOutboundMetrics;
+import org.apache.cassandra.utils.NoSpamLogger;
 import org.apache.cassandra.utils.concurrent.SimpleCondition;

 import static org.apache.cassandra.net.MessagingService.current_version;
@@ -199,12 +200,26 @@ public class OutboundConnections
         if (specifyConnection != null)
             return specifyConnection;

-        if (msg.verb().priority == Verb.Priority.P0)
-            return URGENT_MESSAGES;
+        if (msg.serializedSize(current_version) > LARGE_MESSAGE_THRESHOLD)
+        {
+            if (msg.verb().priority == Verb.Priority.P0)
+            {
+                NoSpamLogger.log(logger, NoSpamLogger.Level.WARN, 1, TimeUnit.MINUTES,
+                                 "Enqueued URGENT message which exceeds large message threshold");
+
+                if (logger.isTraceEnabled())
+                    logger.trace("{} message with size {} exceeded large message threshold {}",
+                                 msg.verb(),
+                                 msg.serializedSize(current_version),
+                                 LARGE_MESSAGE_THRESHOLD);
+            }
+
+            return LARGE_MESSAGES;
+        }

-        return msg.serializedSize(current_version) <= LARGE_MESSAGE_THRESHOLD
-             ? SMALL_MESSAGES
-             : LARGE_MESSAGES;
+        return msg.verb().priority == Verb.Priority.P0
+             ? URGENT_MESSAGES
+             : SMALL_MESSAGES;
     }

     @VisibleForTesting
diff --git a/test/unit/org/apache/cassandra/net/OutboundConnectionsTest.java b/test/unit/org/apache/cassandra/net/OutboundConnectionsTest.java
index 32faea3..538636a 100644
--- a/test/unit/org/apache/cassandra/net/OutboundConnectionsTest.java
+++ b/test/unit/org/apache/cassandra/net/OutboundConnectionsTest.java
@@ -35,11 +35,15 @@ import org.junit.Test;

 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.commitlog.CommitLog;
 import org.apache.cassandra.gms.GossipDigestSyn;
+import org.apache.cassandra.io.IVersionedAsymmetricSerializer;
 import org.apache.cassandra.io.IVersionedSerializer;
 import org.apache.cassandra.io.util.DataInputPlus;
 import org.apache.cassandra.io.util.DataOutputPlus;
 import org.apache.cassandra.locator.InetAddressAndPort;

+import static org.apache.cassandra.net.MessagingService.current_version;
+import static org.apache.cassandra.net.OutboundConnections.LARGE_MESSAGE_THRESHOLD;
+
 public class OutboundConnectionsTest
 {
     static final InetAddressAndPort LOCAL_ADDR = InetAddressAndPort.getByAddressOverrideDefaults(InetAddresses.forString("127.0.0.1"), 9476);
@@ -48,6 +52,24 @@ public class OutboundConnectionsTest
     private static final List INTERNODE_MESSAGING_CONN_TYPES = ImmutableList.of(ConnectionType.URGENT_MESSAGES, ConnectionType.LARGE_MESSAGES, ConnectionType.SMALL_MESSAGES);

     private OutboundConnections connections;

+    // for testing messages larger than the size threshold, we just need a serializer to report a size, as fake as it may be
+    public static final IVersionedSerializer SERIALIZER = new IVersionedSerializer()
+    {
+        public
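The essence of the patch above is that message size now takes precedence over verb priority when picking the outbound connection. A condensed, self-contained sketch of that selection rule (with names and the threshold value simplified from the real `OutboundConnections` code) might look like:

```java
// Sketch of the CASSANDRA-16877 routing rule: an urgent (P0) message that
// exceeds the large-message threshold goes to the large-messages connection
// instead of being enqueued (and dropped) on the urgent one.
public class ConnectionSelection
{
    enum Type { URGENT_MESSAGES, SMALL_MESSAGES, LARGE_MESSAGES }

    // Illustrative value only; the real threshold is defined elsewhere.
    static final long LARGE_MESSAGE_THRESHOLD = 1L << 16;

    static Type connectionFor(long serializedSize, boolean urgent)
    {
        // Size wins over priority: an oversized message cannot fit in a
        // single frame on the urgent connection, so it must go large.
        if (serializedSize > LARGE_MESSAGE_THRESHOLD)
            return Type.LARGE_MESSAGES;

        return urgent ? Type.URGENT_MESSAGES : Type.SMALL_MESSAGES;
    }
}
```

Before the fix, the priority check ran first, so a huge digest ack (the shadow-round case from the ticket) was routed urgent and dropped; checking size first is the whole behavioral change.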
[jira] [Commented] (CASSANDRA-16871) Add resource flags to CircleCi config generation script
[ https://issues.apache.org/jira/browse/CASSANDRA-16871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404578#comment-17404578 ] Andres de la Peña commented on CASSANDRA-16871: --- Thanks for looking into this :) bq. I would make a point that config.yml is actually the lowres as I saw people being confused about that. Good idea. I have changed the script messages and the readme trying to make it clear that the default {{config.yml}} uses low resources and that it is indeed a copy of {{config.yml.LOWRES}}. I have included a very brief [introductory section|https://github.com/adelapena/cassandra/blob/16871-3.0/.circleci/readme.md#circleci-config-files] in the readme in an attempt to give some context. bq. One thing, if you are using higher resources and later in time you decide also to change any of the environment variables, we should make it clear that running only .circleci/generate.sh will return people to default low resources with the new variables. If they want to keep the new resources they should use a flag again. Someone new might get confused. Makes sense, I have added an explicit warning about this [here|https://github.com/apache/cassandra/commit/4852b080a773493cb2e6d0bcb53931a703a2ebfa]. > Add resource flags to CircleCi config generation script > --- > > Key: CASSANDRA-16871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16871 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Low > > Currently we have three versions of the CircleCI config file using different > resources. 
Changing the resources configuration is as easy as copying the > desired template file, for example: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > {code} > If we want to make changes to the file, for example to set a specific dtest > repo or running the test multiplexer, we can run the provided generation > script, copy the template file and probably exclude the additional changes: > {code} > # edit config-2_1.yml > .circleci/generate.sh > cp .circleci/config.yml.MIDRES .circleci/config.yml > # undo the changes in config.yml.LOWRES, config.yml.MIDRES and > config.yml.HIGHRES > {code} > A very common alternative to this is just editing the environment variables > in the automatically generated {{config.yml}} file, which are repeated some > 19 times across the file: > {code} > cp .circleci/config.yml.MIDRES .circleci/config.yml > # edit config.yml, where env vars are repeated > {code} > I think we could do this slightly easier by adding a set of flags to the > generation script to apply the resources patch directly to {{config.yml}}, > without changing the templates: > {code} > # edit config-2_1.yml > .circleci/generate.sh -m > {code} > This has the advantage of not requiring manually editing the automatically > generated file and also providing some validation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16718) Changing listen_address with prefer_local may lead to issues
[ https://issues.apache.org/jira/browse/CASSANDRA-16718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404538#comment-17404538 ] Brandon Williams commented on CASSANDRA-16718: -- Can you describe the network configuration here that requires prefer_local? > Changing listen_address with prefer_local may lead to issues > > > Key: CASSANDRA-16718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16718 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Jan Karlsson >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x > > > Many container-based solutions function by assigning new listen_addresses when > nodes are stopped. Changing the listen_address is usually as simple as > turning off the node and changing the yaml file. > However, if prefer_local is enabled, I observed that nodes were unable to > join the cluster and failed with 'Unable to gossip with any seeds'. > Trace shows that the changing node will try to communicate with the existing > node but the response is never received. I assume it is because the existing > node attempts to communicate with the local address during the shadow round. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
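For context on the setting being discussed: {{prefer_local}} is normally enabled in {{cassandra-rackdc.properties}} when using GossipingPropertyFileSnitch, so that nodes in the same datacenter gossip over their private/listen addresses. A minimal sketch of such a file (dc/rack values are placeholders):

```properties
# cassandra-rackdc.properties (GossipingPropertyFileSnitch)
dc=dc1
rack=rack1
# Route intra-datacenter traffic over the node's private (listen) address
# rather than the broadcast address -- the setting implicated in this bug.
prefer_local=true
```

With this enabled, a peer that caches the old private address of a restarted container keeps trying it during the shadow round, which matches the reporter's observation.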
[jira] [Comment Edited] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
[ https://issues.apache.org/jira/browse/CASSANDRA-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404433#comment-17404433 ] Andres de la Peña edited comment on CASSANDRA-16882 at 8/25/21, 2:46 PM: - I'm adding a fourth option that combines approaches 2 and 3, so the mandatory tests can be started either individually or all together with a single start button: ||Option||Branch||CI|| |1|[16882-option-1-trunk|https://github.com/adelapena/cassandra/tree/16882-option-1-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/9cb8ca7b-ab57-431e-a22b-643d61c92c29] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/3e26fd7e-5c5a-4ec3-8af9-4c247d96556a]| |2|[16882-option-2-trunk|https://github.com/adelapena/cassandra/tree/16882-option-2-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a859cfbc-fdf8-4468-beb9-b2ee17dc1ae3] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a4a86879-e283-4aa9-8121-c51fa79095e6]| |3|[16882-option-3-trunk|https://github.com/adelapena/cassandra/tree/16882-option-3-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/802/workflows/0372f5d6-d1f0-4f0e-91a3-aa75a2712bae] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/802/workflows/3a53f1d3-e43a-4aaa-b163-601b57ca28ac]| |4|[16882-option-4-trunk|https://github.com/adelapena/cassandra/tree/16882-option-4-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/803/workflows/08ae07d5-6a1e-4e5b-bc0c-32bdc9b9f190] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/803/workflows/51f1b801-afdd-45da-93e7-4f8e24067640]| This gives us the flexibility of the second approach with the click savings of the third approach. However, the downside is that it is done by duplicating the jobs, because CircleCI doesn't allow disjunctions in job dependencies. 
That leaves us with a more complex graph, and I'm afraid that could be more confusing than just writing in the doc what tests are mandatory. was (Author: adelapena): I'm adding a fourth option that combines approaches 2 and 3, so the mandatory tests can be started either individually or all together with a single start button: ||Option||Branch||CI|| |1|[16882-option-1-trunk|https://github.com/adelapena/cassandra/tree/16882-option-1-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/9cb8ca7b-ab57-431e-a22b-643d61c92c29] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/3e26fd7e-5c5a-4ec3-8af9-4c247d96556a]| |2|[16882-option-2-trunk|https://github.com/adelapena/cassandra/tree/16882-option-2-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a859cfbc-fdf8-4468-beb9-b2ee17dc1ae3] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a4a86879-e283-4aa9-8121-c51fa79095e6]| |3|[16882-option-3-trunk|https://github.com/adelapena/cassandra/tree/16882-option-3-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/91f90e3a-e032-4d57-ba60-45d925c07c99] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/265a64f2-70b6-4a88-8045-89bdf50e5d8d]| |4|[16882-option-4-trunk|https://github.com/adelapena/cassandra/tree/16882-option-4-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/801/workflows/3b044fbb-0fda-4b30-9544-cdc259f8f09b] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/801/workflows/4c205d19-22ea-4ae8-8618-09c9ec7dcbe9]| This gives us the flexibility of the second approach with the click savings of the third approach. However, the downside is that is done by duplicating the jobs, because CircleCI doesn't allow disjunctions in job dependencies. 
That leaves us with a more complex graph, and I'm afraid that could be more confusing than just writing in the doc what tests are mandatory. > Save CircleCI resources with optional test jobs > --- > > Key: CASSANDRA-16882 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16882 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > > This ticket implements the addition of approval steps in the CircleCI > workflows as it was proposed in [this > email|https://lists.apache.org/thread.html/r57bab800d037c087af01b3779fd266d83b538cdd29c120f74a5dbe63%40%3Cdev.cassandra.apache.org%3E] > sent to the dev list: > The current CircleCI configuration automatically runs the unit tests, JVM > dtests and cqhshlib tests. This is done by default for every commit or, with > some
[jira] [Commented] (CASSANDRA-16789) Add TTL support to nodetool snapshots
[ https://issues.apache.org/jira/browse/CASSANDRA-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404482#comment-17404482 ] Stefan Miklosovic commented on CASSANDRA-16789: --- +1 > Add TTL support to nodetool snapshots > - > > Key: CASSANDRA-16789 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16789 > Project: Cassandra > Issue Type: Sub-task > Components: Tool/nodetool >Reporter: Paulo Motta >Assignee: Abuli Palagashvili >Priority: Normal > Fix For: 4.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Add new parameter {{--ttl}} to {{nodetool snapshot}} command. This parameter > can be specified in human readable duration (ie. 30mins, 1h, 300d) and should > not be lower than 1 minute. > The expiration date should be added to the snapshot manifest in ISO format. > A periodic thread should efficiently scan snapshots and automatically clear > those past expiration date. The periodicity of the scan thread should be 1 > minute by default but be overridable via a system property. > The command {{nodetool listsnapshots}} should display the expiration date > when the snapshot contains a TTL. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
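The ticket imposes two constraints on the new {{--ttl}} parameter: a human-readable duration form (e.g. 30m, 1h, 300d) and a minimum of one minute. A rough sketch of such parsing and validation, with the accepted suffix set an assumption of this note rather than the exact set the patch implements:

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of snapshot TTL validation: parse a human-readable
// duration and reject anything under the one-minute floor from the ticket.
public class SnapshotTtl
{
    // Assumed suffixes: m/mins (minutes), h (hours), d (days).
    private static final Pattern FORMAT = Pattern.compile("(\\d+)(mins|m|h|d)");

    static Duration parse(String ttl)
    {
        Matcher m = FORMAT.matcher(ttl.trim().toLowerCase());
        if (!m.matches())
            throw new IllegalArgumentException("Unparseable TTL: " + ttl);

        long n = Long.parseLong(m.group(1));
        Duration d = switch (m.group(2))
        {
            case "m", "mins" -> Duration.ofMinutes(n);
            case "h"         -> Duration.ofHours(n);
            default          -> Duration.ofDays(n);
        };

        if (d.compareTo(Duration.ofMinutes(1)) < 0)
            throw new IllegalArgumentException("TTL must be at least 1 minute");
        return d;
    }
}
```

The parsed expiration would then be written into the snapshot manifest (the ticket asks for ISO format) so the periodic cleanup thread can compare it against the current time.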
[jira] [Updated] (CASSANDRA-16842) Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
[ https://issues.apache.org/jira/browse/CASSANDRA-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-16842: -- Status: Ready to Commit (was: Review In Progress) > Allow CommitLogSegmentReader to optionally skip sync marker CRC checks > -- > > Key: CASSANDRA-16842 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16842 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > CommitLog sync markers are written in two phases. In the first, zeroes are > written for the position of the next sync marker and the sync marker CRC > value. In the second, when the next sync marker is written, the actual > position and CRC values are written. If the process shuts down in a > disorderly fashion, it is entirely possible for a valid next marker position > to be written to our memory mapped file but not the final CRC value. Later, > when we attempt to replay the segment, we will fail without recovering any of > the perfectly valid mutations it contains. (This assumes we’re confining > ourselves to the case where there is no compression or encryption.) > {noformat} > ERROR 2020-11-18T10:55:23,888 [main] > org.apache.cassandra.utils.JVMStabilityInspector:102 - Exiting due to error > while processing commit log during initialization. > org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: > Encountered bad header at position 23091775 of commit log > …/CommitLog-6-1605699607608.log, with invalid CRC. The end of segment marker > should be zero. 
> at > org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:731) > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.readSyncMarker(CommitLogReplayer.java:274) > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:436) > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:189) > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:170 > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:151) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:332) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:656) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:808) > {noformat} > It may be useful to provide an option that would allow us to override the > default/strict behavior here and skip the CRC check if a non-zero end > position is present, allowing valid mutations to be recovered and startup to > proceed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
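The torn-write scenario above can be sketched as a marker check with an optional lenient mode: if the stored CRC does not match but the next-marker position is non-zero, the override would accept the marker and let replay continue. This is a simplified illustration, not Cassandra's actual CommitLogSegmentReader; names and the CRC coverage are assumptions.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative sketch of the proposed option: a sync marker stores the file
// position of the next marker plus a CRC. With strict checking, a non-zero
// position whose CRC is still zeroed (torn write at shutdown) aborts replay;
// a "tolerate" flag lets replay of the valid mutations continue.
public final class SyncMarkerCheck {
    // In this sketch the CRC covers the segment id and the marker position.
    static int computeCrc(long segmentId, int nextPosition) {
        CRC32 crc = new CRC32();
        ByteBuffer scratch = ByteBuffer.allocate(12);
        scratch.putLong(segmentId).putInt(nextPosition);
        scratch.flip();
        crc.update(scratch);
        return (int) crc.getValue();
    }

    /** Returns the next-marker position, or -1 if the marker must be rejected. */
    static int readSyncMarker(ByteBuffer buf, int markerOffset, long segmentId,
                              boolean toleratePossiblyTornCrc) {
        int nextPosition = buf.getInt(markerOffset);
        int storedCrc = buf.getInt(markerOffset + 4);
        if (storedCrc == computeCrc(segmentId, nextPosition))
            return nextPosition;                 // clean marker
        if (nextPosition != 0 && toleratePossiblyTornCrc)
            return nextPosition;                 // torn CRC, but position looks valid
        return -1;                               // strict mode: reject
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(0, 23091775);                               // next position written...
        buf.putInt(4, computeCrc(42L, 23091775) + 1);          // ...but CRC is wrong
        System.out.println(readSyncMarker(buf, 0, 42L, false)); // -1: strict rejects
        System.out.println(readSyncMarker(buf, 0, 42L, true));  // 23091775: tolerated
    }
}
```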
[jira] [Commented] (CASSANDRA-16842) Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
[ https://issues.apache.org/jira/browse/CASSANDRA-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404471#comment-17404471 ] Josh McKenzie commented on CASSANDRA-16842: --- A few formatting nits but otherwise +1
[jira] [Updated] (CASSANDRA-16842) Allow CommitLogSegmentReader to optionally skip sync marker CRC checks
[ https://issues.apache.org/jira/browse/CASSANDRA-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh McKenzie updated CASSANDRA-16842: -- Status: Review In Progress (was: Patch Available)
[jira] [Updated] (CASSANDRA-16850) Add client warnings and abort to tombstone and coordinator reads which go past a low/high watermark
[ https://issues.apache.org/jira/browse/CASSANDRA-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-16850: Reviewers: Blake Eggleston, Marcus Eriksson (was: Blake Eggleston) > Add client warnings and abort to tombstone and coordinator reads which go > past a low/high watermark > --- > > Key: CASSANDRA-16850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16850 > Project: Cassandra > Issue Type: Improvement > Components: Observability/Logging >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.1 > > Time Spent: 40m > Remaining Estimate: 0h > > We currently will abort queries if we hit too many tombstones, but its common > that we would want to also warn clients (client warnings) about this before > we get that point; its also common that different logic would like to be able > to warn/abort about client options (such as reading a large partition). To > allow this we should add a concept of low/high watermarks (warn/abort) to > tombstones and coordinator reads. > Another issue is that current aborts look the same as a random failure, so > from an SLA point of view it would be good to differentiate between user > behavior being rejected and unexplained issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
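The low/high watermark idea above can be sketched as a per-query tracker: crossing the low watermark emits a client warning once, crossing the high watermark aborts with an exception type distinct from a generic failure, so SLA dashboards can tell rejected user behavior apart from unexplained errors. All names here are hypothetical, not Cassandra's.

```java
import java.util.Optional;

// Illustrative sketch of warn/abort watermarks for tombstone reads.
public final class TombstoneGuard {
    /** Distinct from a generic failure so rejections are distinguishable. */
    static final class QueryRejectedException extends RuntimeException {
        QueryRejectedException(String msg) { super(msg); }
    }

    private final long warnThreshold;
    private final long failThreshold;
    private long tombstones;
    private boolean warned;

    TombstoneGuard(long warnThreshold, long failThreshold) {
        this.warnThreshold = warnThreshold;
        this.failThreshold = failThreshold;
    }

    /** Count one tombstone; returns a warning the first time the low watermark is crossed. */
    Optional<String> onTombstone() {
        tombstones++;
        if (tombstones > failThreshold)
            throw new QueryRejectedException("Query aborted: read " + tombstones + " tombstones");
        if (tombstones > warnThreshold && !warned) {
            warned = true;
            return Optional.of("Read " + tombstones + " tombstones (warn threshold " + warnThreshold + ")");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TombstoneGuard guard = new TombstoneGuard(2, 4);
        for (int i = 0; i < 4; i++)
            guard.onTombstone().ifPresent(w -> System.out.println("client warning: " + w));
        try {
            guard.onTombstone(); // 5th tombstone exceeds the fail threshold of 4
        } catch (QueryRejectedException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

The same shape would apply to coordinator-side read sizes (e.g. large partitions), with only the counted quantity changing.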
[jira] [Updated] (CASSANDRA-14309) Make hint window persistent across restarts
[ https://issues.apache.org/jira/browse/CASSANDRA-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-14309: -- Fix Version/s: (was: 4.x) 4.1 > Make hint window persistent across restarts > --- > > Key: CASSANDRA-14309 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14309 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Kurt Greaves >Assignee: Stefan Miklosovic >Priority: Low > Fix For: 4.1 > > Time Spent: 10m > Remaining Estimate: 0h > > The current hint system stores a window of hints as defined by > {{max_hint_window_in_ms}}, however this window is not persistent across > restarts. > Examples (cluster with RF=3 and 3 nodes, A, B, and C): > # A goes down > # X ms of hints are stored for A on B and C > # A is restarted > # A goes down again without hints replaying from B and C > # B and C will store up to another {{max_hint_window_in_ms}} of hints for A > > # A goes down > # X ms of hints are stored for A on B and C > # B is restarted > # B will store up to another {{max_hint_window_in_ms}} of hints for A > > Note that in both these scenarios they can continue forever. If A or B keeps > getting restarted hints will continue to pile up. > > Idea of this ticket is to stop this behaviour from happening and only ever > store up to {{max_hint_window_in_ms}} of hints for a particular node. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
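The fix described above amounts to measuring the hint window from the oldest undelivered hint rather than from the most recent time the node went down, so restarts cannot reset the window. A minimal sketch, with illustrative names that are not Cassandra's hint service API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: cap hint accumulation per endpoint at max_hint_window measured
// from the earliest hint still stored, so bounces of A or B never let more
// than one window's worth of hints pile up.
public final class HintWindowTracker {
    private final long maxHintWindowMillis;
    // endpoint -> timestamp of the earliest hint still stored for it
    private final Map<String, Long> earliestHint = new HashMap<>();

    HintWindowTracker(long maxHintWindowMillis) {
        this.maxHintWindowMillis = maxHintWindowMillis;
    }

    /** True if a new hint for this endpoint may be stored at time 'now'. */
    boolean canHint(String endpoint, long now) {
        Long earliest = earliestHint.get(endpoint);
        if (earliest == null) {
            earliestHint.put(endpoint, now); // first hint opens the window
            return true;
        }
        return now - earliest <= maxHintWindowMillis;
    }

    /** Called once all hints for the endpoint have been delivered. */
    void hintsDelivered(String endpoint) {
        earliestHint.remove(endpoint);
    }

    public static void main(String[] args) {
        HintWindowTracker tracker = new HintWindowTracker(3 * 3_600_000L); // 3h window
        System.out.println(tracker.canHint("A", 0L));            // true: window opens
        // A restarts at +2h without hint replay; the window does NOT reset:
        System.out.println(tracker.canHint("A", 2 * 3_600_000L)); // true: still inside
        System.out.println(tracker.canHint("A", 4 * 3_600_000L)); // false: window elapsed
        tracker.hintsDelivered("A");
        System.out.println(tracker.canHint("A", 4 * 3_600_000L)); // true: fresh window
    }
}
```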
[jira] [Updated] (CASSANDRA-14309) Make hint window persistent across restarts
[ https://issues.apache.org/jira/browse/CASSANDRA-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-14309: -- Status: Ready to Commit (was: Review In Progress)
[jira] [Comment Edited] (CASSANDRA-14309) Make hint window persistent across restarts
[ https://issues.apache.org/jira/browse/CASSANDRA-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404438#comment-17404438 ] Stefan Miklosovic edited comment on CASSANDRA-14309 at 8/25/21, 1:18 PM: - I made this feature enabled by default and I updated NEWS. Branches are the same. I am running the build here https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1062/ I plan to merge on Friday 27th in the afternoon CEST if anybody wants to take a look before. was (Author: stefan.miklosovic): I made this feature enabled by default and I updated NEWS. Branches are same. I am running the build here https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1062/ If plan to merge on Friday 27th in the afternoon CEST if anybody wants to take a look before.
[jira] [Commented] (CASSANDRA-14309) Make hint window persistent across restarts
[ https://issues.apache.org/jira/browse/CASSANDRA-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404438#comment-17404438 ] Stefan Miklosovic commented on CASSANDRA-14309: --- I made this feature enabled by default and I updated NEWS. Branches are the same. I am running the build here https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1062/ I plan to merge on Friday 27th in the afternoon CEST if anybody wants to take a look before.
[jira] [Commented] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
[ https://issues.apache.org/jira/browse/CASSANDRA-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404433#comment-17404433 ] Andres de la Peña commented on CASSANDRA-16882: --- I'm adding a fourth option that combines approaches 2 and 3, so the mandatory tests can be started either individually or all together with a single start button: ||Option||Branch||CI|| |1|[16882-option-1-trunk|https://github.com/adelapena/cassandra/tree/16882-option-1-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/9cb8ca7b-ab57-431e-a22b-643d61c92c29] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/3e26fd7e-5c5a-4ec3-8af9-4c247d96556a]| |2|[16882-option-2-trunk|https://github.com/adelapena/cassandra/tree/16882-option-2-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a859cfbc-fdf8-4468-beb9-b2ee17dc1ae3] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a4a86879-e283-4aa9-8121-c51fa79095e6]| |3|[16882-option-3-trunk|https://github.com/adelapena/cassandra/tree/16882-option-3-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/91f90e3a-e032-4d57-ba60-45d925c07c99] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/265a64f2-70b6-4a88-8045-89bdf50e5d8d]| |4|[16882-option-4-trunk|https://github.com/adelapena/cassandra/tree/16882-option-4-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/801/workflows/3b044fbb-0fda-4b30-9544-cdc259f8f09b] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/801/workflows/4c205d19-22ea-4ae8-8618-09c9ec7dcbe9]| This gives us the flexibility of the second approach with the click savings of the third approach. However, the downside is that it is done by duplicating the jobs, because CircleCI doesn't allow disjunctions in job dependencies. 
That leaves us with a more complex graph, and I'm afraid that could be more confusing than just writing in the doc what tests are mandatory.
[jira] [Commented] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
[ https://issues.apache.org/jira/browse/CASSANDRA-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404404#comment-17404404 ] Andres de la Peña commented on CASSANDRA-16882: --- CC [~edimitrova]
[jira] [Updated] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
[ https://issues.apache.org/jira/browse/CASSANDRA-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-16882: -- Change Category: Quality Assurance Complexity: Low Hanging Fruit Status: Open (was: Triage Needed)
[jira] [Commented] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
[ https://issues.apache.org/jira/browse/CASSANDRA-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404399#comment-17404399 ] Andres de la Peña commented on CASSANDRA-16882: --- Here are drafts of what each approach would look like for trunk: ||Option||Branch||CI|| |1|[16882-option-1-trunk|https://github.com/adelapena/cassandra/tree/16882-option-1-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/9cb8ca7b-ab57-431e-a22b-643d61c92c29] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/800/workflows/3e26fd7e-5c5a-4ec3-8af9-4c247d96556a]| |2|[16882-option-2-trunk|https://github.com/adelapena/cassandra/tree/16882-option-2-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a859cfbc-fdf8-4468-beb9-b2ee17dc1ae3] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/798/workflows/a4a86879-e283-4aa9-8121-c51fa79095e6]| |3|[16882-option-3-trunk|https://github.com/adelapena/cassandra/tree/16882-option-3-trunk]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/91f90e3a-e032-4d57-ba60-45d925c07c99] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/799/workflows/265a64f2-70b6-4a88-8045-89bdf50e5d8d]|
[jira] [Created] (CASSANDRA-16882) Save CircleCI resources with optional test jobs
Andres de la Peña created CASSANDRA-16882: - Summary: Save CircleCI resources with optional test jobs Key: CASSANDRA-16882 URL: https://issues.apache.org/jira/browse/CASSANDRA-16882 Project: Cassandra Issue Type: Task Components: CI Reporter: Andres de la Peña Assignee: Andres de la Peña This ticket implements the addition of approval steps in the CircleCI workflows as it was proposed in [this email|https://lists.apache.org/thread.html/r57bab800d037c087af01b3779fd266d83b538cdd29c120f74a5dbe63%40%3Cdev.cassandra.apache.org%3E] sent to the dev list: The current CircleCI configuration automatically runs the unit tests, JVM dtests and cqlshlib tests. This is done by default for every commit or, with some configuration, for every push. Along the lifecycle of a ticket it is quite frequent to have multiple commits and pushes, all running these test jobs. I'd say that frequently it is not necessary to run the tests for some of those intermediate commits and pushes. For example, one can show proofs of concept, or have multiple rounds of review before actually running the tests. Running the tests for every change can produce an unnecessary expense of CircleCI resources. I think we could make running those tests optional, as well as clearly specifying in the documentation which test runs are mandatory before actually committing. We could do this in different ways: # Make the entire CircleCI workflow optional, so the build job requires manual approval. Once the build is approved the mandatory test jobs would be run without any further approval, exactly as it's currently done. # Make all the test jobs optional, so every test job requires manual approval, and the documentation specifies which tests are mandatory in the final steps of a ticket. # Make all the mandatory test jobs depend on a single optional job, so we have a single button to optionally run all the mandatory tests. 
I think any of these changes, or a combination of them, would significantly reduce the usage of resources without making things less tested. The only downside I can think of is that we would need some additional clicks on the CircleCI GUI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404384#comment-17404384 ] Benedict Elliott Smith commented on CASSANDRA-16873: Either way, this exchange highlights an issue to address which is that an absolutist policy leads to gamification of terminology. Clearly we shouldn't be calling something a bug fix _so it can be included in a release_. I agree it would be a good idea to discuss this more on list. Eventually we'll zero in on a policy we can all agree to adopt as well as agree what it means. It's probably sensible to preferentially refactor the _existing_ wiki docs we have on this topic though, and to vote again to modify them. Otherwise we're just going to have a classic "now you have two problems" situation. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip > state never transitioning. > Rather than implicitly requiring operators to bounce the node by throwing an > exception, we should instead suppress the exception when checking if a node > is replacing the same host address and ID if we get an UnknownHostException. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16873) Tolerate missing DNS entry when completing a host replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-16873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404378#comment-17404378 ] Josh McKenzie commented on CASSANDRA-16873: --- bq. if we're calling this an improvement it needs to go into 4.1 as 4.0 is bugfix only. [~mck] is working on a wiki doc about this very issue, and there's some confusion on the topic. Probably worth taking to the ML, but my understanding is roughly as follows: # 4.0->5.0 == protocol/deprecation/API break # 4.0->4.1 == new features and disruptive changes # 4.0.0-4.0.1 == bug fixes and small improvements that are optional, don't change default behavior, are additive, and should not impact existing clusters fwiw, the above matches Mick's draft wiki article but _doesn't_ match Caleb's understanding (he and I were discussing this offline yesterday). TL;DR: Probably should hit the ML. I'm happy for this to go wherever, but we should all get aligned and document this. > Tolerate missing DNS entry when completing a host replacement > - > > Key: CASSANDRA-16873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16873 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Membership >Reporter: Josh McKenzie >Assignee: Josh McKenzie >Priority: Normal > Fix For: 4.0.x > > > In one of our deployments, after a host replacement a subset of nodes still > saw the nodes as JOINING despite the rest of the cluster seeing it as NORMAL > with a failure to gossip. This was traced to a DNS lookup failure on the > nodes during an interim state leading to an exception being thrown and gossip > state never transitioning. > Rather than implicitly requiring operators to bounce the node by throwing an > exception, we should instead suppress the exception when checking if a node > is replacing the same host address and ID if we get an UnknownHostException. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
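The change proposed in CASSANDRA-16873 (suppress the UnknownHostException raised by a DNS lookup when checking whether a joining node is replacing the same host, instead of letting it abort the gossip state transition) could be sketched as follows. This is a hypothetical, self-contained illustration; `ReplacementCheck` and `isSameHost` are invented names, not the actual Cassandra patch.

```java
// Hypothetical sketch: a DNS lookup failure during a host-replacement check
// is treated as "not the same host" rather than thrown, so gossip state can
// still transition and operators are not forced to bounce the node.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReplacementCheck
{
    // Returns true only when the candidate hostname resolves to the address
    // being replaced; a missing DNS entry (UnknownHostException) is
    // suppressed and reported as a non-match.
    public static boolean isSameHost(String candidateHostname, InetAddress replacing)
    {
        try
        {
            return InetAddress.getByName(candidateHostname).equals(replacing);
        }
        catch (UnknownHostException e)
        {
            return false; // tolerate the missing DNS entry instead of failing
        }
    }

    public static void main(String[] args) throws UnknownHostException
    {
        // A numeric literal resolves without DNS and matches itself.
        System.out.println(isSameHost("127.0.0.1", InetAddress.getByName("127.0.0.1"))); // prints "true"
    }
}
```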
[jira] [Commented] (CASSANDRA-14564) Adding regular column to COMPACT tables without clustering columns should trigger an InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404335#comment-17404335 ] Stefan Miklosovic commented on CASSANDRA-14564: --- I am not completely sure about the test results; I'll try to debug it further. I am not sure whether it is just flaky or whether I introduced a regression. > Adding regular column to COMPACT tables without clustering columns should > trigger an InvalidRequestException > - > > Key: CASSANDRA-14564 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14564 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core, Legacy/CQL >Reporter: Laxmikant Upadhyay >Assignee: Stefan Miklosovic >Priority: Normal > Labels: lhf > Fix For: 3.0.x, 3.11.x, 4.0.x > > > I have upgraded my system from cassandra 2.1.16 to 3.11.2. We had some tables > with COMPACT STORAGE enabled. We see some weird behaviour of cassandra > while adding a column into it. > Cassandra does not give any error while altering, however the added column is > invisible. > Same behaviour when we create a new table with compact storage and try to > alter it. 
Below is the commands ran in sequence: > > {code:java} > x@cqlsh:xuser> CREATE TABLE xuser.employee(emp_id int PRIMARY KEY,emp_name > text, emp_city text, emp_sal varint, emp_phone varint ) WITH COMPACT STORAGE; > x@cqlsh:xuser> desc table xuser.employee ; > CREATE TABLE xuser.employee ( > emp_id int PRIMARY KEY, > emp_city text, > emp_name text, > emp_phone varint, > emp_sal varint > ) WITH COMPACT STORAGE > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE';{code} > Now altering the table by adding a new column: > > {code:java} > x@cqlsh:xuser> alter table employee add profile text; > x@cqlsh:xuser> desc table xuser.employee ; > CREATE TABLE xuser.employee ( > emp_id int PRIMARY KEY, > emp_city text, > emp_name text, > emp_phone varint, > emp_sal varint > ) WITH COMPACT STORAGE > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND 
min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > notice that above desc table result does not have newly added column profile. > However when i try to add it again it gives column already exist; > {code:java} > x@cqlsh:xuser> alter table employee add profile text; > InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid > column name profile because it conflicts with an existing column" > x@cqlsh:xuser> select emp_name,profile from employee; > emp_name | profile > --+- > (0 rows) > x@cqlsh:xuser> > {code} > Inserting also behaves strange: > {code:java} > x@cqlsh:xuser> INSERT INTO employee (emp_id , emp_city , emp_name , emp_phone > , emp_sal ,profile) VALUES ( 1, 'ggn', 'john', 123456, 5, 'SE'); > InvalidRequest: Error from server: code=2200 [Invalid query] message="Some > clustering keys are missing: column1" > x@cqlsh:xuser> INSERT INTO employee (emp_id , emp_city , emp_name , emp_phone > , emp_sal ,profile,column1) VALUES ( 1, 'ggn', 'john', 123456, 5, > 'SE',null); > x@cqlsh:xuser> select * from employee; > emp_id | emp_city | emp_name | emp_phone | emp_sal > +--+--+---+- > (0 rows) > {code} > *How to solve that ticket* > ([~blerer])-- > > Adding regular columns to non-dense compact tables should be forbidden as it > is the
[jira] [Updated] (CASSANDRA-14564) Adding regular column to COMPACT tables without clustering columns should trigger an InvalidRequestException
[ https://issues.apache.org/jira/browse/CASSANDRA-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-14564: -- Status: Requires Testing (was: Review In Progress) > Adding regular column to COMPACT tables without clustering columns should > trigger an InvalidRequestException > - > > Key: CASSANDRA-14564 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14564 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core, Legacy/CQL >Reporter: Laxmikant Upadhyay >Assignee: Stefan Miklosovic >Priority: Normal > Labels: lhf > Fix For: 3.0.x, 3.11.x, 4.0.x > > > I have upgraded my system from cassandra 2.1.16 to 3.11.2. We had some tables > with COMPACT STORAGE enabled. We see some weird behaviour of cassandra > while adding a column into it. > Cassandra does not give any error while altering however the added column is > invisible. > Same behaviour when we create a new table with compact storage and try to > alter it. Below is the commands ran in sequence: > > {code:java} > x@cqlsh:xuser> CREATE TABLE xuser.employee(emp_id int PRIMARY KEY,emp_name > text, emp_city text, emp_sal varint, emp_phone varint ) WITH COMPACT STORAGE; > x@cqlsh:xuser> desc table xuser.employee ; > CREATE TABLE xuser.employee ( > emp_id int PRIMARY KEY, > emp_city text, > emp_name text, > emp_phone varint, > emp_sal varint > ) WITH COMPACT STORAGE > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND 
min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE';{code} > Now altering the table by adding a new column: > > {code:java} > x@cqlsh:xuser> alter table employee add profile text; > x@cqlsh:xuser> desc table xuser.employee ; > CREATE TABLE xuser.employee ( > emp_id int PRIMARY KEY, > emp_city text, > emp_name text, > emp_phone varint, > emp_sal varint > ) WITH COMPACT STORAGE > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > notice that above desc table result does not have newly added column profile. 
> However when i try to add it again it gives column already exist; > {code:java} > x@cqlsh:xuser> alter table employee add profile text; > InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid > column name profile because it conflicts with an existing column" > x@cqlsh:xuser> select emp_name,profile from employee; > emp_name | profile > --+- > (0 rows) > x@cqlsh:xuser> > {code} > Inserting also behaves strange: > {code:java} > x@cqlsh:xuser> INSERT INTO employee (emp_id , emp_city , emp_name , emp_phone > , emp_sal ,profile) VALUES ( 1, 'ggn', 'john', 123456, 5, 'SE'); > InvalidRequest: Error from server: code=2200 [Invalid query] message="Some > clustering keys are missing: column1" > x@cqlsh:xuser> INSERT INTO employee (emp_id , emp_city , emp_name , emp_phone > , emp_sal ,profile,column1) VALUES ( 1, 'ggn', 'john', 123456, 5, > 'SE',null); > x@cqlsh:xuser> select * from employee; > emp_id | emp_city | emp_name | emp_phone | emp_sal > +--+--+---+- > (0 rows) > {code} > *How to solve that ticket* > ([~blerer])-- > > Adding regular columns to non-dense compact tables should be forbidden as it > is the case for other column types. To do that {{AlterTableStatement}} should > be modified to fire an {{InvalidRequestException}} when
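The proposed fix (have the ALTER TABLE path reject regular-column additions on non-dense COMPACT STORAGE tables) could be sketched as a small guard. This is a self-contained toy; the class and exception here are invented for illustration, while the real change would live in Cassandra's {{AlterTableStatement}}.

```java
// Hypothetical sketch of the proposed validation: adding a regular column to
// a non-dense COMPACT STORAGE table is rejected up front instead of silently
// producing an invisible column. Names are illustrative, not Cassandra's API.
public class AlterTableGuard
{
    static class InvalidRequestException extends RuntimeException
    {
        InvalidRequestException(String msg) { super(msg); }
    }

    // Throw before the schema change is applied, mirroring how other invalid
    // alterations are handled.
    public static void validateAddColumn(boolean isCompactStorage, boolean isDense)
    {
        if (isCompactStorage && !isDense)
            throw new InvalidRequestException(
                "Cannot add a regular column to a non-dense COMPACT STORAGE table");
    }

    public static void main(String[] args)
    {
        validateAddColumn(false, false); // regular table: allowed, no exception
        try
        {
            validateAddColumn(true, false); // compact non-dense: rejected
        }
        catch (InvalidRequestException e)
        {
            System.out.println("rejected"); // prints "rejected"
        }
    }
}
```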
[jira] [Comment Edited] (CASSANDRA-14557) Consider adding default and required keyspace replication options
[ https://issues.apache.org/jira/browse/CASSANDRA-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404307#comment-17404307 ] Sumanth Pasupuleti edited comment on CASSANDRA-14557 at 8/25/21, 9:25 AM: -- [~azotcsit] latest branch https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk reflects the changes from your latest round of review, and you can find responses to your comments at https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5. [~stefan.miklosovic] Good call out on CASSANDRA-16203. This does fit philosophically and logically, however, [fromMapWithDefaults|https://github.com/sumanth-pasupuleti/cassandra/blob/14557-trunk/src/java/org/apache/cassandra/schema/ReplicationParams.java#L92] will have to accommodate to this new class. Will add a comment to CASSANDRA-16203 if this gets committed before CASSANDRA-16203. Also, added nodetool commands to override default rf and minimum rf. This is still cumbersome to do it for each node, but yes it serves the purpose of avoiding node restart. We may ideally want to put an endpoint in the [sidecar|https://github.com/apache/cassandra-sidecar], that could potentially change these configurations across the cluster. Latest code at https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk was (Author: sumanth.pasupuleti): [~azotcsit] latest branch https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk reflects the changes from your latest round of review, and you can find responses to your comments at https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5. [~stefan.miklosovic] Good call out on CASSANDRA-16203. This does fit philosophically and logically, however, [fromMapWithDefaults|https://github.com/sumanth-pasupuleti/cassandra/blob/14557-trunk/src/java/org/apache/cassandra/schema/ReplicationParams.java#L92] will have to accommodate to this new class. 
Will add a comment to CASSANDRA-16203 if this gets committed before CASSANDRA-16203. Also, added nodetool commands to override default rf and minimum rf. This is still cumbersome to do it for each node, but yes it serves the purpose of avoiding node restart. We may ideally want to put an endpoint in the [sidecar|https://github.com/apache/cassandra-sidecar], that could potentially change these configurations across the cluster. > Consider adding default and required keyspace replication options > - > > Key: CASSANDRA-14557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14557 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Low > Labels: 4.0-feature-freeze-review-requested > Fix For: 4.x > > Attachments: 14557-4.0.txt, 14557-trunk.patch > > > Ending up with a keyspace of RF=1 is unfortunately pretty easy in C* right > now - the system_auth table for example is created with RF=1 (to take into > account single node setups afaict from CASSANDRA-5112), and a user can > further create a keyspace with RF=1 posing availability and streaming risks > (e.g. rebuild). > I propose we add two configuration options in cassandra.yaml: > # {{default_keyspace_rf}} (default: 1) - If replication factors are not > specified, use this number. > # {{required_minimum_keyspace_rf}} (default: unset) - Prevent users from > creating a keyspace with an RF less than what is configured > These settings could further be re-used to: > * Provide defaults for new keyspaces created with SimpleStrategy or > NetworkTopologyStrategy (CASSANDRA-14303) > * Make the automatic token [allocation > algorithm|https://issues.apache.org/jira/browse/CASSANDRA-13701?focusedCommentId=16095662=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16095662] > interface more intuitive allowing easy use of the new token allocation > algorithm. 
> At the end of the day, if someone really wants to allow RF=1, they simply > don’t set the setting. For backwards compatibility the default remains 1 and > C* would create with RF=1, and would default to current behavior of allowing > any RF on keyspaces. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14557) Consider adding default and required keyspace replication options
[ https://issues.apache.org/jira/browse/CASSANDRA-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404307#comment-17404307 ] Sumanth Pasupuleti edited comment on CASSANDRA-14557 at 8/25/21, 9:23 AM: -- [~azotcsit] latest branch https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk reflects the changes from your latest round of review, and you can find responses to your comments at https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5. [~stefan.miklosovic] Good call out on CASSANDRA-16203. This does fit philosophically and logically, however, [fromMapWithDefaults|https://github.com/sumanth-pasupuleti/cassandra/blob/14557-trunk/src/java/org/apache/cassandra/schema/ReplicationParams.java#L92] will have to accommodate to this new class. Will add a comment to CASSANDRA-16203 if this gets committed before CASSANDRA-16203. Also, added nodetool commands to override default rf and minimum rf. This is still cumbersome to do it for each node, but yes it serves the purpose of avoiding node restart. We may ideally want to put an endpoint in the [sidecar|https://github.com/apache/cassandra-sidecar], that could potentially change these configurations across the cluster. was (Author: sumanth.pasupuleti): [~azotcsit] latest branch https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk reflects the changes from your latest round of review, and you can find responses to your comments at https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5. [~stefan.miklosovic] Good call out on CASSANDRA-16203. This does fit philosophically and logically, however, fromMapWithDefaults will have to accommodate to this new class. Will add a comment to CASSANDRA-16203 if this gets committed before CASSANDRA-16203. Also, added nodetool commands to override default rf and minimum rf. 
This is still cumbersome to do it for each node, but yes it serves the purpose of avoiding node restart. We may ideally want to put an endpoint in the [sidecar|https://github.com/apache/cassandra-sidecar], that could potentially change these configurations across the cluster. > Consider adding default and required keyspace replication options > - > > Key: CASSANDRA-14557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14557 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Low > Labels: 4.0-feature-freeze-review-requested > Fix For: 4.x > > Attachments: 14557-4.0.txt, 14557-trunk.patch > > > Ending up with a keyspace of RF=1 is unfortunately pretty easy in C* right > now - the system_auth table for example is created with RF=1 (to take into > account single node setups afaict from CASSANDRA-5112), and a user can > further create a keyspace with RF=1 posing availability and streaming risks > (e.g. rebuild). > I propose we add two configuration options in cassandra.yaml: > # {{default_keyspace_rf}} (default: 1) - If replication factors are not > specified, use this number. > # {{required_minimum_keyspace_rf}} (default: unset) - Prevent users from > creating a keyspace with an RF less than what is configured > These settings could further be re-used to: > * Provide defaults for new keyspaces created with SimpleStrategy or > NetworkTopologyStrategy (CASSANDRA-14303) > * Make the automatic token [allocation > algorithm|https://issues.apache.org/jira/browse/CASSANDRA-13701?focusedCommentId=16095662=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16095662] > interface more intuitive allowing easy use of the new token allocation > algorithm. > At the end of the day, if someone really wants to allow RF=1, they simply > don’t set the setting. 
For backwards compatibility the default remains 1 and > C* would create with RF=1, and would default to current behavior of allowing > any RF on keyspaces. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14557) Consider adding default and required keyspace replication options
[ https://issues.apache.org/jira/browse/CASSANDRA-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404307#comment-17404307 ] Sumanth Pasupuleti commented on CASSANDRA-14557: [~azotcsit] latest branch https://github.com/sumanth-pasupuleti/cassandra/tree/14557-trunk reflects the changes from your latest round of review, and you can find responses to your comments at https://github.com/sumanth-pasupuleti/cassandra/commit/139a01531f1b51f6b3b7dc005a7df929ec9409a5. [~stefan.miklosovic] Good call out on CASSANDRA-16203. This does fit philosophically and logically; however, fromMapWithDefaults will have to accommodate this new class. Will add a comment to CASSANDRA-16203 if this gets committed before CASSANDRA-16203. Also, added nodetool commands to override default rf and minimum rf. This is still cumbersome to do for each node, but it serves the purpose of avoiding a node restart. We may ideally want to put an endpoint in the [sidecar|https://github.com/apache/cassandra-sidecar], that could potentially change these configurations across the cluster. > Consider adding default and required keyspace replication options > - > > Key: CASSANDRA-14557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14557 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Low > Labels: 4.0-feature-freeze-review-requested > Fix For: 4.x > > Attachments: 14557-4.0.txt, 14557-trunk.patch > > > Ending up with a keyspace of RF=1 is unfortunately pretty easy in C* right > now - the system_auth table for example is created with RF=1 (to take into > account single node setups afaict from CASSANDRA-5112), and a user can > further create a keyspace with RF=1 posing availability and streaming risks > (e.g. rebuild). 
> I propose we add two configuration options in cassandra.yaml: > # {{default_keyspace_rf}} (default: 1) - If replication factors are not > specified, use this number. > # {{required_minimum_keyspace_rf}} (default: unset) - Prevent users from > creating a keyspace with an RF less than what is configured > These settings could further be re-used to: > * Provide defaults for new keyspaces created with SimpleStrategy or > NetworkTopologyStrategy (CASSANDRA-14303) > * Make the automatic token [allocation > algorithm|https://issues.apache.org/jira/browse/CASSANDRA-13701?focusedCommentId=16095662=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16095662] > interface more intuitive allowing easy use of the new token allocation > algorithm. > At the end of the day, if someone really wants to allow RF=1, they simply > don’t set the setting. For backwards compatibility the default remains 1 and > C* would create with RF=1, and would default to current behavior of allowing > any RF on keyspaces. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
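For illustration, the two options proposed above would sit in cassandra.yaml along these lines. The option names come straight from the ticket; the values shown are arbitrary examples, not the defaults it proposes (which are 1 and unset respectively).

```yaml
# Illustrative cassandra.yaml fragment using the ticket's proposed options.
default_keyspace_rf: 3            # used when CREATE KEYSPACE omits a replication factor
required_minimum_keyspace_rf: 2   # reject keyspaces created with an RF below this
```

Under such a configuration, a plain CREATE KEYSPACE without replication options would get RF=3, while an explicit RF=1 would be refused.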
[jira] [Commented] (CASSANDRA-12734) Materialized View schema file for snapshots created as tables
[ https://issues.apache.org/jira/browse/CASSANDRA-12734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404277#comment-17404277 ] Benjamin Lerer commented on CASSANDRA-12734: [~e.dimitrova] I ran some tests on 4.0 and hit another issue related to DESCRIBE. I need to dig a bit deeper. > Materialized View schema file for snapshots created as tables > - > > Key: CASSANDRA-12734 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12734 > Project: Cassandra > Issue Type: Bug > Components: Feature/Materialized Views, Legacy/Tools >Reporter: Hau Phan >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x > > > The materialized view schema file that gets created and stored with the > sstables is created as a table instead of a materialized view. > Can the materialized view be created and added to the corresponding table's > schema file? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org