[jira] [Commented] (CASSANDRA-13315) Semantically meaningful Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904598#comment-15904598 ] Jon Haddad commented on CASSANDRA-13315: +1. Will help most users pick the right thing early on. Later they can learn the nuance, rather than being punched in the face with it up front. > Semantically meaningful Consistency Levels > -- > > Key: CASSANDRA-13315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13315 > Project: Cassandra > Issue Type: Improvement > Reporter: Ryan Svihla > > New users really struggle with consistency levels and fall into a large number > of tarpits trying to decide on the right one. > > 1. There are a LOT of consistency levels, and it's up to the end user to > reason about which combinations are valid and which really match their intent. > Is there any reason why writing at ALL and reading at CL TWO is better > than reading at CL ONE? > 2. They require a good understanding of failure modes to use well. It's not > uncommon for people to use CL ONE and wonder why their data is missing. > 3. The serial consistency level "bucket" is confusing even to write about and > easy to get wrong even for experienced users. > So I propose the following steps (EDIT based on Jonathan's comment): > 1. Remove the "serial consistency" bucket of consistency levels and just have > all consistency levels in one bucket to set; conditions still need to be > required for SERIAL/LOCAL_SERIAL. > 2. Add 3 new consistency levels pointing to existing ones but that convey > intent much more cleanly: > EDIT: better names based on comments. 
> * EVENTUALLY = LOCAL_ONE reads and writes > * STRONG = LOCAL_QUORUM reads and writes > * SERIAL = LOCAL_SERIAL reads and writes (though a ton of folks don't know > what SERIAL means, so this is why I suggested TRANSACTIONAL even if it's not as > correct as I'd like) > For global versions of these I propose keeping the old ones around; they're > rarely used in the field except by accident or by particularly opinionated and > advanced users. > Drivers should put the new consistency levels in a new package, and docs > should be updated to suggest their use. Likewise, setting a default CL should > only offer those three settings, applying to reads and writes at the > same time. > CQLSH, I'm going to suggest, should default to HIGHLY_CONSISTENT. New sysadmins > get surprised by the current default frequently, and I can think of a couple of very major > escalations because people were confused about what the default behavior was. > The benefit of all this change is that we greatly shrink the surface area one has to > understand when learning Cassandra, and we have far fewer bad initial > experiences and surprises. New users will be able to wrap their > brains around those 3 ideas more readily than around "what happens when I > have RF 2, QUORUM writes and ONE reads". Advanced users still get access to > everything, while new users don't have to learn all the ins and outs of > distributed theory just to write data and be able to read it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
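As a rough illustration of the proposal, an alias layer might look like the following. This is a hypothetical sketch, not an actual driver API: the SimpleConsistency enum and its method are invented here, while the EVENTUALLY/STRONG/SERIAL names and their mappings come from the ticket.

```java
// Hypothetical sketch only; not an actual driver API. Only the level names
// and their mappings to existing consistency levels are taken from the ticket.
enum SimpleConsistency {
    EVENTUALLY("LOCAL_ONE"),   // fast, eventually consistent reads and writes
    STRONG("LOCAL_QUORUM"),    // read-your-writes within the local datacenter
    SERIAL("LOCAL_SERIAL");    // linearizable, for conditional (LWT) updates

    private final String underlying;

    SimpleConsistency(String underlying) {
        this.underlying = underlying;
    }

    // The existing consistency level this alias resolves to.
    String underlying() {
        return underlying;
    }
}
```

With aliases like these, a new user picks one of three intent-based names and gets a sane read *and* write level at once, instead of reasoning about combinations.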
[jira] [Created] (CASSANDRA-13319) Fix typo in cassandra-stress
Ian Macalinao created CASSANDRA-13319: - Summary: Fix typo in cassandra-stress Key: CASSANDRA-13319 URL: https://issues.apache.org/jira/browse/CASSANDRA-13319 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Ian Macalinao Priority: Trivial Fix typo in cassandra-stress. https://github.com/apache/cassandra/pull/97/files
[jira] [Commented] (CASSANDRA-13216) testall failure in org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
[ https://issues.apache.org/jira/browse/CASSANDRA-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904461#comment-15904461 ] Michael Kjellman commented on CASSANDRA-13216: -- Who's reviewing this? It would be great to get all our tests passing on trunk! [~jasobrown] [~aweisberg] I agree this seems like a reasonable approach, because MessagingService is a singleton and we can't reset most of the metrics core objects. Without any other ideas on this, I'm +1 on Alex's commit. > testall failure in > org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages > > > Key: CASSANDRA-13216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13216 > Project: Cassandra > Issue Type: Bug > Components: Testing > Reporter: Sean McCarthy > Assignee: Alex Petrov > Labels: test-failure, testall > Attachments: TEST-org.apache.cassandra.net.MessagingServiceTest.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.11_testall/81/testReport/org.apache.cassandra.net/MessagingServiceTest/testDroppedMessages > {code} > Error Message > expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: > 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped > latency: 2730] ms> > {code} > {code} > Stacktrace > junit.framework.AssertionFailedError: expected:<... dropped latency: 27[30 ms > and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: > 27[28 ms and Mean cross-node dropped latency: 2730] ms> > at > org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages(MessagingServiceTest.java:83) > {code}
[jira] [Updated] (CASSANDRA-8780) cassandra-stress should support multiple table operations
[ https://issues.apache.org/jira/browse/CASSANDRA-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Slater updated CASSANDRA-8780: -- Status: Awaiting Feedback (was: In Progress) > cassandra-stress should support multiple table operations > - > > Key: CASSANDRA-8780 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8780 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Reporter: Benedict > Assignee: Ben Slater > Labels: stress > Fix For: 3.11.x > > Attachments: 8780-trunk.patch >
[jira] [Commented] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
[ https://issues.apache.org/jira/browse/CASSANDRA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904074#comment-15904074 ] Michael Kjellman commented on CASSANDRA-13317: -- p.s. [~aweisberg] there is a performance penalty here, which is why it's disabled by default in logback and log4j2. I personally find it helpful to have the filename and line number while debugging, so I think it makes sense to keep %F:%L in the pattern for logback-test.xml. For the actual default conf we ship, I wonder if it might make sense to remove "%F:%L" from the pattern instead of fixing the issue. The thing is, though, that we ship the "ASYNCDEBUGLOG" appender enabled by default, and it already has includeCallerData set to true. So if we decide it's not worth the performance overhead to log the filename and line number for the actual default (non-test) logback config we ship, we should also make ASYNCDEBUGLOG disabled by default. > Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due > to includeCallerData being false by default no appender > > > Key: CASSANDRA-13317 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13317 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Michael Kjellman > Assignee: Michael Kjellman > Attachments: 13317_v1.diff > > > We specify the logging pattern as "%-5level [%thread] %date{ISO8601} %F:%L - > %msg%n". > %F:%L is intended to print the Filename:Line Number. For performance reasons > logback (like log4j2) disables tracking line numbers, as it requires the > entire stack to be materialized every time. > This causes logs to look like: > WARN [main] 2017-03-09 13:27:11,272 ?:? - Protocol Version 5/v5-beta not > supported by java driver > INFO [main] 2017-03-09 13:27:11,813 ?:? - No commitlog files found; skipping > replay > INFO [main] 2017-03-09 13:27:12,477 ?:? - Initialized prepared statement > caches with 14 MB > INFO [main] 2017-03-09 13:27:12,727 ?:? - Initializing system.IndexInfo > When instead you'd expect something like: > INFO [main] 2017-03-09 13:23:44,204 ColumnFamilyStore.java:419 - > Initializing system.available_ranges > INFO [main] 2017-03-09 13:23:44,210 ColumnFamilyStore.java:419 - > Initializing system.transferred_ranges > INFO [main] 2017-03-09 13:23:44,215 ColumnFamilyStore.java:419 - > Initializing system.views_builds_in_progress > The fix is to add <includeCallerData>true</includeCallerData> to the > appender config to enable line number and caller data tracking.
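For reference, the relevant logback setting is the includeCallerData property on the async appender. A sketch of what the fix might look like (the ASYNCDEBUGLOG name comes from the discussion above, but the appender-ref target is an assumption, not the exact shipped logback.xml):

```xml
<!-- Sketch only: the appender-ref target name is illustrative. -->
<appender name="ASYNCDEBUGLOG" class="ch.qos.logback.classic.AsyncAppender">
  <!-- Without this, AsyncAppender does not capture caller data, so %F:%L
       in the pattern renders as "?:?". Enabling it has a cost: the call
       stack must be materialized for every log event. -->
  <includeCallerData>true</includeCallerData>
  <appender-ref ref="DEBUGLOG" />
</appender>
```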
[jira] [Updated] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
[ https://issues.apache.org/jira/browse/CASSANDRA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13317: --- Reviewer: Ariel Weisberg
[jira] [Updated] (CASSANDRA-13315) Semantically meaningful Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315: Summary: Semantically meaningful Consistency Levels (was: Consistency is confusing for new users)
[jira] [Updated] (CASSANDRA-13318) Include number of messages attempted when logging message drops
[ https://issues.apache.org/jira/browse/CASSANDRA-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13318: Summary: Include number of messages attempted when logging message drops (was: Include # Of messages attempted when logging message drops) > Include number of messages attempted when logging message drops > --- > > Key: CASSANDRA-13318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13318 > Project: Cassandra > Issue Type: Improvement > Reporter: Ryan Svihla > > I use the log messages for mutation drops a lot for diagnostics; it'd be > helpful if we included the number of messages attempted, so we can see at a > glance from the logs what the load was during that time. > 1131 MUTATION messages dropped in last 5000ms > to > > 1131 MUTATION messages dropped out of 5000 MUTATION messages in last 5000ms
[jira] [Created] (CASSANDRA-13318) Include # Of messages attempted when logging message drops
Ryan Svihla created CASSANDRA-13318: --- Summary: Include # Of messages attempted when logging message drops Key: CASSANDRA-13318 URL: https://issues.apache.org/jira/browse/CASSANDRA-13318 Project: Cassandra Issue Type: Improvement Reporter: Ryan Svihla I use the log messages for mutation drops a lot for diagnostics; it'd be helpful if we included the number of messages attempted, so we can see at a glance from the logs what the load was during that time. 1131 MUTATION messages dropped in last 5000ms to 1131 MUTATION messages dropped out of 5000 MUTATION messages in last 5000ms
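The proposed change amounts to adding the attempted count to the existing drop-log format string. A minimal sketch (the class and method names here are illustrative, not Cassandra's actual logging code):

```java
// Illustrative only: builds the proposed drop-log message, which adds the
// number of messages attempted alongside the number dropped.
final class DroppedMessageFormat {
    static String format(String verb, int dropped, int attempted, long intervalMs) {
        return String.format("%d %s messages dropped out of %d %s messages in last %dms",
                dropped, verb, attempted, verb, intervalMs);
    }

    public static void main(String[] args) {
        // Reproduces the example from the ticket:
        // 1131 MUTATION messages dropped out of 5000 MUTATION messages in last 5000ms
        System.out.println(format("MUTATION", 1131, 5000, 5000));
    }
}
```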
[jira] [Updated] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
[ https://issues.apache.org/jira/browse/CASSANDRA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Kjellman updated CASSANDRA-13317: - Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
[ https://issues.apache.org/jira/browse/CASSANDRA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Kjellman updated CASSANDRA-13317: - Attachment: 13317_v1.diff
[jira] [Commented] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
[ https://issues.apache.org/jira/browse/CASSANDRA-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904000#comment-15904000 ] Michael Kjellman commented on CASSANDRA-13317: -- [~aweisberg] wanna +1 this?
[jira] [Created] (CASSANDRA-13317) Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender
Michael Kjellman created CASSANDRA-13317: Summary: Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern due to includeCallerData being false by default no appender Key: CASSANDRA-13317 URL: https://issues.apache.org/jira/browse/CASSANDRA-13317 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael Kjellman Assignee: Michael Kjellman
[jira] [Created] (CASSANDRA-13316) Build error because of dependent jar (byteman-install-3.0.3.jar) corrupted
Sam Ding created CASSANDRA-13316: Summary: Build error because of dependent jar (byteman-install-3.0.3.jar) corrupted Key: CASSANDRA-13316 URL: https://issues.apache.org/jira/browse/CASSANDRA-13316 Project: Cassandra Issue Type: Bug Components: Testing Environment: Platform: Amd64 OS: CentOS Linux 7 Reporter: Sam Ding Fix For: 3.10 When building cassandra 3.10 on amd64, CentOS Linux 7, there is a build error caused by a corrupted jar file (byteman-install-3.0.3.jar). Here are the steps to reproduce, after installing the necessary dependent packages and apache-ant:
1) Clone and build cassandra 3.10:
git clone https://github.com/apache/cassandra.git
cd cassandra
git checkout cassandra-3.10
ant
This produces errors like: " build-project: [echo] apache-cassandra: /cassandra/build.xml [javac] Compiling 45 source files to /cassandra/build/classes/thrift [javac] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 [javac] error: error reading /cassandra/build/lib/jars/byteman-install-3.0.3.jar; error in opening zip file [javac] Compiling 1474 source files to /cassandra/build/classes/main [javac] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 [javac] error: error reading /cassandra/build/lib/jars/byteman-install-3.0.3.jar; error in opening zip file [javac] Creating empty /cassandra/build/classes/main/org/apache/cassandra/hints/package-info.class "
2) Checking the jar gives: # jar -i /cassandra/build/lib/jars/byteman-install-3.0.3.jar Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.<init>(ZipFile.java:219) at java.util.zip.ZipFile.<init>(ZipFile.java:149) at java.util.jar.JarFile.<init>(JarFile.java:166) at java.util.jar.JarFile.<init>(JarFile.java:103) at sun.tools.jar.Main.getJarPath(Main.java:1163) at sun.tools.jar.Main.genIndex(Main.java:1195) at sun.tools.jar.Main.run(Main.java:317) at sun.tools.jar.Main.main(Main.java:1288)
3) If you download the jar and replace it, the build succeeds:
wget http://downloads.jboss.org/byteman/3.0.3/byteman-download-3.0.3-bin.zip
unzip byteman-download-3.0.3-bin.zip -d /tmp
rm -f build/lib/jars/byteman-install-3.0.3.jar
cp /tmp/byteman-download-3.0.3/lib/byteman-install.jar build/lib/jars/byteman-install-3.0.3.jar
ant
BUILD SUCCESSFUL Total time: 36 seconds
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903890#comment-15903890 ] Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 9:43 PM: - I don't think any should be removed, but nearly every time I dig into an EACH_QUORUM use case with someone..it ends up being not what they want. EACH_QUORUM doesn't roll back the writes, so even if it fails because of a lack of replicas you can still be returning the 'failed write' successfully on the nodes it did succeed on, so in effect during DC connection outages unless you just turn writes off you get divergence between the 2 DCs and reads in one DC show up and not in another. Also several customers with EACH_QUORUM have had downgrading retry policy on..defeating it entirely. was (Author: rssvihla): I don't think any should be removed, but nearly every time I dig into an EACH_QUORUM use case with someone..it ends up being not what they want. EACH_QUORUM doesn't roll back the writes, so even if it fails because of a lack of replicas you can still be returning the 'failed write' successfully on the nodes it did succeed on, so in effect during DC connection outages unless you just turn writes off you get divergence between the TWO DCs and reads in one DC show up and not in another. Also several customers with EACH_QUORUM have had downgrading retry policy on..defeating it entirely.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903890#comment-15903890 ] Ryan Svihla commented on CASSANDRA-13315: - I don't think any should be removed, but nearly every time I dig into an EACH_QUORUM use case with someone..it ends up being not what they want. EACH_QUORUM doesn't roll back the writes, so even if it fails because of a lack of replicas you can still be returning the 'failed write' successfully on the nodes it did succeed on, so in effect during DC connection outages unless you just turn writes off you get divergence between the TWO DCs and reads in one DC show up and not in another. Also several customers with EACH_QUORUM have had downgrading retry policy on..defeating it entirely.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903888#comment-15903888 ] Russell Spitzer commented on CASSANDRA-13315: - +1 on moving towards semantically meaningful terms instead of technically correct ones.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903882#comment-15903882 ] DOAN DuyHai commented on CASSANDRA-13315: - As long as you don't remove EACH_QUORUM, I'm fine. There are very rare cases (where two DCs are geographically very close to each other) where customers want EACH_QUORUM to be sure that mutations have been applied in both DCs.
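The failure mode Ryan describes above can be sketched with a toy model: an EACH_QUORUM write needs a quorum of acknowledgments in every data center, but the replicas that did acknowledge keep the data even when the overall write fails. These are hypothetical helpers for illustration, not Cassandra's actual write path:

```python
def quorum(rf: int) -> int:
    """A quorum is a majority of the replication factor."""
    return rf // 2 + 1

def each_quorum_succeeds(acks_per_dc: dict, rf_per_dc: dict) -> bool:
    """EACH_QUORUM requires a quorum of acks in every data center."""
    return all(acks_per_dc[dc] >= quorum(rf) for dc, rf in rf_per_dc.items())

# RF 3 in both DCs; dc2 is partitioned away and acknowledges nothing.
rf = {"dc1": 3, "dc2": 3}
acks = {"dc1": 3, "dc2": 0}
# each_quorum_succeeds(acks, rf) is False: the write fails overall,
# yet the three replicas in dc1 already applied it. There is no
# rollback, so the two DCs have now diverged.
```

This is the divergence scenario from the comment: the "failed" write is durably visible in dc1 and absent in dc2 until the partition heals and repair or hints catch up.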
[jira] [Updated] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher
[ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-13291: - Reviewer: Robert Stupp Yup, can do. > Replace usages of MessageDigest with Guava's Hasher > --- > > Key: CASSANDRA-13291 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13291 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Michael Kjellman > Assignee: Michael Kjellman > Attachments: CASSANDRA-13291-trunk.diff > > > During my profiling of C* I frequently see lots of aggregate time across threads being spent inside the MD5 MessageDigest implementation. Given that there are plenty of modern hashing functions better than MD5 available -- both in terms of collision resistance and computational speed -- I wanted to switch out our usage of MD5 for alternatives (like adler128 or murmur3_128) and test for performance improvements. > Unfortunately, I found that because we use MessageDigest everywhere, switching the hashing function to something like adler128 or murmur3_128 -- which don't ship with the JDK -- wasn't straightforward. > The goal of this ticket is to propose switching usages of MessageDigest directly in favor of Hasher from Guava. This means that going forward we can change a single line of code to switch the hashing algorithm being used (assuming there is an implementation in Guava). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher
[ https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903856#comment-15903856 ] Michael Kjellman commented on CASSANDRA-13291: -- [~snazy] any chance you want to review this one? [~jasobrown] was gonna try to get to it -- it should be pretty straightforward unless I really suck hehe... the real review and thought will come in CASSANDRA-13292
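The design goal of the 13291 patch — a single place to change in order to swap the digest algorithm — can be illustrated language-agnostically. The actual patch replaces Java's MessageDigest with Guava's Hasher; this Python sketch using the standard-library hashlib only demonstrates the shape of the abstraction, not Cassandra's code:

```python
import hashlib

# Changing HASH_ALGORITHM is the "single line" swap the ticket is after.
# "md5" mirrors the current digests; any algorithm hashlib supports
# (e.g. "sha256") could be dropped in without touching the call sites.
HASH_ALGORITHM = "md5"

def digest(data: bytes) -> bytes:
    """Hash data with the configured algorithm."""
    h = hashlib.new(HASH_ALGORITHM)
    h.update(data)
    return h.digest()
```

The analogous Guava move is to hand call sites a `Hasher` from a single factory, so the choice of `Hashing.md5()` versus `Hashing.murmur3_128()` lives in one line.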
[jira] [Updated] (CASSANDRA-13300) Upgrade the jna version to 4.3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Kjellman updated CASSANDRA-13300: - Reviewer: Michael Kjellman (was: Jason Brown) > Upgrade the jna version to 4.3.0 > > > Key: CASSANDRA-13300 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13300 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Amitkumar Ghatwal >Assignee: Jason Brown > > Could you please upgrade the jna version present in the github cassandra > location : https://github.com/apache/cassandra/blob/trunk/lib/jna-4.0.0.jar > to below latest version - 4.3.0 - > http://repo1.maven.org/maven2/net/java/dev/jna/jna/4.3.0/jna-4.3.0-javadoc.jar -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903649#comment-15903649 ] Caleb Rackliffe commented on CASSANDRA-13315: - +1 on the idea of synonyms as recipes, and I'll add some naming ideas, because why not? EVENTUAL_SAME_DC, STRONG_SAME_DC, SERIAL_SAME_DC
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903481#comment-15903481 ] Ryan Svihla commented on CASSANDRA-13315: - For SERIAL/TRANSACTIONAL I think there is a better middle-ground name in there... LIGHTWEIGHT_TRANSACTION maybe: more Cassandra-specific and a hint for new users, without being as technically incorrect as TRANSACTIONAL.
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315: Description: New users really struggle with consistency levels and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels, and it's up to the end user to reason about which combinations are valid and which really match their intent. Is there any reason why writing at ALL and reading at CL TWO is better than reading at CL ONE? 2. They require a good understanding of failure modes to use well. It's not uncommon for people to use CL ONE and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing even to write about and easy to get wrong even for experienced users. So I propose the following steps (EDIT based on Jonathan's comment): 1. Remove the "serial consistency" bucket and have all consistency levels in one place to set; conditions would still be required for SERIAL/LOCAL_SERIAL. 2. Add 3 new consistency levels pointing to existing ones that convey intent much more clearly (EDIT: better names based on comments): * EVENTUALLY = LOCAL_ONE reads and writes * STRONG = LOCAL_QUORUM reads and writes * SERIAL = LOCAL_SERIAL reads and writes (though a ton of folks don't know what SERIAL means, which is why I suggested TRANSACTIONAL even if it's not as correct as I'd like) For the global versions of these I propose keeping the old levels around; they're rarely used in the field except by accident or by particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package, and docs should be updated to suggest their use. Likewise, setting a default CL should offer only those three settings and apply it to reads and writes at the same time. CQLSH, I'd suggest, should default to HIGHLY_CONSISTENT. New sysadmins get surprised by the current default frequently, and I can think of a couple of very major escalations because people were confused about the default behavior. The benefit of all this change is that we greatly shrink the surface area one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas than around "what happens when I have RF 2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back. was: New users really struggle with consistency levels and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels, and it's up to the end user to reason about which combinations are valid and which really match their intent. Is there any reason why writing at ALL and reading at CL TWO is better than reading at CL ONE? 2. They require a good understanding of failure modes to use well. It's not uncommon for people to use CL ONE and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing even to write about and easy to get wrong even for experienced users. So I propose the following steps (EDIT based on Jonathan's comment): 1. Remove the "serial consistency" bucket and have all consistency levels in one place to set; conditions would still be required for SERIAL/LOCAL_SERIAL. 2. Add 3 new consistency levels pointing to existing ones that convey intent much more clearly: * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes For the global versions of these I propose keeping the old levels around; they're rarely used in the field except by accident or by particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package, and docs should be updated to suggest their use. Likewise, setting a default CL should offer only those three settings and apply it to reads and writes at the same time. CQLSH, I'd suggest, should default to HIGHLY_CONSISTENT. New sysadmins get surprised by the current default frequently, and I can think of a couple of very major escalations because people were confused about the default behavior. The benefit of all this change is that we greatly shrink the surface area one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas than around "what happens when I have RF 2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back.
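The "RF 2, QUORUM writes and ONE reads" question from the description comes down to replica overlap: a read is guaranteed to see the latest successful write when write acknowledgments plus read responses exceed the replication factor (W + R > RF). A toy check, not driver code:

```python
def quorum(rf: int) -> int:
    """A quorum is a majority of the replication factor."""
    return rf // 2 + 1

def reads_see_writes(rf: int, write_acks: int, read_responses: int) -> bool:
    """A read overlaps every successful write when W + R > RF."""
    return write_acks + read_responses > rf

# RF=2: QUORUM writes need 2 acks, ONE reads get 1 response.
# 2 + 1 > 2, so reads do see writes -- but any single replica
# outage now fails every write, which is the nuance that trips
# up new users and motivates the intent-based aliases.
```

At RF=3 the same arithmetic explains the confusion with CL ONE: writing and reading at ONE gives 1 + 1 = 2, which is not greater than 3, so a read can legitimately miss recent data.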
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903468#comment-15903468 ] Ryan Svihla commented on CASSANDRA-13315: - Jeff, how about just tagging it with _DC? Slight pushback: everyone is happily calling a multi-DC RDBMS ACID-compliant even when that's only true inside a DC.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903459#comment-15903459 ] Benjamin Roth commented on CASSANDRA-13315: --- I had the same problems in the beginning, so generally +1. But IMHO this should go along with an explanatory section in the official docs.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903457#comment-15903457 ] Jeff Jirsa commented on CASSANDRA-13315: Bikeshed: Calling {{LOCAL_}} anything highly or strongly consistent is probably asking for trouble.
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903419#comment-15903419 ] Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 5:14 PM: - On the power users it'll just be up to the driver implementers how they handle that (different packages and methods for the power user for example). was (Author: rssvihla): On the power users it'll just be up to the driver implementers right how they handle that (different packages and methods for the power user for example).
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903419#comment-15903419 ] Ryan Svihla commented on CASSANDRA-13315: - On the power users it'll just be up to the driver implementers right how they handle that (different packages and methods for the power user for example).
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315: Description: New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels and it's up to the end user to reason about what combinations are valid and what is really what they intend it to be. Is there any reason why write at ALL and read at CL TWO is better than read at CL ONE? 2. They require a good understanding of failure modes to do well. It's not uncommon for people to use CL one and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing to even write about and easy to get wrong even for experienced users. So I propose the following steps (EDIT based on Jonathan's comment): 1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket to set, conditions still need to be required for SERIAL/LOCAL_SERIAL 2. add 3 new consistency levels pointing to existing ones but that infer intent much more cleanly: * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes for global levels of this I propose keeping the old ones around, they're rarely used in the field except by accident or particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package and docs should be updated to suggest their use. Likewise setting default CL should only provide those three settings and applying it for reads and writes at the same time. CQLSH I'm gonna suggest should default to HIGHLY_CONSISTENT. New sysadmins get surprised by this frequently and I can think of a couple very major escalations because people were confused what the default behavior was. 
The benefit to all this change is we shrink the surface area that one has to understand when learning Cassandra greatly, and we have far less bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas more readily than they can "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users get access to all the way still, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back. was: New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels and it's up to the end user to reason about what combinations are valid and what is really what they intend it to be. Is there any reason why write at ALL and read at CL TWO is better than read at CL ONE? 2. They require a good understanding of failure modes to do well. It's not uncommon for people to use CL one and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing to even write about and easy to get wrong even for experienced users. So I propose the following steps (EDIT based on Jonathan's comment): 1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket to set, conditional updates still need to be required for SERIAL/LOCAL_SERIAL 2. add 3 new consistency levels pointing to existing ones but that infer intent much more cleanly: * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes for global levels of this I propose keeping the old ones around, they're rarely used in the field except by accident or particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package and docs should be updated to suggest their use.
Likewise setting default CL should only provide those three settings and applying it for reads and writes at the same time. CQLSH I'm gonna suggest should default to HIGHLY_CONSISTENT. New sysadmins get surprised by this frequently and I can think of a couple very major escalations because people were confused what the default behavior was. The benefit to all this change is we shrink the surface area that one has to understand when learning Cassandra greatly, and we have far less bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas more readily than they can "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users get access to all the way still, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back.
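The proposed alias mapping could be sketched as a small, separate driver module, per the "new package" suggestion above. This is a minimal illustration only: the class and function names are hypothetical, not any real driver's API; only the underlying level names (LOCAL_ONE, LOCAL_QUORUM, LOCAL_SERIAL) come from the ticket.

```python
# Hypothetical sketch of the three proposed intent-based aliases as their own
# module, each one applied to both reads and writes. Not real driver code.
from enum import Enum

class SimpleConsistency(Enum):
    EVENTUALLY_CONSISTENT = "LOCAL_ONE"          # fastest; reads may be stale
    HIGHLY_CONSISTENT = "LOCAL_QUORUM"           # read-your-writes within a DC
    TRANSACTIONALLY_CONSISTENT = "LOCAL_SERIAL"  # linearizable (LWT/Paxos)

def underlying_level(alias):
    """Resolve an alias to the classic consistency-level name it points to."""
    return alias.value
```

A driver could expose only these three in its beginner-facing package while keeping the classic levels available elsewhere for power users.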
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315:
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315:
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-13315:
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903404#comment-15903404 ] Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 5:04 PM: - those are better names +1 on that. Dual CL yeah I misstated that and we're on the same page with intent, as I've stated it's hard to even talk about it in text without getting bewildered. Just as long as we have only a single bucket to set and we require a condition for SERIAL mutations I'm fine. was (Author: rssvihla): those are better names +1 on that. Dual CL yeah I misstated that and we're on the same page with intent, as I've stated it's hard to even talk about it in text without getting bewildered. Just as long as we have only a single bucket to set and we require a condition for SERIAL I'm fine.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903404#comment-15903404 ] Ryan Svihla commented on CASSANDRA-13315: - those are better names +1 on that. Dual CL yeah I misstated that and we're on the same page with intent, as I've stated it's hard to even talk about it in text without getting bewildered. Just as long as we have only a single bucket to set and we require a condition for SERIAL I'm fine.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903379#comment-15903379 ] Jonathan Ellis commented on CASSANDRA-13315: I like this idea a lot. We have a lot more experience now with how people use and misuse CL in the wild so I am comfortable getting a lot more opinionated in how we push people towards certain options and away from others. 1/2: The dual CL for Serial isn't for what to do w/ no condition, it's for the "commit" to EC land from the Paxos sandbox. So mandating a condition (don't we already?) doesn't make that go away. But, I think we could make that default to Q and call it good. (I'm having trouble thinking of a situation where you would need LWT, which requires a quorum to participate already, but also need lower CL on commit.) 3: I would bikeshed this to # EVENTUAL # STRONG # SERIAL 4. It sounds like we can do all of this at the drivers level except for adding some aliases to CQLSH. I don't see any benefit to adding synonyms at the protocol level. 5. How do we give power users the ability to use classic CL if they need it? > Consistency is confusing for new users > -- > > Key: CASSANDRA-13315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13315 > Project: Cassandra > Issue Type: Improvement >Reporter: Ryan Svihla > > New users really struggle with consistency level and fall into a large number > of tarpits trying to decide on the right one. > 1. There are a LOT of consistency levels and it's up to the end user to > reason about what combinations are valid and what is really what they intend > it to be. Is there any reason why write at ALL and read at CL TWO is better > than read at CL ONE? > 2. They require a good understanding of failure modes to do well. It's not > uncommon for people to use CL one and wonder why their data is missing. > 3. The serial consistency level "bucket" is confusing to even write about and > easy to get wrong even for experienced users. 
> So I propose the following steps: > 1. Remove the "serial consistency" level of consistency levels and just have > all consistency levels in one bucket at the protocol level. > 2. To enable #1, just reject writes or updates done without a condition when > SERIAL/LOCAL_SERIAL is specified. > 3. Add 3 new consistency levels pointing to existing ones that convey > intent much more cleanly: >* EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes >* HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes >* TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes > For global versions of these I propose keeping the old ones around; they're > rarely used in the field except by accident or by particularly opinionated and > advanced users. > Drivers should put the new consistency levels in a new package, and docs > should be updated to suggest their use. Likewise, setting a default CL should > only offer those three settings and apply it to reads and writes at the > same time. > I suggest CQLSH should default to HIGHLY_CONSISTENT. New sysadmins > get surprised by the current default frequently, and I can think of a couple of very major > escalations because people were confused about what the default behavior was. > The benefit of all this change is that we greatly shrink the surface area one has to > understand when learning Cassandra, and we have far fewer bad initial > experiences and surprises. New users will be more likely to wrap their > brains around those 3 ideas than around "what happens when I > have RF2, QUORUM writes and ONE reads". Advanced users still get access to > everything, while new users don't have to learn all the ins and outs of > distributed-systems theory just to write data and be able to read it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CASSANDRA-13315) Consistency is confusing for new users
Ryan Svihla created CASSANDRA-13315: --- Summary: Consistency is confusing for new users Key: CASSANDRA-13315 URL: https://issues.apache.org/jira/browse/CASSANDRA-13315 Project: Cassandra Issue Type: Improvement Reporter: Ryan Svihla New users really struggle with consistency levels and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels, and it's up to the end user to reason about which combinations are valid and whether they really match the intent. Is there any reason why writing at ALL and reading at CL TWO is better than reading at CL ONE? 2. They require a good understanding of failure modes to use well. It's not uncommon for people to use CL ONE and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing even to write about and easy to get wrong even for experienced users. So I propose the following steps: 1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket at the protocol level. 2. To enable #1, just reject writes or updates done without a condition when SERIAL/LOCAL_SERIAL is specified. 3. Add 3 new consistency levels pointing to existing ones that convey intent much more cleanly: * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes For global versions of these I propose keeping the old ones around; they're rarely used in the field except by accident or by particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package, and docs should be updated to suggest their use. Likewise, setting a default CL should only offer those three settings and apply it to reads and writes at the same time. I suggest CQLSH should default to HIGHLY_CONSISTENT.
New sysadmins get surprised by the current default frequently, and I can think of a couple of very major escalations because people were confused about what the default behavior was. The benefit of all this change is that we greatly shrink the surface area one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will be more likely to wrap their brains around those 3 ideas than around "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed-systems theory just to write data and be able to read it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
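The proposed alias scheme boils down to a small lookup table. A minimal sketch of that mapping, assuming the names suggested in this ticket (the enum and its structure are purely illustrative; no driver exposes such an API today):

```java
// Hypothetical sketch of the proposed "intent" consistency levels and the
// existing levels they would map to, per this ticket. Illustrative only.
public class ConsistencyAliasSketch {
    enum IntentLevel {
        EVENTUAL("LOCAL_ONE"),     // fast, eventually consistent reads/writes
        STRONG("LOCAL_QUORUM"),    // read-your-writes within the local DC
        SERIAL("LOCAL_SERIAL");    // linearizable, for LWT conditions

        final String mapsTo;
        IntentLevel(String mapsTo) { this.mapsTo = mapsTo; }
    }

    public static void main(String[] args) {
        for (IntentLevel level : IntentLevel.values())
            System.out.println(level + " -> " + level.mapsTo);
    }
}
```

Keeping the alias resolution at the driver level, as discussed above, means no protocol change is needed: the driver would simply translate the intent name before setting the statement's consistency level.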
[jira] [Commented] (CASSANDRA-13130) Strange result of several list updates in a single request
[ https://issues.apache.org/jira/browse/CASSANDRA-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903247#comment-15903247 ] Sylvain Lebresne commented on CASSANDRA-13130: -- Sorry, it appears I missed that one and so it may require rebase, but +1 on the patches otherwise. > Strange result of several list updates in a single request > -- > > Key: CASSANDRA-13130 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13130 > Project: Cassandra > Issue Type: Bug >Reporter: Mikhail Krupitskiy >Assignee: Benjamin Lerer >Priority: Trivial > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x > > > Let's assume that we have a row with the 'listColumn' column and value > \{1,2,3,4\}. > It seems logical to expect that the following two pieces of code will > end up with the same result, but that isn't so. > Code1: > {code} > UPDATE t SET listColumn[2] = 7, listColumn[2] = 8 WHERE id = 1; > {code} > Expected result: listColumn=\{1,2,8,4\} > Actual result: listColumn=\{1,2,7,8,4\} > Code2: > {code} > UPDATE t SET listColumn[2] = 7 WHERE id = 1; > UPDATE t SET listColumn[2] = 8 WHERE id = 1; > {code} > Expected result: listColumn=\{1,2,8,4\} > Actual result: listColumn=\{1,2,8,4\} > So the question is: why do Code1 and Code2 give different results? > It looks like Code1 should give the same result as Code2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13130) Strange result of several list updates in a single request
[ https://issues.apache.org/jira/browse/CASSANDRA-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-13130: - Status: Ready to Commit (was: Patch Available) > Strange result of several list updates in a single request > -- > > Key: CASSANDRA-13130 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13130 > Project: Cassandra > Issue Type: Bug >Reporter: Mikhail Krupitskiy >Assignee: Benjamin Lerer >Priority: Trivial > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x > > > Let's assume that we have a row with the 'listColumn' column and value > \{1,2,3,4\}. > It seems logical to expect that the following two pieces of code will > end up with the same result, but that isn't so. > Code1: > {code} > UPDATE t SET listColumn[2] = 7, listColumn[2] = 8 WHERE id = 1; > {code} > Expected result: listColumn=\{1,2,8,4\} > Actual result: listColumn=\{1,2,7,8,4\} > Code2: > {code} > UPDATE t SET listColumn[2] = 7 WHERE id = 1; > UPDATE t SET listColumn[2] = 8 WHERE id = 1; > {code} > Expected result: listColumn=\{1,2,8,4\} > Actual result: listColumn=\{1,2,8,4\} > So the question is: why do Code1 and Code2 give different results? > It looks like Code1 should give the same result as Code2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12653) In-flight shadow round requests
[ https://issues.apache.org/jira/browse/CASSANDRA-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903212#comment-15903212 ] Joel Knighton commented on CASSANDRA-12653: --- Sure - while I'd argue that a need for a change in the future could be introduced in the future patch, I agree that this distinction is very minor and won't cause any problems. Thanks for the patch and your patience! > In-flight shadow round requests > --- > > Key: CASSANDRA-12653 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12653 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x > > > Bootstrapping or replacing a node in the cluster requires gathering and checking > some host IDs or tokens by doing a gossip "shadow round" once before joining > the cluster. This is done by sending a gossip SYN to all seeds until we > receive a response with the cluster state, from where we can move on in the > bootstrap process. Receiving a response marks the shadow round as done and > calls {{Gossiper.resetEndpointStateMap}} to clean up the received state > again. > The issue here is that at this point there might be other in-flight requests, > and it's very likely that shadow round responses from other seeds will be > received afterwards, while the current state of the bootstrap process doesn't > expect this to happen (e.g. gossiper may or may not be enabled). > One side effect is that MigrationTasks are spawned for each shadow round > reply except the first. Tasks might or might not execute based on whether > {{Gossiper.resetEndpointStateMap}} had been called by execution time, which > affects the outcome of {{FailureDetector.instance.isAlive(endpoint))}} at the > start of the task.
You'll see error log messages such as the following when this > happened: > {noformat} > INFO [SharedPool-Worker-1] 2016-09-08 08:36:39,255 Gossiper.java:993 - > InetAddress /xx.xx.xx.xx is now UP > ERROR [MigrationStage:1]2016-09-08 08:36:39,255 FailureDetector.java:223 > - unknown endpoint /xx.xx.xx.xx > {noformat} > Although it isn't pretty, I currently don't see any serious harm from this, > but it would be good to get a second opinion (feel free to close as "wont > fix"). > /cc [~Stefania] [~thobbs] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12773) cassandra-stress error for one way SSL
[ https://issues.apache.org/jira/browse/CASSANDRA-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903117#comment-15903117 ] Stefan Podkowinski commented on CASSANDRA-12773: Patch has been updated and CI results are in. ||2.2||3.0||3.11||trunk|| |[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-12733-2.2]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-12733-3.0]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-12733-3.11]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-12733-trunk]| |[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-3.11-dtest/]|| |[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-2.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-12733-3.11-testall/]|| > cassandra-stress error for one way SSL > --- > > Key: CASSANDRA-12773 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12773 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Jane Deng >Assignee: Stefan Podkowinski > Fix For: 2.2.x > > Attachments: 12773-2.2.patch > > > CASSANDRA-9325 added keystore/truststore configuration into cassandra-stress. > However, for one way ssl (require_client_auth=false), there is no need to > pass keystore info into ssloptions. 
Cassandra-stress errored out: > {noformat} > java.lang.RuntimeException: java.io.IOException: Error creating the > initializing the SSL Context > at > org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:200) > > at > org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesNative(SettingsSchema.java:79) > > at > org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:69) > > at > org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:207) > > at org.apache.cassandra.stress.StressAction.run(StressAction.java:55) > at org.apache.cassandra.stress.Stress.main(Stress.java:117) > Caused by: java.io.IOException: Error creating the initializing the SSL > Context > at > org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:151) > > at > org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:128) > > at > org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:191) > > ... 5 more > Caused by: java.io.IOException: Keystore was tampered with, or password was > incorrect > at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772) > at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55) > at java.security.KeyStore.load(KeyStore.java:1445) > at > org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:129) > > ... 7 more > Caused by: java.security.UnrecoverableKeyException: Password verification > failed > at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770) > ... 10 more > {noformat} > It's a bug from CASSANDRA-9325. When the keystore is absent, the keystore path is > assigned the path of the truststore, but the password isn't taken care of. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
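The fallback described above can be sketched in a few lines. This is a hypothetical illustration of the fix idea only; the field and method names are invented and do not match Cassandra's actual EncryptionOptions or SSLFactory classes:

```java
// Hypothetical sketch: for one-way SSL (no client auth, so no keystore
// configured), fall back to the truststore for BOTH the path and the
// password, instead of only the path. Names are illustrative.
public class SslFallbackSketch {
    String keystore;           // null when only one-way SSL is configured
    String keystorePassword;
    String truststore;
    String truststorePassword;

    // Path fallback (this part already existed, per the report)
    String effectiveKeystorePath() {
        return keystore != null ? keystore : truststore;
    }

    // Password fallback (the missing piece: without it, loading the JKS
    // fails with "Keystore was tampered with, or password was incorrect")
    String effectiveKeystorePassword() {
        return keystore != null ? keystorePassword : truststorePassword;
    }

    public static void main(String[] args) {
        SslFallbackSketch opts = new SslFallbackSketch();
        opts.truststore = "/path/to/truststore.jks"; // illustrative path
        opts.truststorePassword = "trustpass";        // illustrative password
        System.out.println(opts.effectiveKeystorePath());
        System.out.println(opts.effectiveKeystorePassword());
    }
}
```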
[jira] [Commented] (CASSANDRA-13308) Hint files not being deleted on nodetool decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903097#comment-15903097 ] Aleksey Yeschenko commented on CASSANDRA-13308: --- We don't need to. I guess reusing {{completeDispatchBlockingly}} there was chosen as an option to simplify dealing with leftovers, to avoid the race between hints still replaying and dropping the files for the departing node. What we minimally need to do is to cancel blockingly - rather than wait for completion - and then remove the leftovers (excise). > Hint files not being deleted on nodetool decommission > - > > Key: CASSANDRA-13308 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13308 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging > Environment: Using Cassandra version 3.0.9 >Reporter: Arijit > Attachments: 28207.stack, logs, logs_decommissioned_node > > > How to reproduce the issue I'm seeing: > Shut down Cassandra on one node of the cluster and wait until we accumulate a > ton of hints. Start Cassandra on the node and immediately run "nodetool > decommission" on it. > The node streams its replicas and marks itself as DECOMMISSIONED, but other > nodes do not seem to see this message. "nodetool status" shows the > decommissioned node in state "UL" on all other nodes (it is also present in > system.peers), and Cassandra logs show that gossip tasks on nodes are not > proceeding (number of pending tasks keeps increasing). Jstack suggests that a > gossip task is blocked on hints dispatch (I can provide traces if this is not > obvious). Because the cluster is large and there are a lot of hints, this is > taking a while. > On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint > files for the decommissioned node. Documentation seems to suggest that these > hints should be deleted during "nodetool decommission", but it does not seem > to be the case here. This is the bug being reported. 
> To recover from this scenario, if I manually delete hint files on the nodes, > the hints dispatcher threads throw a bunch of exceptions and the > decommissioned node is now in state "DL" (perhaps it missed some gossip > messages?). The node is still in my "system.peers" table > Restarting Cassandra on all nodes after this step does not fix the issue (the > node remains in the peers table). In fact, after this point the > decommissioned node is in state "DN" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12915) SASI: Index intersection with an empty range really inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903086#comment-15903086 ] Corentin Chary commented on CASSANDRA-12915: LGTM. Thanks for cleaning up, this is way better now. > SASI: Index intersection with an empty range really inefficient > --- > > Key: CASSANDRA-12915 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12915 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: Corentin Chary >Assignee: Corentin Chary > Fix For: 3.11.x, 4.x > > > It looks like RangeIntersectionIterator.java can be pretty inefficient in > some cases. Let's take the following query: > SELECT data FROM table WHERE index1 = 'foo' AND index2 = 'bar'; > In this case: > * index1 = 'foo' will match 2 items > * index2 = 'bar' will match ~300k items > On my setup, the query will take ~1 sec, most of the time being spent in > disk.TokenTree.getTokenAt(). > If I patch RangeIntersectionIterator so that it doesn't try to do the > intersection (and effectively only uses 'index1'), the query will run in a few > tens of milliseconds. > I see multiple solutions for that: > * Add a static threshold to avoid the use of the index for the intersection > when we know it will be slow. Probably when the range size factor is very > small and the range size is big. > * CASSANDRA-10765 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-12811) testall failure in org.apache.cassandra.cql3.validation.operations.DeleteTest.testDeleteWithOneClusteringColumns-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-12811: Status: Patch Available (was: Open) The patch is ready. 3.0 is slightly different, but it's just a single line [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12811-3.0#diff-2e17efa5977a71330df6651d3bec0d12R739]. The issue was rather difficult to reproduce, since as I mentioned it looks like a combination of unfortunate events (incidental flush + unmeant range tombstones), but you can stably reproduce it in the mentioned `DeleteTest` by making sure `flush` is happening right before [this delete|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12811-3.0#diff-70d3a7f61389330811d6eb2f7d2d1b76L443]. To prove that this is in fact the same issue (on CI and reproduced one), here's an mem/sstable dump from the CI, you can see there that, even though the run was with {{forceFlush == false}}, the memtable was non-empty. And it happened right before the delete occurred, so it went to the memtable, whereas all the data (including the {{value}} column tombstone) went to the sstable. 
{code}
Memtables:
INFO [main] 2017-03-06 10:46:20,411 partition.partitionKey() = DecoratedKey(-3485513579396041028, )
INFO [main] 2017-03-06 10:46:20,411 row = Marker '@'1488797180382000
INFO [main] 2017-03-06 10:46:20,411 ByteBufferUtil.bytesToHex(buffer) = 0001
INFO [main] 2017-03-06 10:46:20,411 row = Marker '@'1488797180382000
INFO [main] 2017-03-06 10:46:20,411 ByteBufferUtil.bytesToHex(buffer) = 0001
SSTables:
INFO [main] 2017-03-06 10:46:20,411 partition.partitionKey() = DecoratedKey(-4069959284402364209, 0001)
INFO [main] 2017-03-06 10:46:20,412 row = [[value=6 ts=1488797180328000]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) =
INFO [main] 2017-03-06 10:46:20,412 partition.partitionKey() = DecoratedKey(-3485513579396041028, )
INFO [main] 2017-03-06 10:46:20,412 row = [[value=0 ts=1488797180279000]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) =
INFO [main] 2017-03-06 10:46:20,412 row = [[value= ts=1488797180337000 ldt=1488797180]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) = 0001
INFO [main] 2017-03-06 10:46:20,412 row = [[value=2 ts=1488797180309000]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) = 0002
INFO [main] 2017-03-06 10:46:20,412 row = [[value=3 ts=1488797180313000]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) = 0003
INFO [main] 2017-03-06 10:46:20,412 row = [[value=4 ts=148879718032]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) = 0004
INFO [main] 2017-03-06 10:46:20,412 row = [[value=5 ts=1488797180323000]]
INFO [main] 2017-03-06 10:46:20,412 ByteBufferUtil.bytesToHex(buffer) = 0005
{code}
I've split it into several commits: * fixing the test itself (as most likely the brackets in the delete statement were unintentional) * writing tests to cover more cases with 2 sstables, mem/sstable, an sstable and a memtable * fixing the issue with memtable reads by removing `hasNext` from the `searchIterator` as it
can not reliably report whether or not there is next item, since it doesn't know what clustering {{next(Clustering clustering)}} is going to be called with. This is particularly important in cases with all-rangetombstone partitions. * fixing one more issue discovered by the test, in cases when we have two sstables and one of them has only a range tombstone. |[3.0|https://github.com/ifesdjeen/cassandra/tree/12811-3.0]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-3.0-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-3.0-dtest/]| |[3.11|https://github.com/ifesdjeen/cassandra/tree/12811-3.11]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-3.11-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-3.11-dtest/]| |[trunk|https://github.com/ifesdjeen/cassandra/tree/12811-preliminary]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-preliminary-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12811-preliminary-dtest/]| > testall failure in > org.apache.cassandra.cql3.validation.operations.DeleteTest.testDeleteWithOneClusteringColumns-compression > > > Key: CASSANDRA-12811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12811 > Project: Cassandra > Issue Type: Bug >Reporter: Sean McCarthy >Assignee: Alex
[jira] [Comment Edited] (CASSANDRA-12915) SASI: Index intersection with an empty range really inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902793#comment-15902793 ] Alex Petrov edited comment on CASSANDRA-12915 at 3/9/17 1:04 PM: - CI looks pretty broken, but not because of this patch: |[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/] UPD: looks like the branch was a bit outdated, which might have caused the dtest failures, rebased and re-running now. UPD2: unit tests look ok now, dtests are still failing, but unrelated to the patch. +1 from my side, if you +1 as well, I'll get it committed. was (Author: ifesdjeen): CI looks pretty broken, but not because of this patch: |[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/] UPD: looks like the branch was a bit outdated, which might have caused the dtest failures, rebased and re-running now. > SASI: Index intersection with an empty range really inefficient > --- > > Key: CASSANDRA-12915 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12915 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: Corentin Chary >Assignee: Corentin Chary > Fix For: 3.11.x, 4.x > > > It looks like RangeIntersectionIterator.java and be pretty inefficient in > some cases. 
Let's take the following query: > SELECT data FROM table WHERE index1 = 'foo' AND index2 = 'bar'; > In this case: > * index1 = 'foo' will match 2 items > * index2 = 'bar' will match ~300k items > On my setup, the query will take ~1 sec, most of the time being spent in > disk.TokenTree.getTokenAt(). > If I patch RangeIntersectionIterator so that it doesn't try to do the > intersection (and effectively only uses 'index1'), the query will run in a few > tens of milliseconds. > I see multiple solutions for that: > * Add a static threshold to avoid the use of the index for the intersection > when we know it will be slow. Probably when the range size factor is very > small and the range size is big. > * CASSANDRA-10765 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
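The "static threshold" option suggested in the ticket can be sketched as a simple planning check: skip the token-tree intersection when one index's match count is tiny relative to the other, and instead scan the small result set while post-filtering rows. The threshold value below is an assumption to be tuned empirically, and the class is illustrative, not Cassandra code:

```java
// Sketch of a size-ratio threshold for deciding whether a SASI index
// intersection is worthwhile. Purely illustrative.
public class IntersectionPlannerSketch {
    // Hypothetical tuning knob: below this ratio, intersecting the token
    // trees costs more than scanning the small side and post-filtering.
    static final double MIN_SIZE_RATIO = 0.01;

    static boolean shouldIntersect(long smallerRangeSize, long largerRangeSize) {
        if (largerRangeSize == 0)
            return false; // nothing to intersect with
        return (double) smallerRangeSize / largerRangeSize >= MIN_SIZE_RATIO;
    }

    public static void main(String[] args) {
        // The ticket's example: 2 matches vs ~300k matches -> skip intersection
        System.out.println(shouldIntersect(2, 300_000));
        // Comparable sizes -> intersecting is worthwhile
        System.out.println(shouldIntersect(150, 300));
    }
}
```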
[jira] [Updated] (CASSANDRA-13305) Slice.isEmpty() returns false for some empty slices
[ https://issues.apache.org/jira/browse/CASSANDRA-13305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-13305: - Fix Version/s: (was: 3.0.12) 3.0.13 > Slice.isEmpty() returns false for some empty slices > --- > > Key: CASSANDRA-13305 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13305 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 3.0.13, 3.11.0 > > > {{Slice.isEmpty}} is currently defined as {{comparator.compare(end, start) < > 0}}, but this shouldn't be a strict inequality. Indeed, the way > {{Slice.Bound}} is defined, having a start equal to an end implies a range > like {{[1, 1)}}, but that range is definitely empty and something we > shouldn't let in, as it would break merging and other range-tombstone-related > code. > In practice, you can currently insert such an empty range (with something > like {{DELETE FROM t WHERE k = 'foo' AND i >= 1 AND i < 1}}), and that can > trigger assertions in {{RangeTombstoneList}} (and possibly other problems). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13305) Slice.isEmpty() returns false for some empty slices
[ https://issues.apache.org/jira/browse/CASSANDRA-13305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-13305: - Resolution: Fixed Fix Version/s: (was: 3.11.x) (was: 3.0.x) 3.11.0 3.0.12 Status: Resolved (was: Ready to Commit) Committed, thanks. > Slice.isEmpty() returns false for some empty slices > --- > > Key: CASSANDRA-13305 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13305 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 3.0.12, 3.11.0 > > > {{Slice.isEmpty}} is currently defined as {{comparator.compare(end, start) < > 0}}, but this shouldn't be a strict inequality. Indeed, the way > {{Slice.Bound}} is defined, having a start equal to an end implies a range > like {{[1, 1)}}, but that range is definitely empty and something we > shouldn't let in, as it would break merging and other range-tombstone-related > code. > In practice, you can currently insert such an empty range (with something > like {{DELETE FROM t WHERE k = 'foo' AND i >= 1 AND i < 1}}), and that can > trigger assertions in {{RangeTombstoneList}} (and possibly other problems). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
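The off-by-one can be illustrated in miniature. Plain integers stand in for clustering bounds here; the real check compares {{Slice.Bound}} values through a {{ClusteringComparator}}, so this is only a sketch of the inequality change:

```java
// Minimal illustration of the Slice.isEmpty() fix: with a strict "<",
// a degenerate range like [1, 1) is wrongly treated as non-empty;
// with "<=" it is correctly classified as empty.
public class EmptySliceSketch {
    // Pre-fix behavior: strict inequality misses the degenerate case
    static boolean isEmptyStrict(int start, int end) { return end < start; }

    // Post-fix behavior: end == start (a range like [1, 1)) is also empty
    static boolean isEmptyFixed(int start, int end) { return end <= start; }

    public static void main(String[] args) {
        // DELETE ... WHERE i >= 1 AND i < 1 builds the range [1, 1)
        System.out.println(isEmptyStrict(1, 1)); // false: the broken slice slips through
        System.out.println(isEmptyFixed(1, 1));  // true: rejected as empty
    }
}
```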
[2/6] cassandra git commit: Slice.isEmpty() returns false for some empty slices
Slice.isEmpty() returns false for some empty slices patch by Sylvain Lebresne; reviewed by Branimir Lambov for CASSANDRA-13305 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31dec3d5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31dec3d5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31dec3d5 Branch: refs/heads/cassandra-3.11 Commit: 31dec3d548ae2c76d7c8bf4bffa9d506f670f756 Parents: 60d3292 Author: Sylvain LebresneAuthored: Thu Mar 9 11:58:40 2017 +0100 Committer: Sylvain Lebresne Committed: Thu Mar 9 11:58:40 2017 +0100 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/Slice.java | 2 +- .../cql3/validation/operations/DeleteTest.java| 18 ++ 3 files changed, 20 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 0979852..1876922 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.13 + * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305) * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238) Merged from 2.2: * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/src/java/org/apache/cassandra/db/Slice.java -- diff --git a/src/java/org/apache/cassandra/db/Slice.java b/src/java/org/apache/cassandra/db/Slice.java index 7fde45e..3c645dc 100644 --- a/src/java/org/apache/cassandra/db/Slice.java +++ b/src/java/org/apache/cassandra/db/Slice.java @@ -160,7 +160,7 @@ public class Slice public static boolean isEmpty(ClusteringComparator comparator, Slice.Bound start, Slice.Bound end) { assert start.isStart() && end.isEnd(); -return comparator.compare(end, start) < 0; +return comparator.compare(end, start) <= 0; } /** 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java -- diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java index 09098ac..9d7d4a3 100644 --- a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java +++ b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java @@ -1292,6 +1292,24 @@ public class DeleteTest extends CQLTester } /** + * Test for CASSANDRA-13305 + */ +@Test +public void testWithEmptyRange() throws Throwable +{ +createTable("CREATE TABLE %s (k text, a int, b int, PRIMARY KEY (k, a, b))"); + +// Both of the following should be doing nothing, but before #13305 this inserted broken ranges. We do it twice +// and the follow-up delete mainly as a way to show the bug as the combination of this will trigger an assertion +// in RangeTombstoneList pre-#13305 showing that something wrong happened. +execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1); +execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1); + +execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 0, 2); +} + + +/** * Checks if the memtable is empty or not * @return {@code true} if the memtable is empty, {@code false} otherwise. */
[3/6] cassandra git commit: Slice.isEmpty() returns false for some empty slices
Slice.isEmpty() returns false for some empty slices patch by Sylvain Lebresne; reviewed by Branimir Lambov for CASSANDRA-13305 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31dec3d5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31dec3d5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31dec3d5 Branch: refs/heads/trunk Commit: 31dec3d548ae2c76d7c8bf4bffa9d506f670f756 Parents: 60d3292 Author: Sylvain LebresneAuthored: Thu Mar 9 11:58:40 2017 +0100 Committer: Sylvain Lebresne Committed: Thu Mar 9 11:58:40 2017 +0100 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/Slice.java | 2 +- .../cql3/validation/operations/DeleteTest.java| 18 ++ 3 files changed, 20 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 0979852..1876922 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.13 + * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305) * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238) Merged from 2.2: * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053) http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/src/java/org/apache/cassandra/db/Slice.java -- diff --git a/src/java/org/apache/cassandra/db/Slice.java b/src/java/org/apache/cassandra/db/Slice.java index 7fde45e..3c645dc 100644 --- a/src/java/org/apache/cassandra/db/Slice.java +++ b/src/java/org/apache/cassandra/db/Slice.java @@ -160,7 +160,7 @@ public class Slice public static boolean isEmpty(ClusteringComparator comparator, Slice.Bound start, Slice.Bound end) { assert start.isStart() && end.isEnd(); -return comparator.compare(end, start) < 0; +return comparator.compare(end, start) <= 0; } /** http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java -- 
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
index 09098ac..9d7d4a3 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
@@ -1292,6 +1292,24 @@ public class DeleteTest extends CQLTester
     }

     /**
+     * Test for CASSANDRA-13305
+     */
+    @Test
+    public void testWithEmptyRange() throws Throwable
+    {
+        createTable("CREATE TABLE %s (k text, a int, b int, PRIMARY KEY (k, a, b))");
+
+        // Both of the following should be doing nothing, but before #13305 this inserted broken ranges. We do it twice
+        // and the follow-up delete mainly as a way to show the bug as the combination of this will trigger an assertion
+        // in RangeTombstoneList pre-#13305 showing that something wrong happened.
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1);
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1);
+
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 0, 2);
+    }
+
+    /**
      * Checks if the memtable is empty or not
      * @return {@code true} if the memtable is empty, {@code false} otherwise.
      */
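The one-character fix above is easiest to see with a simplified stand-in (a hypothetical class for illustration; plain ints replace Cassandra's ClusteringComparator and Slice.Bound, so bound-kind handling is deliberately omitted). The half-open slice from the new test, `a >= 1 AND a < 1`, selects nothing, but the pre-patch `<` comparison fails to flag it as empty, which is how broken ranges reached RangeTombstoneList:

```java
// Simplified stand-in for the CASSANDRA-13305 fix in Slice.isEmpty().
// The slice modeled is the half-open range "a >= start AND a < end".
public class EmptySliceCheck
{
    // Pre-patch check: comparator.compare(end, start) < 0 -- misses end == start.
    public static boolean isEmptyBefore(int start, int end)
    {
        return end < start;
    }

    // Patched check: comparator.compare(end, start) <= 0.
    public static boolean isEmptyAfter(int start, int end)
    {
        return end <= start;
    }

    public static void main(String[] args)
    {
        // The DELETE in the test, "a >= 1 AND a < 1", selects nothing:
        System.out.println(isEmptyBefore(1, 1)); // false: empty slice missed pre-patch
        System.out.println(isEmptyAfter(1, 1));  // true: detected after the fix
    }
}
```

With the patched check, the degenerate DELETE becomes a no-op instead of producing an invalid range tombstone.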
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

* cassandra-3.11:
  Slice.isEmpty() returns false for some empty slices

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e8e8914
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e8e8914
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e8e8914

Branch: refs/heads/trunk
Commit: 9e8e8914d8a4e55a2d647c4c462e1cb7b622a930
Parents: b5a5fbe dc65a57
Author: Sylvain Lebresne
Authored: Thu Mar 9 11:59:42 2017 +0100
Committer: Sylvain Lebresne
Committed: Thu Mar 9 11:59:42 2017 +0100
--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/db/Slice.java     |  2 +-
 .../cql3/validation/operations/DeleteTest.java  | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e8e8914/CHANGES.txt
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e8e8914/src/java/org/apache/cassandra/db/Slice.java
--
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

* cassandra-3.0:
  Slice.isEmpty() returns false for some empty slices

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dc65a576
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dc65a576
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dc65a576

Branch: refs/heads/trunk
Commit: dc65a576553f8766076920c8b639d80763d6e1f5
Parents: 7707a0e 31dec3d
Author: Sylvain Lebresne
Authored: Thu Mar 9 11:59:32 2017 +0100
Committer: Sylvain Lebresne
Committed: Thu Mar 9 11:59:32 2017 +0100
--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/db/Slice.java     |  2 +-
 .../cql3/validation/operations/DeleteTest.java  | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/CHANGES.txt
--
diff --cc CHANGES.txt
index f73dc12,1876922..2772fc2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,15 -1,11 +1,16 @@@
-3.0.13
+3.11.0
+ * Fix equality comparisons of columns using the duration type (CASSANDRA-13174)
+ * Obfuscate password in stress-graphs (CASSANDRA-12233)
+ * Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
+ * nodetool stopdaemon errors out (CASSANDRA-13030)
+ * Tables in system_distributed should not use gcgs of 0 (CASSANDRA-12954)
+ * Fix primary index calculation for SASI (CASSANDRA-12910)
+ * More fixes to the TokenAllocator (CASSANDRA-12990)
+ * NoReplicationTokenAllocator should work with zero replication factor (CASSANDRA-12983)
+ * Address message coalescing regression (CASSANDRA-12676)
+Merged from 3.0:
+ * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305)
  * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238)
-Merged from 2.2:
- * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053)
-
-
-3.0.12
  * Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294)
  * Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233)
  * Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/src/java/org/apache/cassandra/db/Slice.java
--
diff --cc src/java/org/apache/cassandra/db/Slice.java
index c3da222,3c645dc..4b36677
--- a/src/java/org/apache/cassandra/db/Slice.java
+++ b/src/java/org/apache/cassandra/db/Slice.java
@@@ -157,10 -157,10 +157,10 @@@ public class Slic
      * @return whether the slice formed by {@code start} and {@code end} is
      * empty or not.
      */
-    public static boolean isEmpty(ClusteringComparator comparator, Slice.Bound start, Slice.Bound end)
+    public static boolean isEmpty(ClusteringComparator comparator, ClusteringBound start, ClusteringBound end)
     {
         assert start.isStart() && end.isEnd();
-        return comparator.compare(end, start) < 0;
+        return comparator.compare(end, start) <= 0;
     }

     /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
--
[1/6] cassandra git commit: Slice.isEmpty() returns false for some empty slices
Repository: cassandra

Updated Branches:
  refs/heads/cassandra-3.0 60d3292b0 -> 31dec3d54
  refs/heads/cassandra-3.11 7707a0ed5 -> dc65a5765
  refs/heads/trunk b5a5fbe1f -> 9e8e8914d

Slice.isEmpty() returns false for some empty slices

patch by Sylvain Lebresne; reviewed by Branimir Lambov for CASSANDRA-13305

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/31dec3d5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/31dec3d5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/31dec3d5

Branch: refs/heads/cassandra-3.0
Commit: 31dec3d548ae2c76d7c8bf4bffa9d506f670f756
Parents: 60d3292
Author: Sylvain Lebresne
Authored: Thu Mar 9 11:58:40 2017 +0100
Committer: Sylvain Lebresne
Committed: Thu Mar 9 11:58:40 2017 +0100
--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/db/Slice.java     |  2 +-
 .../cql3/validation/operations/DeleteTest.java  | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0979852..1876922 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.13
+ * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305)
  * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238)
 Merged from 2.2:
  * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/src/java/org/apache/cassandra/db/Slice.java
--
diff --git a/src/java/org/apache/cassandra/db/Slice.java b/src/java/org/apache/cassandra/db/Slice.java
index 7fde45e..3c645dc 100644
--- a/src/java/org/apache/cassandra/db/Slice.java
+++ b/src/java/org/apache/cassandra/db/Slice.java
@@ -160,7 +160,7 @@ public class Slice
     public static boolean isEmpty(ClusteringComparator comparator, Slice.Bound start, Slice.Bound end)
     {
         assert start.isStart() && end.isEnd();
-        return comparator.compare(end, start) < 0;
+        return comparator.compare(end, start) <= 0;
     }

     /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/31dec3d5/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
index 09098ac..9d7d4a3 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
@@ -1292,6 +1292,24 @@ public class DeleteTest extends CQLTester
     }

     /**
+     * Test for CASSANDRA-13305
+     */
+    @Test
+    public void testWithEmptyRange() throws Throwable
+    {
+        createTable("CREATE TABLE %s (k text, a int, b int, PRIMARY KEY (k, a, b))");
+
+        // Both of the following should be doing nothing, but before #13305 this inserted broken ranges. We do it twice
+        // and the follow-up delete mainly as a way to show the bug as the combination of this will trigger an assertion
+        // in RangeTombstoneList pre-#13305 showing that something wrong happened.
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1);
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 1, 1);
+
+        execute("DELETE FROM %s WHERE k = ? AND a >= ? AND a < ?", "a", 0, 2);
+    }
+
+    /**
      * Checks if the memtable is empty or not
      * @return {@code true} if the memtable is empty, {@code false} otherwise.
      */
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

* cassandra-3.0:
  Slice.isEmpty() returns false for some empty slices

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dc65a576
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dc65a576
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dc65a576

Branch: refs/heads/cassandra-3.11
Commit: dc65a576553f8766076920c8b639d80763d6e1f5
Parents: 7707a0e 31dec3d
Author: Sylvain Lebresne
Authored: Thu Mar 9 11:59:32 2017 +0100
Committer: Sylvain Lebresne
Committed: Thu Mar 9 11:59:32 2017 +0100
--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/db/Slice.java     |  2 +-
 .../cql3/validation/operations/DeleteTest.java  | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/CHANGES.txt
--
diff --cc CHANGES.txt
index f73dc12,1876922..2772fc2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,15 -1,11 +1,16 @@@
-3.0.13
+3.11.0
+ * Fix equality comparisons of columns using the duration type (CASSANDRA-13174)
+ * Obfuscate password in stress-graphs (CASSANDRA-12233)
+ * Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
+ * nodetool stopdaemon errors out (CASSANDRA-13030)
+ * Tables in system_distributed should not use gcgs of 0 (CASSANDRA-12954)
+ * Fix primary index calculation for SASI (CASSANDRA-12910)
+ * More fixes to the TokenAllocator (CASSANDRA-12990)
+ * NoReplicationTokenAllocator should work with zero replication factor (CASSANDRA-12983)
+ * Address message coalescing regression (CASSANDRA-12676)
+Merged from 3.0:
+ * Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305)
  * Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238)
-Merged from 2.2:
- * Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053)
-
-
-3.0.12
  * Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294)
  * Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233)
  * Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/src/java/org/apache/cassandra/db/Slice.java
--
diff --cc src/java/org/apache/cassandra/db/Slice.java
index c3da222,3c645dc..4b36677
--- a/src/java/org/apache/cassandra/db/Slice.java
+++ b/src/java/org/apache/cassandra/db/Slice.java
@@@ -157,10 -157,10 +157,10 @@@ public class Slic
      * @return whether the slice formed by {@code start} and {@code end} is
      * empty or not.
      */
-    public static boolean isEmpty(ClusteringComparator comparator, Slice.Bound start, Slice.Bound end)
+    public static boolean isEmpty(ClusteringComparator comparator, ClusteringBound start, ClusteringBound end)
    {
         assert start.isStart() && end.isEnd();
-        return comparator.compare(end, start) < 0;
+        return comparator.compare(end, start) <= 0;
     }

     /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/dc65a576/test/unit/org/apache/cassandra/cql3/validation/operations/DeleteTest.java
--
[jira] [Comment Edited] (CASSANDRA-12915) SASI: Index intersection with an empty range really inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902793#comment-15902793 ]

Alex Petrov edited comment on CASSANDRA-12915 at 3/9/17 9:53 AM:
-

CI looks pretty broken, but not because of this patch:

|[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/]

UPD: looks like the branch was a bit outdated, which might have caused the dtest failures; rebased and re-running now.

was (Author: ifesdjeen):
CI looks pretty broken, but not because of this patch:

|[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/]

> SASI: Index intersection with an empty range really inefficient
> ---
>
> Key: CASSANDRA-12915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12915
> Project: Cassandra
> Issue Type: Improvement
> Components: sasi
> Reporter: Corentin Chary
> Assignee: Corentin Chary
> Fix For: 3.11.x, 4.x
>
> It looks like RangeIntersectionIterator.java can be pretty inefficient in
> some cases. Let's take the following query:
> SELECT data FROM table WHERE index1 = 'foo' AND index2 = 'bar';
> In this case:
> * index1 = 'foo' will match 2 items
> * index2 = 'bar' will match ~300k items
> On my setup, the query will take ~1 sec, most of the time being spent in
> disk.TokenTree.getTokenAt().
> If I patch RangeIntersectionIterator so that it doesn't try to do the
> intersection (and effectively only uses 'index1'), the query will run in a few
> tenths of a millisecond.
> I see multiple solutions for that:
> * Add a static threshold to avoid the use of the index for the intersection
> when we know it will be slow. Probably when the range size factor is very
> small and the range size is big.
> * CASSANDRA-10765

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
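The skew described in the ticket can be illustrated with a toy sketch (a hypothetical class, not SASI's actual RangeIntersectionIterator; sorted `long[]` arrays stand in for TokenTree posting lists). Iterating the two-entry side and binary-searching the large side costs roughly O(small · log(large)) instead of walking all ~300k tokens; the ticket's other proposal, a static threshold, would skip the large index entirely when the size ratio is tiny and post-filter rows instead:

```java
import java.util.Arrays;

// Toy sketch: intersect a tiny posting list with a huge one by probing the
// large sorted array, rather than merge-walking every token of the large side.
public class IntersectionSketch
{
    public static long[] intersect(long[] a, long[] b)
    {
        long[] small = a.length <= b.length ? a : b;
        long[] big   = a.length <= b.length ? b : a;   // must be sorted
        return Arrays.stream(small)
                     .filter(token -> Arrays.binarySearch(big, token) >= 0)
                     .toArray();
    }

    public static void main(String[] args)
    {
        long[] few  = { 10, 500_000 };       // index1 = 'foo': 2 matches
        long[] many = new long[300_000];     // index2 = 'bar': ~300k matches
        for (int i = 0; i < many.length; i++)
            many[i] = 2L * i;                // even tokens 0..599998, sorted
        // Only 2 probes into the 300k-entry list instead of a full walk:
        System.out.println(Arrays.toString(intersect(few, many))); // [10, 500000]
    }
}
```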
[jira] [Comment Edited] (CASSANDRA-12915) SASI: Index intersection with an empty range really inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902793#comment-15902793 ]

Alex Petrov edited comment on CASSANDRA-12915 at 3/9/17 9:51 AM:
-

CI looks pretty broken, but not because of this patch:

|[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/]

was (Author: ifesdjeen):
CI looks pretty broken, but not because of this patch:

|[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/]

> SASI: Index intersection with an empty range really inefficient
> ---
>
> Key: CASSANDRA-12915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12915
> Project: Cassandra
> Issue Type: Improvement
> Components: sasi
> Reporter: Corentin Chary
> Assignee: Corentin Chary
> Fix For: 3.11.x, 4.x

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12915) SASI: Index intersection with an empty range really inefficient
[ https://issues.apache.org/jira/browse/CASSANDRA-12915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902793#comment-15902793 ]

Alex Petrov commented on CASSANDRA-12915:
-

CI looks pretty broken, but not because of this patch:

|[patch|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:12915-alternative]|[utest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12915-alternative-testall/lastCompletedBuild/testReport/]

> SASI: Index intersection with an empty range really inefficient
> ---
>
> Key: CASSANDRA-12915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12915
> Project: Cassandra
> Issue Type: Improvement
> Components: sasi
> Reporter: Corentin Chary
> Assignee: Corentin Chary
> Fix For: 3.11.x, 4.x

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12653) In-flight shadow round requests
[ https://issues.apache.org/jira/browse/CASSANDRA-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902788#comment-15902788 ]

Stefan Podkowinski commented on CASSANDRA-12653:

The value might be effectively reduced to a boolean with the latest version, but it doesn't have to stay that way in the future. But I honestly don't really feel that the level of additional complexity we're talking about here is worth further discussion. At this point it's probably more a matter of personal code style preferences. So are you ok keeping this issue as "ready to commit", [~jkni]?

> In-flight shadow round requests
> ---
>
> Key: CASSANDRA-12653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12653
> Project: Cassandra
> Issue Type: Bug
> Components: Distributed Metadata
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Bootstrapping or replacing a node in the cluster requires gathering and checking
> some host IDs or tokens by doing a gossip "shadow round" once before joining
> the cluster. This is done by sending a gossip SYN to all seeds until we
> receive a response with the cluster state, from where we can move on in the
> bootstrap process. Receiving a response will call the shadow round done and
> calls {{Gossiper.resetEndpointStateMap}} to clean up the received state
> again.
> The issue here is that at this point there might be other in-flight requests,
> and it's very likely that shadow round responses from other seeds will be
> received afterwards, while the current state of the bootstrap process doesn't
> expect this to happen (e.g. gossiper may or may not be enabled).
> One side effect will be that MigrationTasks are spawned for each shadow round
> reply except the first. Tasks might or might not execute based on whether
> {{Gossiper.resetEndpointStateMap}} had been called at execution time, which
> affects the outcome of {{FailureDetector.instance.isAlive(endpoint)}} at the
> start of the task. You'll see error log messages such as the following when this
> happens:
> {noformat}
> INFO  [SharedPool-Worker-1] 2016-09-08 08:36:39,255 Gossiper.java:993 - InetAddress /xx.xx.xx.xx is now UP
> ERROR [MigrationStage:1] 2016-09-08 08:36:39,255 FailureDetector.java:223 - unknown endpoint /xx.xx.xx.xx
> {noformat}
> Although it isn't pretty, I currently don't see any serious harm from this,
> but it would be good to get a second opinion (feel free to close as "won't
> fix").
> /cc [~Stefania] [~thobbs]

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
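The race in the quoted description can be sketched with a toy handler (all names here are hypothetical, not Cassandra's actual Gossiper API): the shadow round sends a SYN to every seed, the first ACK completes the round, and any still-in-flight replies must be dropped rather than processed as if gossip were running:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model: only the first shadow-round ACK is consumed; late in-flight
// replies are ignored instead of spawning follow-up tasks.
public class ShadowRoundSketch
{
    private volatile boolean inShadowRound = false;
    private final Map<String, String> capturedState = new ConcurrentHashMap<>();

    public void startShadowRound()
    {
        inShadowRound = true;
        // ... send a gossip SYN to all seeds here ...
    }

    // Called for every ACK, possibly long after the round has completed.
    public void onAck(String seed, String clusterState)
    {
        if (!inShadowRound)
            return; // late in-flight reply: drop it
        capturedState.put(seed, clusterState);
        inShadowRound = false; // first reply completes the round
    }

    public Map<String, String> state()
    {
        return capturedState;
    }
}
```

With a guard like this, the second and later ACKs never reach the code path that spawned the spurious MigrationTasks described above.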
[jira] [Commented] (CASSANDRA-13153) Reappeared Data when Mixing Incremental and Full Repairs
[ https://issues.apache.org/jira/browse/CASSANDRA-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902783#comment-15902783 ]

Marcus Eriksson commented on CASSANDRA-13153:
-

oops, sorry for the delay, +1

> Reappeared Data when Mixing Incremental and Full Repairs
> 
>
> Key: CASSANDRA-13153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13153
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction, Tools
> Environment: Apache Cassandra 2.2
> Reporter: Amanda Debrot
> Assignee: Stefan Podkowinski
> Labels: Cassandra
> Attachments: log-Reappeared-Data.txt, Step-by-Step-Simulate-Reappeared-Data.txt
>
> This happens for both LeveledCompactionStrategy and SizeTieredCompactionStrategy. I've only tested it on Cassandra version 2.2, but it most likely also affects all Cassandra versions after 2.2, if they have anticompaction with full repair.
> When mixing incremental and full repairs, there are a few scenarios where the Data SSTable is marked as unrepaired and the Tombstone SSTable is marked as repaired. Then if it is past gc_grace, and the tombstone and data have been compacted out on other replicas, the next incremental repair will push the Data to other replicas without the tombstone.
> Simplified scenario:
> 3 node cluster with RF=3
> Initial config:
> Node 1 has data and tombstone in separate SSTables.
> Node 2 has data and no tombstone.
> Node 3 has data and tombstone in separate SSTables.
> Incremental repair (nodetool repair -pr) is run every day, so now we have the tombstone on each node.
> Some minor compactions have happened since, so data and tombstone get merged to 1 SSTable on Nodes 1 and 3.
> Node 1 had a minor compaction that merged data with tombstone. 1 SSTable with tombstone.
> Node 2 has data and tombstone in separate SSTables.
> Node 3 had a minor compaction that merged data with tombstone. 1 SSTable with tombstone.
> Incremental repairs keep running every day.
> Full repairs run weekly (nodetool repair -full -pr).
> Now there are 2 scenarios where the Data SSTable will get marked as "Unrepaired" while the Tombstone SSTable will get marked as "Repaired".
> Scenario 1:
> Since the Data and Tombstone SSTables have been marked as "Repaired" and anticompacted, they have had minor compactions with other SSTables containing keys from other ranges. During full repair, if the last node to run it doesn't own this particular key in its partitioner range, the Data and Tombstone SSTables will get anticompacted and marked as "Unrepaired". Now in the next incremental repair, if the Data SSTable is involved in a minor compaction during the repair but the Tombstone SSTable is not, the resulting compacted SSTable will be marked "Unrepaired" and the Tombstone SSTable is marked "Repaired".
> Scenario 2:
> Only the Data SSTable had minor compactions with other SSTables containing keys from other ranges after being marked as "Repaired". The Tombstone SSTable was never involved in a minor compaction, so all keys in that SSTable belong to 1 particular partitioner range. During full repair, if the last node to run it doesn't own this particular key in its partitioner range, the Data SSTable will get anticompacted and marked as "Unrepaired". The Tombstone SSTable stays marked as "Repaired".
> Then it's past gc_grace. Since Nodes 1 and 3 only have 1 SSTable for that key, the tombstone will get compacted out.
> Node 1 has nothing.
> Node 2 has data (in unrepaired SSTable) and tombstone (in repaired SSTable) in separate SSTables.
> Node 3 has nothing.
> Now when the next incremental repair runs, it will only use the Data SSTable to build the merkle tree, since the tombstone SSTable is flagged as repaired and the data SSTable is marked as unrepaired. And the data will get repaired against the other two nodes.
> Node 1 has data.
> Node 2 has data and tombstone in separate SSTables.
> Node 3 has data.
> If a read request hits Nodes 1 and 3, it will return data. If it hits 1 and 2, or 2 and 3, however, it would return no data.
> Tested this with single range tokens for simplicity.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
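The resurrection mechanics in the quoted scenario can be sketched with a toy model (an assumed simplification, not Cassandra internals): incremental repair considers only unrepaired SSTables, so a tombstone stranded in a repaired SSTable cannot shadow the unrepaired data it covers, and that data gets streamed back to replicas that already purged it:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of CASSANDRA-13153: each node is a list of SSTables, and
// incremental repair sees only the unrepaired ones.
public class ResurrectionSketch
{
    record SSTable(Set<String> liveKeys, Set<String> tombstones, boolean repaired) {}

    // Keys a node would contribute to the incremental-repair merkle tree.
    static Set<String> unrepairedLiveKeys(List<SSTable> node)
    {
        Set<String> live = new HashSet<>();
        for (SSTable t : node)
            if (!t.repaired())
                live.addAll(t.liveKeys());
        // Only tombstones in *unrepaired* SSTables can shadow these keys here;
        // the repaired tombstone is invisible to incremental repair.
        for (SSTable t : node)
            if (!t.repaired())
                live.removeAll(t.tombstones());
        return live;
    }

    public static void main(String[] args)
    {
        // Past gc_grace: nodes 1 and 3 compacted data + tombstone away entirely.
        List<SSTable> node1 = List.of();
        // Node 2: data unrepaired, tombstone repaired (the bad state).
        List<SSTable> node2 = List.of(
            new SSTable(Set.of("k"), Set.of(), false),   // data SSTable
            new SSTable(Set.of(), Set.of("k"), true));   // tombstone SSTable
        System.out.println(unrepairedLiveKeys(node1)); // []
        System.out.println(unrepairedLiveKeys(node2)); // [k] -> "k" streamed back
    }
}
```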
[jira] [Updated] (CASSANDRA-13153) Reappeared Data when Mixing Incremental and Full Repairs
[ https://issues.apache.org/jira/browse/CASSANDRA-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-13153:
-
Status: Ready to Commit (was: Patch Available)

> Reappeared Data when Mixing Incremental and Full Repairs
> 
>
> Key: CASSANDRA-13153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13153
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction, Tools
> Environment: Apache Cassandra 2.2
> Reporter: Amanda Debrot
> Assignee: Stefan Podkowinski
> Labels: Cassandra
> Attachments: log-Reappeared-Data.txt, Step-by-Step-Simulate-Reappeared-Data.txt

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13153) Reappeared Data when Mixing Incremental and Full Repairs
[ https://issues.apache.org/jira/browse/CASSANDRA-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902779#comment-15902779 ] Stefan Podkowinski commented on CASSANDRA-13153: [~krummas], any feedback on the latest, simplified patch version? > Reappeared Data when Mixing Incremental and Full Repairs > > > Key: CASSANDRA-13153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13153 > Project: Cassandra > Issue Type: Bug > Components: Compaction, Tools > Environment: Apache Cassandra 2.2 >Reporter: Amanda Debrot >Assignee: Stefan Podkowinski > Labels: Cassandra > Attachments: log-Reappeared-Data.txt, > Step-by-Step-Simulate-Reappeared-Data.txt > > > This happens for both LeveledCompactionStrategy and > SizeTieredCompactionStrategy. I've only tested it on Cassandra version 2.2 > but it most likely also affects all Cassandra versions after 2.2, if they > have anticompaction with full repair. > When mixing incremental and full repairs, there are a few scenarios where the > Data SSTable is marked as unrepaired and the Tombstone SSTable is marked as > repaired. Then if it is past gc_grace, and the tombstone and data has been > compacted out on other replicas, the next incremental repair will push the > Data to other replicas without the tombstone. > Simplified scenario: > 3 node cluster with RF=3 > Intial config: > Node 1 has data and tombstone in separate SSTables. > Node 2 has data and no tombstone. > Node 3 has data and tombstone in separate SSTables. > Incremental repair (nodetool repair -pr) is run every day so now we have > tombstone on each node. > Some minor compactions have happened since so data and tombstone get merged > to 1 SSTable on Nodes 1 and 3. > Node 1 had a minor compaction that merged data with tombstone. 1 > SSTable with tombstone. > Node 2 has data and tombstone in separate SSTables. > Node 3 had a minor compaction that merged data with tombstone. 1 > SSTable with tombstone. 
> Incremental repairs keep running every day. > Full repairs run weekly (nodetool repair -full -pr). > Now there are 2 scenarios where the Data SSTable will get marked as > "Unrepaired" while Tombstone SSTable will get marked as "Repaired". > Scenario 1: > Since the Data and Tombstone SSTable have been marked as "Repaired" > and anticompacted, they have had minor compactions with other SSTables > containing keys from other ranges. During full repair, if the last node to > run it doesn't own this particular key in it's partitioner range, the Data > and Tombstone SSTable will get anticompacted and marked as "Unrepaired". Now > in the next incremental repair, if the Data SSTable is involved in a minor > compaction during the repair but the Tombstone SSTable is not, the resulting > compacted SSTable will be marked "Unrepaired" and Tombstone SSTable is marked > "Repaired". > Scenario 2: > Only the Data SSTable had minor compaction with other SSTables > containing keys from other ranges after being marked as "Repaired". The > Tombstone SSTable was never involved in a minor compaction so therefore all > keys in that SSTable belong to 1 particular partitioner range. During full > repair, if the last node to run it doesn't own this particular key in it's > partitioner range, the Data SSTable will get anticompacted and marked as > "Unrepaired". The Tombstone SSTable stays marked as Repaired. > Then it’s past gc_grace. Since Node’s #1 and #3 only have 1 SSTable for that > key, the tombstone will get compacted out. > Node 1 has nothing. > Node 2 has data (in unrepaired SSTable) and tombstone (in repaired > SSTable) in separate SSTables. > Node 3 has nothing. > Now when the next incremental repair runs, it will only use the Data SSTable > to build the merkle tree since the tombstone SSTable is flagged as repaired > and data SSTable is marked as unrepaired. And the data will get repaired > against the other two nodes. > Node 1 has data. 
> Node 2 has data and tombstone in separate SSTables. > Node 3 has data. > If a read request hits Nodes 1 and 3, it will return data. If it hits Nodes 1 and > 2, or 2 and 3, however, it will return no data. > Tested this with single-range tokens for simplicity. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
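The resurrection mechanism described in the scenario above can be sketched with a toy model (this is illustrative only, not Cassandra's actual repair code): incremental repair builds its Merkle tree only from SSTables flagged unrepaired, so the data SSTable gets streamed to peers while the tombstone, sitting in a repaired SSTable, is invisible to the repair.

```java
import java.util.List;

// Toy model of the bug (hypothetical names, not Cassandra internals):
// an SSTable holds one key, optionally as a tombstone, plus a repaired flag.
public class RepairSketch {
    record SSTable(String key, boolean tombstone, boolean repaired) {}

    // Incremental repair only considers unrepaired SSTables when building
    // the Merkle tree / deciding what to stream.
    static List<SSTable> incrementalRepairView(List<SSTable> node) {
        return node.stream().filter(s -> !s.repaired()).toList();
    }

    // A read sees the key only if some SSTable has live data for it
    // and no SSTable shadows it with a tombstone.
    static boolean keyVisible(List<SSTable> node, String key) {
        boolean hasData = node.stream().anyMatch(s -> s.key().equals(key) && !s.tombstone());
        boolean hasTombstone = node.stream().anyMatch(s -> s.key().equals(key) && s.tombstone());
        return hasData && !hasTombstone;
    }

    public static void main(String[] args) {
        // Node 2 after the scenario: data marked unrepaired, tombstone marked repaired.
        List<SSTable> node2 = List.of(
                new SSTable("k", false, false),  // data, unrepaired
                new SSTable("k", true, true));   // tombstone, repaired

        // The next incremental repair "sees" only the data SSTable...
        List<SSTable> streamed = incrementalRepairView(node2);
        System.out.println("streamed sstables: " + streamed.size()
                + ", tombstone included: " + streamed.stream().anyMatch(SSTable::tombstone));

        // ...so Nodes 1 and 3 (which had compacted everything away) receive
        // bare data and the deleted key reappears there.
        System.out.println("key visible on node 1 after repair: " + keyVisible(streamed, "k"));
        // On Node 2 itself the tombstone still shadows the data.
        System.out.println("key visible on node 2: " + keyVisible(node2, "k"));
    }
}
```

Under this model a read hitting Nodes 1 and 3 returns the resurrected data, matching the outcome stated above.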
[jira] [Updated] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13259: --- Status: Awaiting Feedback (was: In Progress) > Use platform specific X.509 default algorithm > - > > Key: CASSANDRA-13259 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13259 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 4.x > > > We should replace the hardcoded "SunX509" default algorithm and use the JRE > default instead. This implementation will currently not work on less popular > platforms (e.g. IBM) and won't get any further updates. > See also: > https://bugs.openjdk.java.net/browse/JDK-8169745 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
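The change proposed in the ticket amounts to asking the JRE for its default X.509 algorithm instead of hardcoding "SunX509". A minimal sketch of the idea (not the actual Cassandra patch):

```java
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.TrustManagerFactory;

// Sketch: query the platform's default algorithm rather than hardcoding
// "SunX509". On Oracle/OpenJDK the defaults are typically "SunX509" or
// "PKIX"; other vendors' JREs (e.g. IBM) register different names, which
// is why a hardcoded "SunX509" fails there.
public class DefaultAlgorithm {
    public static void main(String[] args) throws Exception {
        String kmfAlgo = KeyManagerFactory.getDefaultAlgorithm();
        String tmfAlgo = TrustManagerFactory.getDefaultAlgorithm();
        System.out.println("KeyManagerFactory default: " + kmfAlgo);
        System.out.println("TrustManagerFactory default: " + tmfAlgo);

        // The factories can then be created portably on any JRE:
        KeyManagerFactory kmf = KeyManagerFactory.getInstance(kmfAlgo);
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(tmfAlgo);
        System.out.println("created factories for " + kmf.getAlgorithm()
                + " / " + tmf.getAlgorithm());
    }
}
```

The defaults come from the `ssl.KeyManagerFactory.algorithm` and `ssl.TrustManagerFactory.algorithm` security properties in `java.security`, so the platform vendor (or the operator) controls them, as the JDK-8169745 link above discusses.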
[jira] [Updated] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13259: --- Reviewer: Jason Brown
[jira] [Commented] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902763#comment-15902763 ] Stefan Podkowinski commented on CASSANDRA-13259: I'd prefer to keep discussing my latest proposal within this ticket, as I tend to get confused having too many highly related tickets; I've also just created CASSANDRA-13314 to discuss how we could go even further after this ticket has been solved.
[jira] [Updated] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13259: --- Status: In Progress (was: Ready to Commit)
[jira] [Created] (CASSANDRA-13314) Config file based SSL settings
Stefan Podkowinski created CASSANDRA-13314: -- Summary: Config file based SSL settings Key: CASSANDRA-13314 URL: https://issues.apache.org/jira/browse/CASSANDRA-13314 Project: Cassandra Issue Type: Improvement Components: Configuration, Tools Reporter: Stefan Podkowinski Assignee: Stefan Podkowinski Priority: Minor Fix For: 4.x As a follow-up of CASSANDRA-13259, I'd like to continue discussing how we can make SSL less awkward to use and further move SSL related code out of our code base. Currently we construct our own SSLContext in SSLFactory based on EncryptionOptions passed by the MessagingService or any individual tool where we need to offer SSL support. This leads to a situation where the user not only has to learn how to enable the correct settings in cassandra.yaml, but these settings must also be reflected in each tool's own command line options. As argued in CASSANDRA-13259, these settings could just as well be done by setting the appropriate system and security properties ([overview|http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#InstallationAndCustomization]); we should just point the user to the right files to do that (jvm.options and java.security) and make sure that the daemon and all affected tools source them. After giving this a quick try on my WIP branch, I've noticed the following issues in doing so: * Keystore passwords will show up in the process list (-Djavax.net.ssl.keyStorePassword=..). We should keep the password setting in cassandra.yaml and the CLIs and do a System.setProperty() if it has been provided. * It's only possible to configure settings for a single default key-/truststore. Since we currently allow configuring ServerEncryptionOptions and ClientEncryptionOptions with different settings, we'd have to make this a breaking change. I don't really see why you would want to use different stores for node-to-node and node-to-client connections, but that wouldn't be possible anymore. 
* This would probably only make sense if we really remove the affected CLI options, or we'll end up with just another way to configure the same settings. This will break existing scripts and make existing documentation obsolete. Any opinions? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
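The first bullet above (keeping the password out of the process list) can be sketched as hypothetical startup code, not the actual patch: the password stays in cassandra.yaml and is published to the standard JSSE system properties programmatically, so it never appears as a `-D` flag visible in `ps`.

```java
// Hypothetical bootstrap sketch (names are illustrative, not from the patch):
// read the keystore path/password from cassandra.yaml, then hand them to JSSE
// via System.setProperty() so the password never shows up in the process list
// the way -Djavax.net.ssl.keyStorePassword=... would.
public class SslBootstrap {
    static void applyKeystoreSettings(String keystorePath, String passwordFromYaml) {
        // The path is harmless on the command line or in jvm.options...
        System.setProperty("javax.net.ssl.keyStore", keystorePath);
        // ...but the password is injected programmatically instead.
        if (passwordFromYaml != null) {
            System.setProperty("javax.net.ssl.keyStorePassword", passwordFromYaml);
        }
        // From here on, SSLContext.getDefault() in the daemon and in any tool
        // sourcing the same configuration picks up one shared keystore setup.
    }

    public static void main(String[] args) {
        applyKeystoreSettings("conf/.keystore", "secret-from-yaml");
        System.out.println("keyStore = " + System.getProperty("javax.net.ssl.keyStore"));
        System.out.println("password set: "
                + (System.getProperty("javax.net.ssl.keyStorePassword") != null));
    }
}
```

Note this also illustrates the second bullet's limitation: `javax.net.ssl.keyStore` is a single JVM-wide default, so separate node-to-node and node-to-client stores could no longer be expressed this way.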
[jira] [Assigned] (CASSANDRA-13275) Cassandra throws an exception during CQL select query filtering on map key
[ https://issues.apache.org/jira/browse/CASSANDRA-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-13275: -- Assignee: Benjamin Lerer > Cassandra throws an exception during CQL select query filtering on map key > --- > > Key: CASSANDRA-13275 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13275 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Abderrahmane CHRAIBI >Assignee: Benjamin Lerer > > Env: cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4 > Using this table structure: > {code}CREATE TABLE mytable ( > mymap frozen