[jira] [Commented] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
[ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095785#comment-16095785 ] Stefania commented on CASSANDRA-11223: -- The problem does not affect 2.2 because NamesQueryFilter.getLiveCount() is unchanged, and in any case GroupByPrefix will count 1 live row if toGroup is zero, which would be the case for the GroupByPrefix used by NamesQueryFilter.columnCounter(). Queries on this type of table will never use SliceQueryFilter. For 3.0+, it's a bit of a pain to revert, so I have created the follow-up [patch|https://github.com/apache/cassandra/compare/trunk...stef1927:11223-3.0]. CI is currently running. The patch ensures that static rows are always counted for static compact tables (which is sufficient to fix the problem) but also assumes that a NamesQueryFilter is selecting the entire partition if it has no clustering values. This second part is not strictly necessary, but I think it's more correct because in this case NamesQueryFilter is not restricting anything, and existing usages of selectsAllPartition were limited to checking whether the filter restricts anything, in order to create CQL string representations of a query. Benjamin should be back in one week, but I can find a reviewer sooner if required. > Queries with LIMIT filtering on clustering columns can return less rows than > expected > - > > Key: CASSANDRA-11223 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11223 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0 > > > A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can > return fewer rows than expected if the table has some static columns and some > of the partitions have no rows matching b = 1. 
> The problem can be reproduced with the following unit test: > {code} > public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() > throws Throwable > { > createTable("CREATE TABLE %s (a int, b int, s int static, c int, > primary key (a, b))"); > for (int i = 0; i < 3; i++) > { > execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i); > for (int j = 0; j < 3; j++) > if (!(i == 0 && j == 1)) > execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", > i, j, i + j); > } > assertRows(execute("SELECT * FROM %s"), > row(1, 0, 1, 1), > row(1, 1, 1, 2), > row(1, 2, 1, 3), > row(0, 0, 0, 0), > row(0, 2, 0, 2), > row(2, 0, 2, 2), > row(2, 1, 2, 3), > row(2, 2, 2, 4)); > assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"), > row(1, 1, 1, 2), > row(2, 1, 2, 3)); > assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW > FILTERING"), > row(1, 1, 1, 2), > row(2, 1, 2, 3)); // < FAIL It returns only one > row because the static row of partition 0 is counted and filtered out in > SELECT statement > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
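The failing assertion above can be sketched outside of Cassandra. The toy model below (plain Java; class, method, and row names are illustrative, not Cassandra internals) shows the mechanism: a LIMIT counter that counts the static row of a partition with no matching clustering rows stops scanning early, and post-count filtering then drops that row, so the client sees fewer rows than LIMIT:

```java
import java.util.*;

// Toy model of the CASSANDRA-11223 counting bug. Illustrative only: this is
// not Cassandra's read path, just the shape of the miscount.
public class LimitCountingSketch {
    // matching: partition key -> rows matching the filter (b = 1).
    // staticOnly: partitions whose static row is visible to the counter
    // even though no clustering row matches.
    static List<String> queryWithLimit(Map<Integer, List<String>> matching,
                                       Set<Integer> staticOnly, int limit) {
        List<String> counted = new ArrayList<>();
        for (int pk : new TreeSet<>(matching.keySet())) {
            if (counted.size() >= limit)
                break;                              // LIMIT reached, stop scanning
            List<String> rows = matching.get(pk);
            if (rows.isEmpty() && staticOnly.contains(pk)) {
                counted.add("static:" + pk);        // bug: static row counted toward LIMIT
            } else {
                for (String r : rows) {
                    if (counted.size() >= limit)
                        break;
                    counted.add(r);
                }
            }
        }
        // Post-count filtering drops static rows that never matched b = 1,
        // leaving fewer rows than LIMIT asked for.
        counted.removeIf(r -> r.startsWith("static:"));
        return counted;
    }
}
```

With partition 0 holding only a static row and partitions 1 and 2 each holding one matching row, a LIMIT of 2 yields a single row: the counter spends one of its two slots on the static row and never reaches partition 2.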
[jira] [Updated] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
[ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-11223: - Status: Patch Available (was: Reopened)
[jira] [Updated] (CASSANDRA-13715) Allow TRACE logging on upgrade dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-13715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13715: --- Reviewer: Michael Kjellman > Allow TRACE logging on upgrade dtests > - > > Key: CASSANDRA-13715 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13715 > Project: Cassandra > Issue Type: Improvement > Components: Testing >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Trivial >
[jira] [Commented] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095672#comment-16095672 ] Kurt Greaves commented on CASSANDRA-13142: -- TBH, leaving out normal compactions is unfortunate (it was the original reason I created this ticket). As an alternative, what if we make the fix for view/index builds part of the default behaviour, and add a new option to upgradesstables with the new behaviour, i.e. it won't wait for compactions to complete before finishing? The new option could even let you filter what to interrupt. > Upgradesstables cancels compactions unnecessarily > - > > Key: CASSANDRA-13142 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13142 > Project: Cassandra > Issue Type: Bug >Reporter: Kurt Greaves >Assignee: Kurt Greaves > Attachments: 13142-v1.patch > > > Since at least 1.2, upgradesstables will cancel any compactions bar > validations when run. This was originally determined to be a non-issue in > CASSANDRA-3430, but it can be quite annoying (especially with STCS), as a > compaction will output the new version anyway. Furthermore, as per > CASSANDRA-12243, it also stops things like view builds and, I assume, secondary > index builds as well, which is not ideal. > We should avoid cancelling compactions unnecessarily.
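The suggestion above (cancel only some operation types, with an option to filter what to interrupt) can be sketched as a predicate over running operations. The enum values and method below are illustrative and are not Cassandra's actual CompactionManager API:

```java
import java.util.*;
import java.util.function.Predicate;

// Sketch of the proposed behaviour change: express the interruption policy
// as a predicate so a nodetool option could later customize it.
// Illustrative names only, not Cassandra internals.
public class InterruptionFilterSketch {
    enum OperationType { COMPACTION, VIEW_BUILD, INDEX_BUILD, VALIDATION }

    // Today's behaviour: cancel everything bar validations.
    static final Predicate<OperationType> CURRENT =
        op -> op != OperationType.VALIDATION;

    // Proposed default: additionally leave view and index builds alone.
    static final Predicate<OperationType> PROPOSED =
        op -> op != OperationType.VALIDATION
           && op != OperationType.VIEW_BUILD
           && op != OperationType.INDEX_BUILD;

    // Returns the operations that would be interrupted under the given policy.
    static List<OperationType> interrupt(List<OperationType> running,
                                         Predicate<OperationType> shouldCancel) {
        List<OperationType> cancelled = new ArrayList<>();
        for (OperationType op : running)
            if (shouldCancel.test(op))
                cancelled.add(op);
        return cancelled;
    }
}
```

Under the proposed policy a running view build survives an upgradesstables run, while a normal compaction is still interrupted by default and could be exempted via the hypothetical filter option.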
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095662#comment-16095662 ] Kurt Greaves commented on CASSANDRA-13701: -- Obviously the bug Jeff mentioned, but we also probably need to do some work on making configuration a bit more straightforward, as just enforcing the new algorithm will make startup of a new cluster slightly more complicated. At the moment you can't start a node if the keyspace you specify doesn't exist. This is a chicken-and-egg problem for seed nodes, which I believe is currently solved by seed nodes using random allocation, at which point you can create the keyspace and add it to the yaml for any new nodes. Honestly, this is probably a little convoluted for new users. DSE has evidently realised this, as they changed their yaml property to specify the RF rather than the keyspace, which means you can set it before creating the keyspace. That kind of works, but I'm not sure it's the best choice. IMO we'll need to come up with some way to configure and start a multi-node cluster in a straightforward manner. I'd say it's reasonable that all nodes should be able to share the same underlying configuration, i.e., the minimum set of yaml properties is the same for all nodes. This would make config management for clusters much simpler than having to special-case seed nodes. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. It's > come up a lot, and it's now standard practice in the community to reduce > num_tokens. 
> We should just lower the defaults.
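The motivation for pairing a lower num_tokens with the allocation-aware algorithm can be seen with a small, self-contained simulation. This is illustrative only: it models tokens as random points on a unit ring rather than Cassandra's Murmur3 token space, but it shows how ownership imbalance grows as the number of randomly placed tokens per node shrinks:

```java
import java.util.*;

// Simulates random vnode token placement and reports the largest node's
// share of the ring relative to a perfectly even split (1.0 == perfect).
// Toy model, not Cassandra's token allocator.
public class VnodeOwnershipSketch {
    static double maxOwnership(int nodes, int tokensPerNode, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Double, Integer> ring = new TreeMap<>();   // token position -> node
        for (int n = 0; n < nodes; n++)
            for (int t = 0; t < tokensPerNode; t++)
                ring.put(rnd.nextDouble(), n);

        double[] owned = new double[nodes];
        double prev = ring.lastKey() - 1.0;                // wrap-around segment
        for (Map.Entry<Double, Integer> e : ring.entrySet()) {
            owned[e.getValue()] += e.getKey() - prev;      // node owns (prev, token]
            prev = e.getKey();
        }
        // Normalize so 1.0 means every node owns exactly 1/nodes of the ring.
        return Arrays.stream(owned).max().getAsDouble() * nodes;
    }
}
```

With a single random token per node the busiest node typically owns several times its fair share; with 256 random tokens per node the imbalance shrinks toward a few percent, which is why random allocation historically needed a high num_tokens and why a smarter allocator is a precondition for lowering the default.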
[jira] [Comment Edited] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095638#comment-16095638 ] Jeremy Hanna edited comment on CASSANDRA-13701 at 7/21/17 1:35 AM: --- Adding datacenters with the new algorithm requires some additional configuration. We would need to make users aware of that trade-off when using that algorithm, and of the benefits of fewer token ranges per node. It's talked about [here|http://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html], but we should make it clearer in the apache docs as well. We can point to those in the comments around vnode tokens. So it would be nice to add some more information [here|http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#allocate-tokens-for-keyspace] and then perhaps [here|http://cassandra.apache.org/doc/latest/operating/topo_changes.html] with an additional section about adding a datacenter. And Jeff: good point about the token allocation - it would be good to track that down before making the new algorithm the default. Even with the old algorithm, though, I think we could at the very least halve the default number of vnode ranges.
[jira] [Comment Edited] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095638#comment-16095638 ] Jeremy Hanna edited comment on CASSANDRA-13701 at 7/21/17 1:31 AM.
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095638#comment-16095638 ] Jeremy Hanna commented on CASSANDRA-13701.
[jira] [Updated] (CASSANDRA-13715) Allow TRACE logging on upgrade dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-13715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13715: Summary: Allow TRACE logging on upgrade dtests (was: Allow TRACE logging on upgrade tests)
[jira] [Commented] (CASSANDRA-13715) Allow TRACE logging on upgrade tests
[ https://issues.apache.org/jira/browse/CASSANDRA-13715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095623#comment-16095623 ] Jason Brown commented on CASSANDRA-13715: - trivial patch [here|https://github.com/jasobrown/cassandra-dtest/tree/13715]
[jira] [Created] (CASSANDRA-13715) Allow TRACE logging on upgrade tests
Jason Brown created CASSANDRA-13715: --- Summary: Allow TRACE logging on upgrade tests Key: CASSANDRA-13715 URL: https://issues.apache.org/jira/browse/CASSANDRA-13715 Project: Cassandra Issue Type: Improvement Components: Testing Reporter: Jason Brown Assignee: Jason Brown Priority: Trivial
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095529#comment-16095529 ] Jeff Jirsa commented on CASSANDRA-13701: Right now the new algorithm double assigns tokens, so that's at least one reason.
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095509#comment-16095509 ] Nate McCall commented on CASSANDRA-13701: - bq. I don't think we should be reducing the num_tokens default unless we also enforce the new allocation algorithm by default Excellent point [~KurtG]. Is there a reason why we would not want to do this?
[jira] [Commented] (CASSANDRA-11825) NPE in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095417#comment-16095417 ] Jason Brown commented on CASSANDRA-11825: - [~jkni] and I discussed this in relation to CASSANDRA-13700, and, once again, this seems like a race between two threads updating gossip state. I think the safest thing is to execute Gossiper's {{#start(int)}} and {{#stop()}} methods on the gossip stage. If need be, {{StorageService#stopGossiping()}} and {{#startGossiping()}} can block on a future from the stage executor until the task completes. > NPE in gossip > - > > Key: CASSANDRA-11825 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11825 > Project: Cassandra > Issue Type: Bug >Reporter: T Jake Luciani >Assignee: Joel Knighton > Labels: fallout > Fix For: 3.0.x > > > We have a test that causes an NPE in gossip code: > It's basically calling nodetool enable/disable gossip > From the debug log > {quote} > WARN [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,423 > StorageService.java:395 - Starting gossip by operator request > DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 > StorageService.java:1996 - Node /172.31.24.76 state NORMAL, token > [-9223372036854775808] > INFO [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 > StorageService.java:1999 - Node /172.31.24.76 state jump to NORMAL > DEBUG [RMI TCP Connection(17)-54.153.70.214] 2016-05-17 18:58:44,424 > YamlConfigurationLoader.java:102 - Loading settings from > file:/mnt/ephemeral/automaton/cassandra-src/conf/cassandra.yaml > DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:44,425 > PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces > in 0ms > DEBUG [GossipStage:1] 2016-05-17 18:58:45,346 FailureDetector.java:456 - > Ignoring interval time of 75869093776 for /172.31.31.1 > DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 FailureDetector.java:456 - > Ignoring interval time of 75869214424 for 
/172.31.17.32 > INFO [GossipStage:1] 2016-05-17 18:58:45,347 Gossiper.java:1028 - Node > /172.31.31.1 has restarted, now UP > DEBUG [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1996 - Node > /172.31.31.1 state NORMAL, token [-3074457345618258603] > INFO [GossipStage:1] 2016-05-17 18:58:45,347 StorageService.java:1999 - Node > /172.31.31.1 state jump to NORMAL > INFO [HANDSHAKE-/172.31.31.1] 2016-05-17 18:58:45,348 > OutboundTcpConnection.java:514 - Handshaking version with /172.31.31.1 > ERROR [GossipStage:1] 2016-05-17 18:58:45,354 CassandraDaemon.java:195 - > Exception in thread Thread[GossipStage:1,5,main] > java.lang.NullPointerException: null > at org.apache.cassandra.gms.Gossiper.getHostId(Gossiper.java:846) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2008) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.onChange(StorageService.java:1729) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2446) > ~[main/:na] > at > org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1050) > ~[main/:na] > at > org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1133) > ~[main/:na] > at > org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) > ~[main/:na] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_40] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_40] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_40] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_40] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40] > INFO [GossipStage:1] 2016-05-17 18:58:45,355 Gossiper.java:1028 - Node > /172.31.17.32 has 
restarted, now UP > DEBUG [GossipStage:1] 2016-05-17 18:58:45,355 StorageService.java:1996 - Node > /172.31.17.32 state NORMAL, token [3074457345618258602] > INFO [GossipStage:1] 2016-05-17 18:58:45,356 StorageService.java:1999 - Node > /172.31.17.32 state jump to NORMAL > INFO [HANDSHAKE-/172.31.17.32] 2016-05-17 18:58:45,356 > OutboundTcpConnection.java:514 - Handshaking version with /172.31.17.32 > DEBUG [PendingRangeCalculator:1] 2016-05-17 18:58:45,357 > PendingRangeCalculatorService.java:66 - finished calculation for 5 keyspaces > in 0ms > DEBUG [GossipStage:1] 2016-05-17 18:58:45,357 MigrationManager.java:94 - Not > pulling schema because versions match or shouldPullSchemaFrom returned
[jira] [Created] (CASSANDRA-13714) response to EchoMessage is sent on wrong connection
Jason Brown created CASSANDRA-13714: --- Summary: response to EchoMessage is sent on wrong connection Key: CASSANDRA-13714 URL: https://issues.apache.org/jira/browse/CASSANDRA-13714 Project: Cassandra Issue Type: Bug Components: Distributed Metadata Reporter: Jason Brown Priority: Trivial Followup to CASSANDRA-13713. To force the {{EchoResponse}} response onto the correct stage, we should create a new message type, {{EchoResponseMessage}}, and map it appropriately in {{MessagingService.verbStages}}. Mapping the response message correctly will allow the response to be sent on the gossip connection, and then allow us to process it immediately on the gossip stage, rather than on the request_response stage. One serious problem to consider is the upgrade scenario, where the non-upgraded node expects a simple RequestResponse message that maps to a callback. If the upgraded node tries to send the new {{EchoResponseMessage}}, it will be ignored by the old node, and thus we get into a weird state where gossip can't communicate directly, even though the actual TCP connection and wrapper channel are set up correctly. (I haven't thought about all the oddball fallout that can occur as a rolling upgrade rolls out.) Thus, given that complexity versus the triviality/near-zero impact of the bug (sending the response on the wrong channel is not a big deal), I feel this ticket is largely not worth bothering with. That said, I at least want to capture the problem for posterity.
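The verb-to-stage mapping change described above can be sketched as follows. The enums and map mimic the role of {{MessagingService.verbStages}} but are simplified, illustrative names, not Cassandra's actual types:

```java
import java.util.*;

// Sketch of the proposal: give the echo response its own verb so the
// verb-to-stage mapping routes it to the gossip stage instead of the
// generic request/response stage. Illustrative names only.
public class VerbStageSketch {
    enum Verb { ECHO, REQUEST_RESPONSE, ECHO_RESPONSE }
    enum Stage { GOSSIP, REQUEST_RESPONSE }

    static final EnumMap<Verb, Stage> VERB_STAGES = new EnumMap<>(Verb.class);
    static {
        VERB_STAGES.put(Verb.ECHO, Stage.GOSSIP);
        // Today: the ACK is a generic request/response, so it is consumed
        // on a request/response thread.
        VERB_STAGES.put(Verb.REQUEST_RESPONSE, Stage.REQUEST_RESPONSE);
        // Proposed: a dedicated response verb mapped to the gossip stage.
        VERB_STAGES.put(Verb.ECHO_RESPONSE, Stage.GOSSIP);
    }

    static Stage stageFor(Verb v) { return VERB_STAGES.get(v); }
}
```

The upgrade hazard the ticket describes is visible here too: a node that only knows the first two verbs has no entry for the new one, so an {{ECHO_RESPONSE}} from an upgraded peer would simply be dropped.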
[jira] [Commented] (CASSANDRA-13713) Move processing of EchoMessage response to gossip stage
[ https://issues.apache.org/jira/browse/CASSANDRA-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095402#comment-16095402 ] Jason Brown commented on CASSANDRA-13713: - A simple fix is available here: ||trunk|| |[branch|https://github.com/jasobrown/cassandra/tree/13713-trunk]| |[utests|https://circleci.com/gh/jasobrown/cassandra/tree/13713-trunk/]| This incorrect behavior has been around for a long time. However, I'm not sure how far back to apply the change (the change will be the same all the way back to 2.1). The existing execution of {{Gossiper#realMarkAlive}} is already asynchronous (nothing is dependent on it, per se); the only downside is another (short-lived, once per peer) task to be executed on the gossip stage. I feel that's a tiny price to pay for reducing the number of different threads that can modify the state of gossip. > Move processing of EchoMessage response to gossip stage > --- > > Key: CASSANDRA-13713 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13713 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Minor > Fix For: 4.x > > > Currently, when a node receives an {{EchoMessage}}, it sends a simple ACK > reply back (see {{EchoVerbHandler}}). The ACK is sent on the small message > connection, and because it is 'generically' typed as > {{Verb.REQUEST_RESPONSE}}, is consumed on a {{Stage.REQUEST_RESPONSE}} > thread. The proper thread for this response to be consumed on is > {{Stage.GOSSIP}}; that way we can move more of the updating of the gossip > state to a single, centralized thread, and less abuse of gossip's shared > mutable state can occur.
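The pattern running through CASSANDRA-13713 and CASSANDRA-11825 above (funnel all gossip-state mutations through a single gossip stage, blocking on a future where a caller needs synchronous semantics) can be sketched with a plain single-threaded executor. Class and method names are illustrative, not Cassandra's actual API:

```java
import java.util.concurrent.*;

// Sketch of single-threaded state ownership: all reads and writes of the
// shared flag happen on one "gossip stage" thread; callers that need
// synchronous behaviour block on the returned future. Illustrative only.
public class GossipStageSketch {
    private final ExecutorService gossipStage = Executors.newSingleThreadExecutor();
    private boolean running;   // only ever touched on the gossip stage

    public void startBlocking() throws Exception {
        // Submit the mutation to the stage and block until it has been applied.
        gossipStage.submit(() -> { running = true; }).get();
    }

    public void stopBlocking() throws Exception {
        gossipStage.submit(() -> { running = false; }).get();
    }

    public boolean isRunning() throws Exception {
        // Even reads go through the stage, so no other thread touches the state.
        return gossipStage.submit(() -> running).get();
    }

    public void shutdown() { gossipStage.shutdown(); }
}
```

Because every access is serialized on one thread, races like two threads interleaving start/stop transitions cannot occur, at the cost of one short-lived task per operation.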
[jira] [Created] (CASSANDRA-13713) Move processing of EchoMessage response to gossip stage
Jason Brown created CASSANDRA-13713: --- Summary: Move processing of EchoMessage response to gossip stage Key: CASSANDRA-13713 URL: https://issues.apache.org/jira/browse/CASSANDRA-13713 Project: Cassandra Issue Type: Bug Components: Distributed Metadata Reporter: Jason Brown Assignee: Jason Brown Priority: Minor Fix For: 4.x
[jira] [Updated] (CASSANDRA-12996) update slf4j dependency to 1.7.21
[ https://issues.apache.org/jira/browse/CASSANDRA-12996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12996: --- Resolution: Fixed Reviewer: Robert Stupp Status: Resolved (was: Patch Available) Merged in 948fdfc67922ae7ecd410 > update slf4j dependency to 1.7.21 > - > > Key: CASSANDRA-12996 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12996 > Project: Cassandra > Issue Type: Improvement > Components: Libraries >Reporter: Tomas Repik >Assignee: Stefan Podkowinski > Fix For: 4.0 > > Attachments: cassandra-3.11.0-slf4j.patch, > jcl-over-slf4j-1.7.25.jar.asc, log4j-over-slf4j-1.7.25.jar.asc, > slf4j-api-1.7.25.jar.asc > > > Cassandra 3.11.0 is about to be included in Fedora. There are some tweaks we > need to make to the sources in order to build it successfully. Cassandra > depends on slf4j 1.7.7, but in Fedora we have the latest upstream version, > 1.7.21, released on April 6, 2016. I attached a patch updating Cassandra > sources to depend on the newer slf4j sources. The only actual change is the > number of parameters accepted by the SubstituteLogger class. > Please consider updating.
[jira] [Updated] (CASSANDRA-12996) update slf4j dependency to 1.7.21
[ https://issues.apache.org/jira/browse/CASSANDRA-12996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12996: --- Attachment: jcl-over-slf4j-1.7.25.jar.asc log4j-over-slf4j-1.7.25.jar.asc slf4j-api-1.7.25.jar.asc
cassandra git commit: Upgrade slf4j to 1.7.25
Repository: cassandra Updated Branches: refs/heads/trunk 12d4e2f18 -> 948fdfc67 Upgrade slf4j to 1.7.25 patch by Stefan Podkowinski; reviewed by Robert Stupp for CASSANDRA-12996 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/948fdfc6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/948fdfc6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/948fdfc6 Branch: refs/heads/trunk Commit: 948fdfc67922ae7ecd410149b8d386c224ca4a95 Parents: 12d4e2f Author: Stefan Podkowinski Authored: Fri Jun 2 18:57:21 2017 +0200 Committer: Stefan Podkowinski Committed: Thu Jul 20 20:41:55 2017 +0200 -- CHANGES.txt | 1 + build.xml | 6 ++--- lib/jcl-over-slf4j-1.7.25.jar | Bin 0 -> 16515 bytes lib/jcl-over-slf4j-1.7.7.jar| Bin 16519 -> 0 bytes lib/licenses/jcl-over-slf4j-1.7.25.txt | 24 +++ lib/licenses/jcl-over-slf4j-1.7.7.txt | 20 lib/licenses/log4j-over-slf4j-1.7.25.txt| 24 +++ lib/licenses/log4j-over-slf4j-1.7.7.txt | 20 lib/licenses/slf4j-api-1.7.25.txt | 24 +++ lib/licenses/slf4j-api-1.7.7.txt| 20 lib/log4j-over-slf4j-1.7.25.jar | Bin 0 -> 23645 bytes lib/log4j-over-slf4j-1.7.7.jar | Bin 24220 -> 0 bytes lib/slf4j-api-1.7.25.jar| Bin 0 -> 41203 bytes lib/slf4j-api-1.7.7.jar | Bin 29257 -> 0 bytes .../cassandra/utils/NoSpamLoggerTest.java | 2 +- 15 files changed, 77 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/948fdfc6/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 80d82dd..7632337 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Upgrade SLF4J from 1.7.7 to 1.7.25 * Default for start_native_transport now true if not set in config (CASSANDRA-13656) * Don't add localhost to the graph when calculating where to stream from (CASSANDRA-13583) * Allow skipping equality-restricted clustering columns in ORDER BY clause (CASSANDRA-10271) http://git-wip-us.apache.org/repos/asf/cassandra/blob/948fdfc6/build.xml -- diff --git
a/build.xml b/build.xml index ec2466d..7421329 100644 --- a/build.xml +++ b/build.xml @@ -372,9 +372,9 @@ - - - + + + http://git-wip-us.apache.org/repos/asf/cassandra/blob/948fdfc6/lib/jcl-over-slf4j-1.7.25.jar -- diff --git a/lib/jcl-over-slf4j-1.7.25.jar b/lib/jcl-over-slf4j-1.7.25.jar new file mode 100644 index 000..8e7fec8 Binary files /dev/null and b/lib/jcl-over-slf4j-1.7.25.jar differ http://git-wip-us.apache.org/repos/asf/cassandra/blob/948fdfc6/lib/jcl-over-slf4j-1.7.7.jar -- diff --git a/lib/jcl-over-slf4j-1.7.7.jar b/lib/jcl-over-slf4j-1.7.7.jar deleted file mode 100644 index ed8d4dd..000 Binary files a/lib/jcl-over-slf4j-1.7.7.jar and /dev/null differ http://git-wip-us.apache.org/repos/asf/cassandra/blob/948fdfc6/lib/licenses/jcl-over-slf4j-1.7.25.txt -- diff --git a/lib/licenses/jcl-over-slf4j-1.7.25.txt b/lib/licenses/jcl-over-slf4j-1.7.25.txt new file mode 100644 index 000..315bd49 --- /dev/null +++ b/lib/licenses/jcl-over-slf4j-1.7.25.txt @@ -0,0 +1,24 @@ +Copyright (c) 2004-2017 QOS.ch +All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095169#comment-16095169 ] Jay Zhuang commented on CASSANDRA-13696: Updated based on the reviews: | Branch | uTest | | [13696-3.0|https://github.com/cooldoger/cassandra/tree/13696-3.0] | [circleci#31|https://circleci.com/gh/cooldoger/cassandra/31]| | [13696-3.11|https://github.com/cooldoger/cassandra/tree/13696-3.11] | [circleci#32|https://circleci.com/gh/cooldoger/cassandra/32]| | [13696-trunk|https://github.com/cooldoger/cassandra/tree/13696-trunk] | [circleci#30|https://circleci.com/gh/cooldoger/cassandra/30]| Also, {{resetCrc()}} could cause a CRC mismatch for the next hint read in the same file. Another question is why dropping a table under write traffic generates hints files; do you think it's an issue? I created a separate ticket to track that: CASSANDRA-13712 > Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at >
org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111] > Caused by: java.io.IOException: Digest mismatch exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) > ~[main/:na] > ... 
16 common frames omitted > {noformat} > It causes multiple Cassandra nodes to stop [by > default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188]. > Here are the steps to reproduce on a 3-node cluster, RF=3: > 1. stop node1 > 2. send some data with quorum (or one); it will generate hints files on > node2/node3 > 3. drop the table > 4. start node1 > node2/node3 will report "corrupted hints file" and stop. The impact is very > bad for a large cluster: when it happens, almost all the nodes are down at > the same time, and we have to remove all the hints files (which contain the > dropped table) to bring the nodes back. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
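The {{resetCrc()}} concern raised in the comment above comes down to a general property of running checksums: a CRC computed incrementally over a stream depends on every byte consumed since the last reset, so a reader that resets its CRC state mid-file (for example after skipping an unknown-table hint) will compute a checksum for the next record that no longer matches the one the writer stored. A minimal, self-contained Java sketch of that property (illustrative only, not Cassandra's actual hints code):

```java
import java.util.zip.CRC32;

public class CrcResetSketch
{
    // Running CRC over a sequence of chunks, as a writer would compute it.
    public static long crcOf(byte[]... chunks)
    {
        CRC32 crc = new CRC32();
        for (byte[] chunk : chunks)
            crc.update(chunk, 0, chunk.length);
        return crc.getValue();
    }

    public static void main(String[] args)
    {
        byte[] hint1 = "hint-1".getBytes();
        byte[] hint2 = "hint-2".getBytes();

        long runningCrc = crcOf(hint1, hint2); // writer: one CRC accumulated across both records
        long afterReset = crcOf(hint2);        // reader that reset its CRC state after record 1

        if (runningCrc == afterReset)
            throw new AssertionError("unexpected CRC collision");
        System.out.println("running=" + runningCrc + " afterReset=" + afterReset);
    }
}
```

The record names here are made up; the point is only that incremental `CRC32.update` calls are equivalent to one pass over the concatenated bytes, so reader and writer must reset at exactly the same stream positions.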
[jira] [Created] (CASSANDRA-13712) DropTable could cause hints
Jay Zhuang created CASSANDRA-13712: -- Summary: DropTable could cause hints Key: CASSANDRA-13712 URL: https://issues.apache.org/jira/browse/CASSANDRA-13712 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jay Zhuang Priority: Minor While dropping a table with ongoing write traffic, we saw hints generated on each node. You can find hint dispatch messages in the log. Not sure if it's an issue. Here are the steps to reproduce: 1. Create a 3-node cluster: {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}} 2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk; if you use another yaml file, change the RF to 3) {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode native cql3 -node 127.0.0.1}} 3. While the traffic is running, drop the table {{$ cqlsh -e "drop table stresscql.blogposts"}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095090#comment-16095090 ] Jason Brown commented on CASSANDRA-13700: - re: CASSANDRA-11825. Yes, shared mutable state strikes again, and thanks for addressing them separately ;) > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. 
If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. > I've reproduced this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
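The race described in that report is easiest to see in miniature: a message object that holds a *reference* to shared mutable state serializes whatever the state contains at serialization time, not the snapshot seen at construction time. A minimal sketch with hypothetical class names (not Cassandra's actual Gossiper internals):

```java
public class SharedStateRaceSketch
{
    public static final class HeartBeat
    {
        public volatile int version;
        public HeartBeat(int version) { this.version = version; }
    }

    public static final class Message
    {
        public final HeartBeat heartBeat; // a reference, not a snapshot
        public Message(HeartBeat hb) { this.heartBeat = hb; }
        // Serialization reads the *current* version, not the one seen at construction.
        public int serializedVersion() { return heartBeat.version; }
    }

    public static void main(String[] args)
    {
        HeartBeat local = new HeartBeat(4);
        Message ack = new Message(local); // built while the heartbeat was at version 4

        local.version = 6;                // a GossipTask-style bump before serialization

        // The wire sees 6, so the peer thinks it is current to version 6
        // even though it never received the state tagged with version 5.
        System.out.println("constructed at 4, serialized as " + ack.serializedVersion());
    }
}
```

The fix direction discussed on the ticket is to snapshot the value at message-construction time (i.e. copy the int instead of keeping the reference), which in this sketch would mean storing `hb.version` in a final field of `Message`.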
[jira] [Updated] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13700: Status: Ready to Commit (was: Patch Available) > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. 
> I've reproduced this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095087#comment-16095087 ] Jason Brown commented on CASSANDRA-13700: - +1 > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. 
> I've reproduced this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095030#comment-16095030 ] Ariel Weisberg edited comment on CASSANDRA-13594 at 7/20/17 5:23 PM: - The dtest you started has almost certainly been aged out. https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/145/ I think we need to set up archiving of information from Apache Jenkins since they are throwing away everything after a mere 10 builds. was (Author: aweisberg): The dtest you started has almost certainly been aged out. I think we need to set up archiving of information from Apache Jenkins since they are throwing away everything after a mere 10 builds. > Use an ExecutorService for repair commands instead of new Thread(..).start() > > > Key: CASSANDRA-13594 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13594 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Currently when starting a new repair, we create a new Thread and start it > immediately > It would be nice to be able to 1) limit the number of threads and 2) reject > starting new repair commands if we are already running too many. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095030#comment-16095030 ] Ariel Weisberg commented on CASSANDRA-13594: The dtest you started has almost certainly been aged out. I think we need to set up archiving of information from Apache Jenkins since they are throwing away everything after a mere 10 builds. > Use an ExecutorService for repair commands instead of new Thread(..).start() > > > Key: CASSANDRA-13594 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13594 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Currently when starting a new repair, we create a new Thread and start it > immediately > It would be nice to be able to 1) limit the number of threads and 2) reject > starting new repair commands if we are already running too many. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13688) Anticompaction race can leak sstables/txn
[ https://issues.apache.org/jira/browse/CASSANDRA-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094985#comment-16094985 ] Ariel Weisberg commented on CASSANDRA-13688: Will we ever get a clean run of this? https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/142/ > Anticompaction race can leak sstables/txn > - > > Key: CASSANDRA-13688 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13688 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston > Fix For: 4.0 > > > At the top of {{CompactionManager#performAntiCompaction}}, the parent repair > session is loaded, if the session can't be found, a RuntimeException is > thrown. This can happen if a participant is evicted after the IR prepare > message is received, but before the anticompaction starts. This exception is > thrown outside of the try/finally block that guards the sstable and lifecycle > transaction, causing them to leak, and preventing the sstables from ever > being removed from View.compacting. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
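The leak pattern in that description is a general one: any check that can throw after a resource is acquired but before control enters the try/finally that releases it will leak the resource. A self-contained sketch of both the broken and the corrected ordering (hypothetical names, not the actual CompactionManager code):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardOrderingSketch
{
    static final AtomicInteger openHandles = new AtomicInteger();

    static void leaky(boolean sessionMissing)
    {
        openHandles.incrementAndGet();            // resource acquired
        if (sessionMissing)
            // thrown OUTSIDE the guard: the handle below is never released
            throw new RuntimeException("parent repair session missing");
        try { /* do work */ }
        finally { openHandles.decrementAndGet(); }
    }

    static void safe(boolean sessionMissing)
    {
        openHandles.incrementAndGet();            // resource acquired
        try
        {
            if (sessionMissing)
                throw new RuntimeException("parent repair session missing");
            /* do work */
        }
        finally { openHandles.decrementAndGet(); } // always released
    }

    public static void main(String[] args)
    {
        try { leaky(true); } catch (RuntimeException ignored) {}
        System.out.println("leaked handles: " + openHandles.get()); // prints 1

        openHandles.set(0);
        try { safe(true); } catch (RuntimeException ignored) {}
        System.out.println("leaked handles: " + openHandles.get()); // prints 0
    }
}
```

In the ticket's terms, `openHandles` stands in for the sstable references and lifecycle transaction that otherwise never leave `View.compacting`.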
[jira] [Updated] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13664: --- Status: Ready to Commit (was: Patch Available) > RangeFetchMapCalculator should not try to optimise 'trivial' ranges > --- > > Key: CASSANDRA-13664 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13664 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams > out of each node as even as possible. > In a typical multi-dc ring the nodes in the dcs are setup using token + 1, > creating many tiny ranges. If we only try to optimise over the number of > streams, it is likely that the amount of data streamed out of each node is > unbalanced. > We should ignore those trivial ranges and only optimise the big ones, then > share the tiny ones over the nodes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
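The idea in that ticket can be sketched independently of the real RangeFetchMapCalculator: split ranges into "trivial" (tiny) and normal by a size threshold, run the balancing optimisation only over the normal ranges, and round-robin the trivial ones so they cannot skew the result. The sketch below is a hedged illustration with a stand-in "optimisation" (least-loaded node), not the actual algorithm:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TrivialRangeSplitSketch
{
    // Total bytes assigned to a node so far.
    static long load(Map<String, List<Long>> assignment, String node)
    {
        long sum = 0;
        for (long s : assignment.get(node))
            sum += s;
        return sum;
    }

    public static Map<String, List<Long>> assign(List<Long> rangeSizes, List<String> nodes, long trivialThreshold)
    {
        Map<String, List<Long>> assignment = new HashMap<>();
        for (String node : nodes)
            assignment.put(node, new ArrayList<>());

        List<Long> trivial = new ArrayList<>();
        List<Long> normal = new ArrayList<>();
        for (long size : rangeSizes)
            (size < trivialThreshold ? trivial : normal).add(size);

        // Stand-in for the real optimisation: give each big range
        // to the node with the least data assigned so far.
        normal.sort(Collections.reverseOrder());
        for (long size : normal)
        {
            String least = nodes.get(0);
            for (String node : nodes)
                if (load(assignment, node) < load(assignment, least))
                    least = node;
            assignment.get(least).add(size);
        }

        // Trivial ranges are simply shared round-robin.
        int i = 0;
        for (long size : trivial)
            assignment.get(nodes.get(i++ % nodes.size())).add(size);

        return assignment;
    }

    public static void main(String[] args)
    {
        Map<String, List<Long>> result =
            assign(Arrays.asList(1000L, 900L, 1L, 1L, 1L, 1L), Arrays.asList("n1", "n2"), 10L);
        System.out.println(result); // big ranges balanced by size, tiny ones shared evenly
    }
}
```

With the token+1 layout described above, most ranges fall below the threshold, so the optimisation only ever sees the ranges whose size actually matters.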
[Cassandra Wiki] Update of "Committers" by StefanPodkowinski
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Committers" page has been changed by StefanPodkowinski: https://wiki.apache.org/cassandra/Committers?action=diff=73=74 ||Branimir Lambov ||November 2016 ||Datastax || || ||Paulo Motta || November 2016 ||Datastax || || ||Sankalp Kohli || November 2016 ||Apple || PMC member || - ||Stefan Podkowinski ||February 2017 ||Independent || || + ||Stefan Podkowinski ||February 2017 ||1&1 || || ||Ariel Weisberg ||February 2017 ||Apple || || ||Blake Eggleston ||February 2017 ||Apple || || ||Alex Petrov ||February 2017 ||Datastax || || - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094756#comment-16094756 ] Romain Hardouin edited comment on CASSANDRA-13699 at 7/20/17 4:30 PM: -- I see random failures/errors in CircleCI. EDIT: https://circleci.com/gh/rhardouin/cassandra/9 is successful. was (Author: rha): I see random failures/errors in CircleCI > Allow to set batch_size_warn_threshold_in_kb via JMX > > > Key: CASSANDRA-13699 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13699 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Romain Hardouin >Priority: Minor > Fix For: 4.x > > Attachments: 13699-trunk.txt > > > We can set {{batch_size_fail_threshold_in_kb}} via JMX but not > {{batch_size_warn_threshold_in_kb}}. > The patch allows to set it dynamically and adds a INFO log for both > thresholds. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[Cassandra Wiki] Update of "ContributorsGroup" by MichaelShuler
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "ContributorsGroup" page has been changed by MichaelShuler: https://wiki.apache.org/cassandra/ContributorsGroup?action=diff=73=74 * SergeRider * SmartCat * StefaniaAlborghetti + * StefanPodkowinski * StephenBlackheath * StephenConnolly * StuHood - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094911#comment-16094911 ] Joel Knighton commented on CASSANDRA-13700: --- Thanks, Jason! In this case, I agree the first option is safer for this issue. Something like the second likely makes sense eventually, at least as part of a larger audit of correctness issues in gossip. I believe your volatile suggestion is correct. I don't have a lot of helpful information to reproduce this; it reproduces in larger clusters, particularly with higher latency levels. We can see the effects locally with a few well-timed sleeps in MessagingService, but that isn't terribly representative. Branches pushed here: ||branch|| |[13700-2.1|https://github.com/jkni/cassandra/tree/13700-2.1]|| |[13700-2.2|https://github.com/jkni/cassandra/tree/13700-2.2]|| |[13700-3.0|https://github.com/jkni/cassandra/tree/13700-3.0]|| |[13700-3.11|https://github.com/jkni/cassandra/tree/13700-3.11]|| |[13700-trunk|https://github.com/jkni/cassandra/tree/13700-trunk]|| There's a somewhat conceptually similar issue when we bump the gossip generation in the middle of constructing a reply - I believe that's the cause in [CASSANDRA-11825], which presents similar problems. I'm choosing to address them separately because they're indeed distinct problems and 11825 requires an additional trigger (enabling and disabling gossip during runtime). > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. 
> When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. > I've reproduced this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-13700: -- Status: Patch Available (was: In Progress) > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. 
> I've reproduced this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13329) max_hints_delivery_threads does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13329: Fix Version/s: 3.11.0 4.0 > max_hints_delivery_threads does not work > > > Key: CASSANDRA-13329 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13329 > Project: Cassandra > Issue Type: Bug >Reporter: Fuud >Assignee: Aleksandr Sorokoumov > Labels: lhf > Fix For: 3.11.0, 4.0 > > > HintsDispatchExecutor creates JMXEnabledThreadPoolExecutor with corePoolSize > == 1 and maxPoolSize==max_hints_delivery_threads and unbounded > LinkedBlockingQueue. > In this configuration additional threads will not be created. > Same problem with PerSSTableIndexWriter. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
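The behaviour behind that bug is documented for {{java.util.concurrent.ThreadPoolExecutor}}: threads beyond {{corePoolSize}} are only created when the work queue rejects an offered task, and an unbounded {{LinkedBlockingQueue}} never rejects, so {{maximumPoolSize}} is effectively ignored. A small self-contained demonstration of the plain JDK executor (not Cassandra's JMXEnabledThreadPoolExecutor itself):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueuePoolSketch
{
    // Submits `tasks` blocking tasks and reports how many threads the pool created.
    public static int poolSizeAfterBurst(int core, int max, int tasks) throws InterruptedException
    {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS,
                                                         new LinkedBlockingQueue<>()); // unbounded
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++)
            pool.submit(() -> { try { release.await(); } catch (InterruptedException ignored) {} });

        int size = pool.getPoolSize(); // stays at corePoolSize: the queue never rejects
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return size;
    }

    public static void main(String[] args) throws InterruptedException
    {
        // Despite maximumPoolSize == 4 and 100 queued tasks, only 1 thread exists.
        System.out.println("pool size: " + poolSizeAfterBurst(1, 4, 100)); // prints "pool size: 1"
    }
}
```

This is exactly the shape the ticket describes: corePoolSize of 1, a larger maximumPoolSize, and an unbounded queue, so `max_hints_delivery_threads` never takes effect.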
[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094903#comment-16094903 ] Alex Petrov commented on CASSANDRA-13696: - bq. Simply resetting the CRC state isn't enough, True. Re-read the issue description/first comment, now it's quite obvious. Thanks for explanation. Adding a comment would be great though! > Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > 
~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111] > Caused by: java.io.IOException: Digest mismatch exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) > ~[main/:na] > ... 16 common frames omitted > {noformat} > It causes multiple cassandra nodes stop [by > default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188]. > Here is the reproduce steps on a 3 nodes cluster, RF=3: > 1. stop node1 > 2. send some data with quorum (or one), it will generate hints file on > node2/node3 > 3. drop the table > 4. start node1 > node2/node3 will report "corrupted hints file" and stop. 
The impact is very > bad for a large cluster: when it happens, almost all the nodes are down at > the same time, and we have to remove all the hints files (which contain the > dropped table) to bring the nodes back.
[jira] [Created] (CASSANDRA-13711) Invalid writetime for null columns in cqlsh
Jeff Jirsa created CASSANDRA-13711: -- Summary: Invalid writetime for null columns in cqlsh Key: CASSANDRA-13711 URL: https://issues.apache.org/jira/browse/CASSANDRA-13711 Project: Cassandra Issue Type: Bug Reporter: Jeff Jirsa Fix For: 3.0.x, 3.11.x, 4.x >From the user list: https://lists.apache.org/thread.html/448731c029eee72e499fc6acd44d257d1671193f850a68521c2c6681@%3Cuser.cassandra.apache.org%3E {code} (oss-ccm) MacBook-Pro:~ jjirsa$ ccm create test -n 1 -s -v 3.0.10 Current cluster is now: test (oss-ccm) MacBook-Pro:~ jjirsa$ ccm node1 cqlsh Connected to test at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.0.10 | CQL spec 3.4.0 | Native protocol v4] Use HELP for help. cqlsh> CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor': 1}; cqlsh> CREATE TABLE test.t ( a text primary key, b text ); cqlsh> insert into test.t(a) values('z'); cqlsh> insert into test.t(a) values('w'); cqlsh> insert into test.t(a) values('e'); cqlsh> insert into test.t(a) values('r'); cqlsh> insert into test.t(a) values('t'); cqlsh> select a,b, writetime (b) from test.t; a | b | writetime(b) ---+--+-- z | null | null e | null | null r | null | null w | null | null t | null | null (5 rows) cqlsh> cqlsh> insert into test.t(a,b) values('t','x'); cqlsh> insert into test.t(a) values('b'); cqlsh> select a,b, writetime (b) from test.t; a | b| writetime(b) ---+--+-- z | null | null e | null | null r | null | null w | null | null t |x | 1500565131354883 b | null | 1500565131354883 (6 rows) {code} Data on disk: {code} MacBook-Pro:~ jjirsa$ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/jjirsa/.ccm/test/node1/data0/test/t-bed196006d0511e7904be9daad294861/mc-1-big-Data.db [ { "partition" : { "key" : [ "z" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 20, "liveness_info" : { "tstamp" : "2017-07-20T04:41:54.818118Z" }, "cells" : [ ] } ] }, { "partition" : { "key" : [ "e" ], "position" : 21 }, "rows" : [ { "type" : "row", "position" : 44, 
"liveness_info" : { "tstamp" : "2017-07-20T04:42:04.288547Z" }, "cells" : [ ] } ] }, { "partition" : { "key" : [ "r" ], "position" : 45 }, "rows" : [ { "type" : "row", "position" : 68, "liveness_info" : { "tstamp" : "2017-07-20T04:42:08.991417Z" }, "cells" : [ ] } ] }, { "partition" : { "key" : [ "w" ], "position" : 69 }, "rows" : [ { "type" : "row", "position" : 92, "liveness_info" : { "tstamp" : "2017-07-20T04:41:59.005382Z" }, "cells" : [ ] } ] }, { "partition" : { "key" : [ "t" ], "position" : 93 }, "rows" : [ { "type" : "row", "position" : 120, "liveness_info" : { "tstamp" : "2017-07-20T15:38:51.354883Z" }, "cells" : [ { "name" : "b", "value" : "x" } ] } ] }, { "partition" : { "key" : [ "b" ], "position" : 121 }, "rows" : [ { "type" : "row", "position" : 146, "liveness_info" : { "tstamp" : "2017-07-20T15:39:03.631297Z" }, "cells" : [ ] } ] } ]MacBook-Pro:~ jjirsa$ {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094602#comment-16094602 ] Alex Petrov edited comment on CASSANDRA-13696 at 7/20/17 3:43 PM: -- I agree we should also return a correct version from the hints service (as [~jay.zhuang] already mentioned), like [here|https://github.com/apache/cassandra/compare/cassandra-3.0...cooldoger:13696.2-3.0?expand=1] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. WRT to the patch itself, might be it's better to just call {{resetCrc}} explicitly and still return null like I did [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0#diff-cf15f9cac67d8b2f3e581129d617df16R242]? {{hint}} is a local variable, and setting it and carrying on makes the logic a bit harder to understand. For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. was (Author: ifesdjeen): I agree we should also return a correct version from the hints service (as [~jay.zhuang] already mentioned), like [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. WRT to the patch itself, might be it's better to just call {{resetCrc}} explicitly and still return null like I did [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0#diff-cf15f9cac67d8b2f3e581129d617df16R242]? {{hint}} is a local variable, and setting it and carrying on makes the logic a bit harder to understand. 
For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. > Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) 
> [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at >
[jira] [Comment Edited] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094602#comment-16094602 ] Alex Petrov edited comment on CASSANDRA-13696 at 7/20/17 3:43 PM: -- I agree we should also return a correct version from the hints service (as [~jay.zhuang] already mentioned), like [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. WRT to the patch itself, might be it's better to just call {{resetCrc}} explicitly and still return null like I did [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0#diff-cf15f9cac67d8b2f3e581129d617df16R242]? {{hint}} is a local variable, and setting it and carrying on makes the logic a bit harder to understand. For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. was (Author: ifesdjeen): I think we should also return a correct version from the hints service [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. WRT to the patch itself, might be it's better to just call {{resetCrc}} explicitly and still return null like I did [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0#diff-cf15f9cac67d8b2f3e581129d617df16R242]? {{hint}} is a local variable, and setting it and carrying on makes the logic a bit harder to understand. For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. 
> Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > 
[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at
[jira] [Comment Edited] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092496#comment-16092496 ] Marcus Eriksson edited comment on CASSANDRA-13664 at 7/20/17 3:39 PM: -- https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/139/ was (Author: krummas): https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/135/ > RangeFetchMapCalculator should not try to optimise 'trivial' ranges > --- > > Key: CASSANDRA-13664 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13664 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams > out of each node as even as possible. > In a typical multi-dc ring the nodes in the dcs are setup using token + 1, > creating many tiny ranges. If we only try to optimise over the number of > streams, it is likely that the amount of data streamed out of each node is > unbalanced. > We should ignore those trivial ranges and only optimise the big ones, then > share the tiny ones over the nodes.
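A sketch of the proposed split: balance only the big ranges and deal the trivial ones out round-robin. The threshold and data structures here are hypothetical; Cassandra's actual RangeFetchMapCalculator uses a flow-graph formulation for the non-trivial part, which is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TrivialRangeSplit {
    static final class Range {
        final long size;
        Range(long size) { this.size = size; }
    }

    // Deal the tiny ("trivial") ranges out round-robin across source nodes;
    // only the remaining big ranges would be handed to the graph-based
    // optimiser (omitted here). Threshold and shapes are illustrative.
    static Map<String, List<Range>> assignTrivial(List<Range> ranges, List<String> nodes, long trivialThreshold) {
        Map<String, List<Range>> assignment = new HashMap<>();
        for (String n : nodes)
            assignment.put(n, new ArrayList<>());
        int next = 0;
        for (Range r : ranges) {
            if (r.size <= trivialThreshold)
                assignment.get(nodes.get(next++ % nodes.size())).add(r);
            // else: big range -> balance by streamed bytes, not shown
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(new Range(1), new Range(1), new Range(1), new Range(1_000_000));
        Map<String, List<Range>> out = assignTrivial(ranges, Arrays.asList("n1", "n2"), 10);
        System.out.println(out.get("n1").size() + " trivial ranges on n1, "
                + out.get("n2").size() + " on n2"); // 2 on n1, 1 on n2
    }
}
```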
[jira] [Updated] (CASSANDRA-13710) Add a nodetool command to display details of any tables containing unowned tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13710: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Add a nodetool command to display details of any tables containing unowned > tokens > - > > Key: CASSANDRA-13710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13710 > Project: Cassandra > Issue Type: Sub-task > Components: Observability, Tools >Reporter: Sam Tunnicliffe > > This could be implemented as a {{dry-run}} switch for {{nodetool cleanup}}
[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094839#comment-16094839 ] Jeff Jirsa commented on CASSANDRA-13696: Wanted to chat with Aleksey offline about the version implications, so withholding comment on that for a bit, but: {quote} might be it's better to just call resetCrc explicitly and still return null like I did here? hint is a local variable, and setting it and carrying on makes the logic a bit harder to understand. For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. {quote} I think the current behavior is actually the right thing to do. Simply resetting the CRC state isn't enough, we need to check to see if the CRC matches, because we want to invoke the disk failure policy if we're reading corrupt data, and frankly, a corruption source (bad disk / RAM / etc) that flips bits could cause us to see an invalid CFID, and skipping the corruption test at that point would be the wrong thing to do. A few more comment lines are probably worthwhile, though, since it seems like an easy 'fix' to revert in the future because it's nonobvious. 
> Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > 
[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111] > Caused by: java.io.IOException: Digest mismatch exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) > ~[main/:na] > ... 16 common frames omitted > {noformat} > It causes multiple cassandra nodes stop [by > default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188]. > Here is the reproduce steps on a 3 nodes cluster, RF=3: > 1. stop node1 > 2. send some data with quorum (or one), it will generate hints file on > node2/node3 > 3. drop the table > 4. start node1 > node2/node3 will report "corrupted hints file"
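The principle Jeff describes — verify the digest even for records that are skipped because their table id is unknown — can be sketched as follows. This is an illustrative reader, not Cassandra's HintsReader; the record layout and names are assumptions, but the point carries over: bytes that cannot be interpreted are still run through the CRC, so corruption is caught rather than silently skipped.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

class SkippedRecordCrc {
    // Hypothetical record layout: [4-byte length][payload][4-byte CRC32 of payload].
    static ByteBuffer writeRecord(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length + 4);
        buf.putInt(payload.length).put(payload).putInt((int) crc.getValue());
        buf.flip();
        return buf;
    }

    // Even when the payload cannot be interpreted (e.g. unknown table id),
    // run the skipped bytes through the CRC and compare, so that corruption
    // from a bad disk or RAM still trips the disk failure policy.
    static boolean skipRecordVerifyingCrc(ByteBuffer buf) {
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload);
        return (int) crc.getValue() == buf.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer ok = writeRecord("hint-for-dropped-table".getBytes());
        System.out.println("clean record passes: " + skipRecordVerifyingCrc(ok));    // true

        ByteBuffer bad = writeRecord("hint-for-dropped-table".getBytes());
        bad.put(7, (byte) (bad.get(7) ^ 0x01)); // flip one payload bit
        System.out.println("corrupt record passes: " + skipRecordVerifyingCrc(bad)); // false
    }
}
```

Returning false (rather than resetting state and returning early) is what lets the caller distinguish "legitimately skipped" from "corrupt and skipped" and invoke the failure policy in the latter case.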
[jira] [Updated] (CASSANDRA-13709) Warn or error when receiving hints for unowned token ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13709: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Warn or error when receiving hints for unowned token ranges > --- > > Key: CASSANDRA-13709 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13709 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Observability >Reporter: Sam Tunnicliffe > > When receiving hints a node should log a warning if it receives a mutation > with a key outside of its owned ranges. We could also record a metric for > such events and optionally signal a failure to the sender.
[jira] [Updated] (CASSANDRA-13708) Warn or error when receiving stream requests or inbound streams that contain unowned token ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13708: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Warn or error when receiving stream requests or inbound streams that contain > unowned token ranges > - > > Key: CASSANDRA-13708 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13708 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Observability, Streaming and Messaging >Reporter: Sam Tunnicliffe > > If a node receives a StreamRequest that includes ranges outside of those owned > by the node, a warning should be logged. > On the receiving side, when deserializing a streamed SSTable we should also > log a warning if we encounter keys outside of the node's owned ranges. Again, > we could also record a metric for such events and optionally fail the > streaming session.
[jira] [Updated] (CASSANDRA-13707) Warn or error when receiving Merkle Tree requests for unowned token ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13707: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Warn or error when receiving Merkle Tree requests for unowned token ranges > -- > > Key: CASSANDRA-13707 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13707 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Observability, Streaming and Messaging >Reporter: Sam Tunnicliffe > > When a node receives a validation request to construct a Merkle Tree, if the > requested ranges are outside of the node's owned ranges we should log a > warning. Maintaining metrics for these events and/or rejecting the request > and so failing the repair job might also be useful.
[jira] [Updated] (CASSANDRA-13705) Warn or error when receiving write requests for unowned token ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13705: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Warn or error when receiving write requests for unowned token ranges > > > Key: CASSANDRA-13705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13705 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Observability >Reporter: Sam Tunnicliffe > > When a replica receives a mutation whose key is outside its own ranges, a > warning should be recorded in the log. We could also record a metric for such > events, and even reject the request either by simply dropping it (resulting > in a timeout on the coordinator), or by returning a specific error response > to the coordinator.
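The replica-side check proposed across these tickets amounts to testing a request's token against the node's owned ranges and warning when it falls outside all of them. A hedged sketch — the range representation and wraparound handling below are assumptions for illustration, not Cassandra's actual Range/Token classes:

```java
import java.util.Arrays;
import java.util.List;

class UnownedTokenCheck {
    // (start, end] range on the token ring; start == end means the full ring.
    static final class TokenRange {
        final long start, end;
        TokenRange(long start, long end) { this.start = start; this.end = end; }
        boolean contains(long token) {
            if (start == end) return true;                   // full ring
            if (start < end) return token > start && token <= end;
            return token > start || token <= end;            // wraps around zero
        }
    }

    // Sketch of the proposed check: warn (and optionally record a metric or
    // reject) when a mutation's token is outside every owned range.
    static boolean isOwned(long token, List<TokenRange> owned) {
        for (TokenRange r : owned)
            if (r.contains(token)) return true;
        return false;
    }

    public static void main(String[] args) {
        // Second range wraps around the ring.
        List<TokenRange> owned = Arrays.asList(new TokenRange(100, 200), new TokenRange(900, 50));
        for (long t : new long[] { 150, 1000, 500 }) {
            if (!isOwned(t, owned))
                System.out.println("WARN: received mutation for unowned token " + t); // only 500
        }
    }
}
```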
[jira] [Updated] (CASSANDRA-13706) Warn or error when receiving a read request for unowned token ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13706: Issue Type: Sub-task (was: Improvement) Parent: CASSANDRA-13704 > Warn or error when receiving a read request for unowned token ranges > > > Key: CASSANDRA-13706 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13706 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Observability >Reporter: Sam Tunnicliffe > > When a replica receives a read request whose key or data range is outside its > own ranges, a warning should be recorded in the log. We could also record a > metric for such events, and even reject the request either by simply dropping > it (resulting in a timeout on the coordinator), or by returning a specific > error response to the coordinator.
[jira] [Created] (CASSANDRA-13706) Warn or error when receiving a read request for unowned token ranges
Sam Tunnicliffe created CASSANDRA-13706: --- Summary: Warn or error when receiving a read request for unowned token ranges Key: CASSANDRA-13706 URL: https://issues.apache.org/jira/browse/CASSANDRA-13706 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability Reporter: Sam Tunnicliffe When a replica receives a read request whose key or data range is outside its own ranges, a warning should be recorded in the log. We could also record a metric for such events, and even reject the request either by simply dropping it (resulting in a timeout on the coordinator), or by returning a specific error response to the coordinator.
[jira] [Created] (CASSANDRA-13707) Warn or error when receiving Merkle Tree requests for unowned token ranges
Sam Tunnicliffe created CASSANDRA-13707: --- Summary: Warn or error when receiving Merkle Tree requests for unowned token ranges Key: CASSANDRA-13707 URL: https://issues.apache.org/jira/browse/CASSANDRA-13707 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability, Streaming and Messaging Reporter: Sam Tunnicliffe When a node receives a validation request to construct a Merkle Tree, if the requested ranges are outside of the node's owned ranges we should log a warning. Maintaining metrics for these events and/or rejecting the request and so failing the repair job might also be useful.
[jira] [Created] (CASSANDRA-13708) Warn or error when receiving stream requests or inbound streams that contain unowned token ranges
Sam Tunnicliffe created CASSANDRA-13708: --- Summary: Warn or error when receiving stream requests or inbound streams that contain unowned token ranges Key: CASSANDRA-13708 URL: https://issues.apache.org/jira/browse/CASSANDRA-13708 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability, Streaming and Messaging Reporter: Sam Tunnicliffe If a node receives a StreamRequest that includes ranges outside of those owned by the node, a warning should be logged. On the receiving side, when deserializing a streamed SSTable we should also log a warning if we encounter keys outside of the node's owned ranges. Again, we could also record a metric for such events and optionally fail the streaming session.
[jira] [Created] (CASSANDRA-13709) Warn or error when receiving hints for unowned token ranges
Sam Tunnicliffe created CASSANDRA-13709: --- Summary: Warn or error when receiving hints for unowned token ranges Key: CASSANDRA-13709 URL: https://issues.apache.org/jira/browse/CASSANDRA-13709 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability Reporter: Sam Tunnicliffe When receiving hints a node should log a warning if it receives a mutation with a key outside of its owned ranges. We could also record a metric for such events and optionally signal a failure to the sender. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13705) Warn or error when receiving write requests for unowned token ranges
Sam Tunnicliffe created CASSANDRA-13705: --- Summary: Warn or error when receiving write requests for unowned token ranges Key: CASSANDRA-13705 URL: https://issues.apache.org/jira/browse/CASSANDRA-13705 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability Reporter: Sam Tunnicliffe When a replica receives a mutation whose key is outside its own ranges, a warning should be recorded in the log. We could also record a metric for such events, and even reject the request either by simply dropping it (resulting in a timeout on the coordinator), or by returning a specific error response to the coordinator. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13710) Add a nodetool command to display details of any tables containing unowned tokens
Sam Tunnicliffe created CASSANDRA-13710: --- Summary: Add a nodetool command to display details of any tables containing unowned tokens Key: CASSANDRA-13710 URL: https://issues.apache.org/jira/browse/CASSANDRA-13710 Project: Cassandra Issue Type: Improvement Components: Observability, Tools Reporter: Sam Tunnicliffe This could be implemented as a {{dry-run}} switch for {{nodetool cleanup}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13704) Better reporting of events for out of range tokens
Sam Tunnicliffe created CASSANDRA-13704: --- Summary: Better reporting of events for out of range tokens Key: CASSANDRA-13704 URL: https://issues.apache.org/jira/browse/CASSANDRA-13704 Project: Cassandra Issue Type: Improvement Components: Coordination, Observability Reporter: Sam Tunnicliffe It is possible for nodes to have a divergent view of the ring, which can result in some operations being sent to the wrong nodes. This is an umbrella ticket to mitigate such issues by adding logging when a node is asked to perform an operation for tokens it does not own. This will be useful for detecting when the nodes' views of the ring diverge, which is not highly visible at the moment, and also for post-hoc analysis. It may also be beneficial to straight up reject certain operations, though this will need to balance the risk of performing those ops against the consequences rejecting them has on availability. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12872) Paging reads and limit reads are missing some data
[ https://issues.apache.org/jira/browse/CASSANDRA-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-12872: Description: We are seeing an issue with paging reads missing some small number of columns when we do paging/limit reads. We get this on a single DC cluster itself when both reads and writes are happening with QUORUM. Paging/limit reads see this issue. I have attached the ccm based script which reproduces the problem. * Keyspace RF - 3 * Table (id int, course text, marks int, primary key(id, course)) * replicas for partition key 1 - r1, r2 and r3 * insert (1, '1', 1) , (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5) - succeeded on all 3 replicas * insert (1, '6', 6) succeeded on r1 and r3, failed on r2 * delete (1, '2'), (1, '3'), (1, '4'), (1, '5') succeeded on r1 and r2, failed on r3 * insert (1, '7', 7) succeeded on r1 and r2, failed on r3 Local data on 3 nodes looks like as below now r1: (1, '1', 1), tombstone(2-5 records), (1, '6', 6), (1, '7', 7) r2: (1, '1', 1), tombstone(2-5 records), (1, '7', 7) r3: (1, '1', 1), (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5), (1, '6', 6) If we do a paging read with page_size 2, and if it gets data from r2 and r3, then it will only get the data (1, '1', 1) and (1, '7', 7) skipping record 6. This problem would happen if the same query is not doing paging but limit set to 2 records. Resolution code for reads works same for paging queries and normal queries. Co-ordinator shouldn't respond back to client with records/columns that it didn't have complete visibility on all required replicas (in this case 2 replicas). In above case, it is sending back record (1, '7', 7) back to client, but its visibility on r3 is limited up to (1, '2', 2) and it is relying on just r2 data to assume (1, '6', 6) doesn't exist, which is wrong. 
At the end of resolution, all it can conclusively say anything about is (1, '1', 1), which exists, and (1, '2', 2), which is deleted. Ideally we should have a different resolution implementation for paging/limit queries. We could reproduce this on 2.0.17, 2.1.16 and 3.0.9. It seems like in 3.0.9 we have the ShortReadProtection transformation on list queries. I assume that is to protect against cases like the above, but we can reproduce the issue in 3.0.9 as well. was: We are seeing an issue with paging reads missing some small number of columns when we do paging/limit reads. We get this on a single DC cluster itself when both reads and writes are happening with QUORUM. Paging/limit reads see this issue. I have attached the ccm based script which reproduces the problem. * Keyspace RF - 3 * Table (id int, course text, marks int, primary key(id, course)) * replicas for partition key 1 - r1, r2 and r3 * insert (1, '1', 1) , (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5) - succeeded on all 3 replicas * insert (1, '6', 6) succeeded on r1 and r3, failed on r2 * delete (1, '2'), (1, '3'), (1, '4'), (1, '5') succeeded on r1 and r2, failed on r3 * insert (1, '7', 7) succeeded on r1 and r2, failed on r3 Local data on 3 nodes looks like as below now r1: (1, '1', 1), tombstone(2-5 records), (1, '6', 6), (1, '7', 7) r2: (1, '1', 1), tombstone(2-5 records), (1, '7', 7) r3: (1, '1', 1), (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5), (1, '6', 6) If we do a paging read with page_size 2, and if it gets data from r2 and r3, then it will only get the data (1, '1', 1) and (1, '7', 7) skipping record 6. This problem would happen if the same query is not doing paging but limit set to 2 records. Resolution code for reads works same for paging queries and normal queries.
Co-ordinator shouldn't respond back to client with records/columns that it didn't have complete visibility on all required replicas (in this case 2 replicas). In above case, it is sending back record (1, '7', 7) back to client, but its visibility on r3 is limited up to (1, '2', 2) and it is relying on just r2 data to assume (1, '6', 6) doesn't exist, which is wrong. End of the resolution all it can conclusively say any thing about is (1, '1', 1), which exists and (1, '2', 2), which is deleted. Ideally we should have different resolution implementation for paging/limit queries. We could reproduce this on 2.0.17, 2.1.16 and 3.0.9. Seems like 3.0.9 we have ShortReadProtection transformation on list queries. I assume that is to protect against the cases like above. But, we can reproduce the issue in 3.0.9 as well. > Paging reads and limit reads are missing some data > -- > > Key: CASSANDRA-12872 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12872 > Project: Cassandra > Issue Type: Bug > Components:
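The scenario above can be reduced to a toy model: each replica applies LIMIT 2 to its own data, then the coordinator merges the returned pages, letting tombstones win (they are the newest writes here). This is plain illustrative Java, not Cassandra's resolver; it just shows how row '6' is silently skipped even though a correct QUORUM answer would include it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model of the 12872 scenario (not Cassandra internals).
public class ShortReadDemo {
    // A null value marks a tombstone for that clustering key.
    static LinkedHashMap<String, Integer> page(LinkedHashMap<String, Integer> replica, int limit) {
        LinkedHashMap<String, Integer> out = new LinkedHashMap<>();
        int live = 0;
        for (Map.Entry<String, Integer> e : replica.entrySet()) {
            if (live == limit) break;
            out.put(e.getKey(), e.getValue()); // tombstones ride along with the page
            if (e.getValue() != null) live++;
        }
        return out;
    }

    // Merge two replica pages, letting tombstones win, then apply the limit.
    static List<String> merge(Map<String, Integer> a, Map<String, Integer> b, int limit) {
        TreeSet<String> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        List<String> result = new ArrayList<>();
        for (String k : keys) {
            boolean tombstone = (a.containsKey(k) && a.get(k) == null)
                             || (b.containsKey(k) && b.get(k) == null);
            if (!tombstone && (a.get(k) != null || b.get(k) != null))
                result.add(k);
            if (result.size() == limit) break;
        }
        return result;
    }

    public static List<String> demo() {
        LinkedHashMap<String, Integer> r2 = new LinkedHashMap<>();
        r2.put("1", 1); r2.put("2", null); r2.put("3", null);
        r2.put("4", null); r2.put("5", null); r2.put("7", 7);
        LinkedHashMap<String, Integer> r3 = new LinkedHashMap<>();
        r3.put("1", 1); r3.put("2", 2); r3.put("3", 3);
        r3.put("4", 4); r3.put("5", 5); r3.put("6", 6);
        // r3's page ends at '2', so '6' is never seen by the coordinator.
        return merge(page(r2, 2), page(r3, 2), 2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 7] -- row '6' was skipped
    }
}
```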
[jira] [Created] (CASSANDRA-13703) Using min_compress_ratio <= 1 causes corruption
Branimir Lambov created CASSANDRA-13703: --- Summary: Using min_compress_ratio <= 1 causes corruption Key: CASSANDRA-13703 URL: https://issues.apache.org/jira/browse/CASSANDRA-13703 Project: Cassandra Issue Type: Bug Reporter: Branimir Lambov Assignee: Branimir Lambov Attachments: patch This is because chunks written uncompressed end up below the compressed size threshold. Demonstrated by applying the attached patch meant to improve the testing of the 10520 changes, and running {{CompressedSequentialWriterTest.testLZ4Writer}}. The default {{min_compress_ratio: 0}} is not affected as it never writes uncompressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
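The failure mode described above can be illustrated with a toy version of a length-based "is this chunk compressed?" test (illustration only, not Cassandra's actual reader code; the constant names are made up): a chunk stored uncompressed that happens to be short, such as a file's last partial chunk, falls below the compressed-size threshold and is misread as compressed bytes.

```java
// Toy illustration of the ambiguity behind CASSANDRA-13703 (hypothetical
// names, not the real CompressedSequentialWriter/reader code).
public class ChunkLengthAmbiguity {
    static final int CHUNK_SIZE = 64 * 1024;
    static final double MIN_COMPRESS_RATIO = 1.1;
    // Reader heuristic: anything stored shorter than this "must" be compressed.
    static final int MAX_COMPRESSED_LENGTH = (int) Math.ceil(CHUNK_SIZE / MIN_COMPRESS_RATIO);

    static boolean readerThinksCompressed(int storedLength) {
        return storedLength < MAX_COMPRESSED_LENGTH;
    }

    public static void main(String[] args) {
        // Last chunk of a small sstable, written as-is (uncompressed):
        int partialUncompressedChunk = 4096;
        // The reader would wrongly attempt to decompress these raw bytes.
        System.out.println(readerThinksCompressed(partialUncompressedChunk));
    }
}
```

The default `min_compress_ratio: 0` sidesteps this entirely because nothing is ever written uncompressed.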
[jira] [Commented] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094808#comment-16094808 ] Marcus Eriksson commented on CASSANDRA-13142: - bq. I'm having trouble identifying a way to get currently running compaction futures. Yeah, we don't keep the futures around, and it becomes quite a hack to actually do that. Maybe we should just not cancel view/index builds, and keep cancelling regular compactions like we do today? > Upgradesstables cancels compactions unnecessarily > - > > Key: CASSANDRA-13142 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13142 > Project: Cassandra > Issue Type: Bug >Reporter: Kurt Greaves >Assignee: Kurt Greaves > Attachments: 13142-v1.patch > > > Since at least 1.2 upgradesstables will cancel any compactions bar > validations when run. This was originally determined as a non-issue in > CASSANDRA-3430 however can be quite annoying (especially with STCS) as a > compaction will output the new version anyway. Furthermore, as per > CASSANDRA-12243 it also stops things like view builds and I assume secondary > index builds as well which is not ideal. > We should avoid cancelling compactions unnecessarily. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241 ] ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 2:43 PM: --- h3. Relation: base -> view First of all, I think all of us should agree on the cases in which a view row should exist. IMO, there are two main cases: 1. base pk and view pk are the same (order doesn't matter) and view has no filter conditions or only conditions on base pk. (filter condition means: {{c = 1}} in view's where clause. filter condition is not a concern here, since there is no previous view data to be cleared.) view row exists if any of the following is true: * a. base row pk has live livenessInfo(timestamp) and base row pk satisfies view's filter conditions if any. * b. or one of base row columns selected in view has live timestamp (via update) and base row pk satisfies view's filter conditions if any. this is handled by the existing mechanism of liveness and tombstones since all info is included in the view row * c. or one of base row columns not selected in view has live timestamp (via update) and base row pk satisfies view's filter conditions if any. Those unselected columns' timestamp/ttl/cell-deletion info is not currently stored on the view row. 2. base column used in view pk, or view has filter conditions on a base non-key column, either of which can also lead to the entire view row being wiped. view row exists if any of the following is true: * a. base row pk has live livenessInfo(timestamp) && base column used in view pk is not null but has no timestamp && conditions are satisfied. (pk having live livenessInfo means it is not deleted by a tombstone) * b. or base row column in view pk has timestamp (via update) && conditions are satisfied. eg. if the base column used in view pk is TTLed, the entire view row should be wiped. The next thing is to model "view's tombstone and livenessInfo" to maintain view data based on the above cases. h3.
Previous known issues: (I might miss some issues, feel free to ping me..) ttl * view row is not wiped when TTLed on base column used in view pk or TTLed on base non-key column with filter condition * cells with same timestamp: merging ttls is not deterministic. partial update on base columns not selected in view * it results in no view data. because of current update semantics, no view updates are generated * the corresponding view row's liveness does not depend on the liveness of base columns filter conditions or base column used in view pk causes * view row is shadowed after a few modifications on the base column used in view pk if the base non-key column has TS greater than base pk's ts and view key column's ts. (as mentioned by sylvain: we need to be able to re-insert an entry when a prior one had been deleted; need to be careful to handle timestamp ties) tombstone merging is not commutative * in current code, shadowable tombstone doesn't co-exist with regular tombstone sstabledump not supporting current shadowable tombstone h3. Model I can think of two ways to ship all required base column info to the view: * make base columns that are not selected in view into "virtual cells" and store their timestamp/ttl to the view without their actual values. so we can reuse the current ts/tb/ttl mechanism with additional validation logic to check if a view row is alive. * or store that info on the view's livenessInfo/deletion with additional merge logic. I will go ahead with the second way since there is an existing shadowable tombstone mechanism. View PrimaryKey LivenessInfo, its timestamp, payloads, merging {code} ColumnInfo: // generated from base column as it is. 0. timestamp 1. ttl 2. localDeletionTime: could be used to represent tombstone or TTLed, depending on whether there is a ttl supersedes(): if timestamps are different, greater timestamp supersedes; if timestamps are same, greater localDeletionTime supersedes. // if a normal column in base row has no timestamp (aka.
generated by Insert statement), when it is sent to the view, it still has no timestamp. // it will implicitly inherit ViewLivenessInfo just like how it works with standard LivenessInfo in a regular table, // unlike the shadowable mechanism which will explicitly put base pk's timestamp into not-updated base columns in view data to keep "select writetime" correct in the view. // (because in the shadowable mechanism, view's pk timestamp is promoted to a bigger value which cannot be used for writetime in the view) ViewLivenessInfo // corresponding to base pk livenessInfo 0. timestamp 1. ttl / localDeletionTime // base columns that are used in view pk or have a filter condition. // if any column is not live or doesn't exist, the entire view row is wiped. // if a column in base is filtered and not selected, it's stored here. 2. Map keyOrConditions;
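The supersedes() rule in the {code} sketch above can be written out directly. This is a standalone illustration of the described rule (the class here is a hypothetical stand-in, not the committed Cassandra ColumnInfo): higher timestamp wins; on a timestamp tie, higher localDeletionTime wins.

```java
// Sketch of the supersedes() rule described in the comment above
// (hypothetical names, not committed Cassandra code).
public class ColumnInfo {
    final long timestamp;
    final int localDeletionTime; // also encodes tombstone / TTL expiry

    ColumnInfo(long timestamp, int localDeletionTime) {
        this.timestamp = timestamp;
        this.localDeletionTime = localDeletionTime;
    }

    boolean supersedes(ColumnInfo other) {
        if (timestamp != other.timestamp)
            return timestamp > other.timestamp;          // greater timestamp wins
        return localDeletionTime > other.localDeletionTime; // tie-break on deletion time
    }
}
```

Breaking the timestamp tie deterministically is exactly what makes the merge commutative, addressing the "cells with same timestamp" issue listed earlier.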
[jira] [Updated] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-13700: -- Reviewer: Jason Brown > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. 
> I've reproduced in this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094756#comment-16094756 ] Romain Hardouin commented on CASSANDRA-13699: - I see random failures/errors in CircleCI > Allow to set batch_size_warn_threshold_in_kb via JMX > > > Key: CASSANDRA-13699 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13699 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Romain Hardouin >Priority: Minor > Fix For: 4.x > > Attachments: 13699-trunk.txt > > > We can set {{batch_size_fail_threshold_in_kb}} via JMX but not > {{batch_size_warn_threshold_in_kb}}. > The patch allows to set it dynamically and adds a INFO log for both > thresholds. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094748#comment-16094748 ] Jason Brown commented on CASSANDRA-13700: - We also might need to make {{HeartBeatState.version}} volatile, but I'm still thinking about it (just adding it here for discussion) > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. 
If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. > I've reproduced in this in several versions; so far, I believe this is > possible in all versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094744#comment-16094744 ] Jason Brown commented on CASSANDRA-13700: - [~jkni] Fantastic debugging here, Joel. We have seen this problem, as well, with missing STATUS and TOKENS entries. I followed this through, and I believe you are correct. Just to point out (because I had to dig and reason through it), the key problem is (as Joel points out) the shared mutable state of {{HeartBeatState}}. In {{Gossiper.getStateForVersionBiggerThan}}, when the local node is building up the {{Map}} of states about itself, if any states are added after the function returns *and* the heartbeat is incremented before serialization, the peer will get the updated heartbeat value but not the updated states (as the set of states for the local node that we're sending over was already constructed prior to serialization). Off the top of my head, I think there are at least two possible ways to fix this: - clone the {{HeartBeatState}} when constructing the {{EndpointState}} to return from {{Gossiper.getStateForVersionBiggerThan}}. That way it's not referencing mutable heartbeat state. - execute the {{GossipTask}} on the same thread on which we receive the gossip syn/ack/ack2 messages (on the {{Stage.GOSSIP}} thread). That way we force (almost) all references to gossip's shared mutable state into one thread. The first option is simpler, smaller in scope, and certainly safer. The second option has performance implications: if the {{GossipTask}} takes a while to execute, we could start backing up the tasks on the stage. This option, though, has the "possibility" of eliminating more of the state race bugs that we seem to continually uncover as time goes on. (Side note: there are still some updates to local Gossip state from the main thread (via {{StorageService}}) at startup, and the response to the {{EchoMessage}} is on the wrong thread, as well.)
Joel, can you share the method of how you are able to reproduce this? > Heartbeats can cause gossip information to go permanently missing on certain > nodes > -- > > Key: CASSANDRA-13700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13700 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Joel Knighton >Assignee: Joel Knighton >Priority: Critical > > In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} > from the corresponding {{EndpointState}} to the {{EndpointState}} to send. > When we're getting state for ourselves, this means that we add a reference to > the local {{HeartBeatState}}. Then, once we've built a message (in either the > Syn or Ack handler), we send it through the {{MessagingService}}. In the case > that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may > run before serialization of the Syn or Ack. This means that when the > {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the > {{HeartBeatState}} version of the local node as stored in the endpoint state > map. Then, when we finally serialize the Syn or Ack, we'll follow the > reference to the {{HeartBeatState}} and serialize it with a higher version > than we saw when constructing the Ack or Ack2. > Consider the case where we see {{HeartBeatState}} with version 4 when > constructing an Ack and send it through the {{MessagingService}}. Then, we > add some piece of state with version 5 to our local {{EndpointState}}. If > {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before > the {{MessageOut}} containing the Ack is serialized, the node receiving the > Ack will believe it is current to version 6, despite the fact that it has > never received a message containing the {{ApplicationState}} tagged with > version 5. > I've reproduced in this in several versions; so far, I believe this is > possible in all versions. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
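The first fix option discussed in the comments, cloning the {{HeartBeatState}} when building the {{EndpointState}} to send, can be sketched in miniature. The classes below are toy stand-ins for the real Gossiper types; the point is only the reference-vs-copy distinction:

```java
// Toy sketch (hypothetical classes, not the actual Gossiper code) of why a
// snapshot copy of HeartBeatState fixes the race: a GossipTask increment
// after message construction cannot leak a newer version into an
// already-built Syn/Ack.
public class HeartbeatSnapshot {
    static final class HeartBeatState {
        int generation;
        int version;
        HeartBeatState(int generation, int version) { this.generation = generation; this.version = version; }
        HeartBeatState copy() { return new HeartBeatState(generation, version); }
        void updateHeartBeat() { version++; } // what GossipTask does periodically
    }

    public static void main(String[] args) {
        HeartBeatState live = new HeartBeatState(1, 4);
        HeartBeatState shared = live;          // buggy: message holds the live reference
        HeartBeatState snapshot = live.copy(); // fix: message holds an immutable copy
        live.updateHeartBeat();                // GossipTask runs before serialization
        System.out.println(shared.version);    // advanced past the states actually included
        System.out.println(snapshot.version);  // matches what the message really contains
    }
}
```

With the reference, the serialized heartbeat version advances past the application states actually bundled in the message, which is exactly the gap Joel described.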
[jira] [Updated] (CASSANDRA-12148) Improve determinism of CDC data availability
[ https://issues.apache.org/jira/browse/CASSANDRA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12148: Status: Patch Available (was: In Progress) > Improve determinism of CDC data availability > > > Key: CASSANDRA-12148 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12148 > Project: Cassandra > Issue Type: Improvement >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie > Fix For: 4.x > > > The latency with which CDC data becomes available has a known limitation due > to our reliance on CommitLogSegments being discarded to have the data > available in cdc_raw: if a slowly written table co-habitates a > CommitLogSegment with CDC data, the CommitLogSegment won't be flushed until > we hit either memory pressure on memtables or CommitLog limit pressure. > Ultimately, this leaves a non-deterministic element to when data becomes > available for CDC consumption unless a consumer parses live CommitLogSegments. > To work around this limitation and make semi-realtime CDC consumption more > friendly to end-users, I propose we extend CDC as follows: > h6. High level: > * Consumers parse hard links of active CommitLogSegments in cdc_raw instead > of waiting for flush/discard and file move > * C* stores an offset of the highest seen CDC mutation in a separate idx file > per commit log segment in cdc_raw. Clients tail this index file, delta their > local last parsed offset on change, and parse the corresponding commit log > segment using their last parsed offset as min > * C* flags that index file with an offset and DONE when the file is flushed > so clients know when they can clean up > h6. 
Details: > * On creation of a CommitLogSegment, also hard-link the file in cdc_raw > * On first write of a CDC-enabled mutation to a segment, we: > ** Flag it as {{CDCState.CONTAINS}} > ** Set a long tracking the {{CommitLogPosition}} of the 1st CDC-enabled > mutation in the log > ** Set a long in the CommitLogSegment tracking the offset of the end of the > last written CDC mutation in the segment if higher than the previously known > highest CDC offset > * On subsequent writes to the segment, we update the offset of the highest > known CDC data > * On CommitLogSegment fsync, we write a file in cdc_raw as > _cdc.idx containing the min offset and end offset fsynced to > disk per file > * On segment discard, if CDCState == {{CDCState.PERMITTED}}, delete both the > segment in commitlog and in cdc_raw > * On segment discard, if CDCState == {{CDCState.CONTAINS}}, delete the > segment in commitlog and update the _cdc.idx file w/end offset > and a DONE marker > * On segment replay, store the highest end offset of seen CDC-enabled > mutations from a segment and write that to _cdc.idx on > completion of segment replay. This should bridge the potential correctness > gap of a node writing to a segment and then dying before it can write the > _cdc.idx file. > This should allow clients to skip the beginning of a file to the 1st CDC > mutation, track an offset of how far they've parsed, delta against the > _cdc.idx file end offset, and use that as a determinant on when to parse new > CDC data. Any existing clients written to the initial implementation of CDC > need only add the _cdc.idx logic and checking for DONE marker > to their code, so the burden on users to update to support this should be > quite small for the benefit of having data available as soon as it's fsynced > instead of at a non-deterministic time when potentially unrelated tables are > flushed. 
> Finally, we should look into extending the interface on CommitLogReader to be > more friendly for realtime parsing, perhaps supporting taking a > CommitLogDescriptor and RandomAccessReader and resuming readSection calls, > assuming the reader is at the start of a SyncSegment. Would probably also > need to rewind to the start of the segment before returning so subsequent > calls would respect this contract. This would skip needing to deserialize the > descriptor and all completed SyncSegments to get to the root of the desired > segment for parsing. > One alternative we discussed offline - instead of just storing the highest > seen CDC offset, we could instead store an offset per CDC mutation > (potentially delta encoded) in the idx file to allow clients to seek and only > parse the mutations with CDC enabled. My hunch is that the performance delta > from doing so wouldn't justify the complexity given the SyncSegment > deserialization and seeking restrictions in the compressed and encrypted > cases as mentioned above. > The only complication I can think of with the above design is uncompressed > mmapped CommitLogSegments on Windows being undeletable, but it'd be pretty > simple to
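The client side of the idx-file protocol described above can be sketched as a small poller. This is an illustration of the proposal only, not a shipped Cassandra API; the idx layout assumed here (first line is the latest fsynced end offset, optional second line "DONE") is a simplification, and a real consumer would re-read the segment's _cdc.idx file on a timer and feed its lines in:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a CDC consumer tailing an idx file per the
// proposal above (not a real Cassandra client API).
public class CdcIdxTailer {
    static final long FINISHED = -1;
    long lastParsedOffset = 0;

    // Returns how many new segment bytes are safe to parse, or FINISHED
    // once the DONE marker appears (segment flushed, client may clean up).
    long poll(List<String> idxLines) {
        long endOffset = Long.parseLong(idxLines.get(0).trim());
        long newBytes = Math.max(endOffset - lastParsedOffset, 0);
        if (newBytes > 0)
            lastParsedOffset = endOffset; // caller parses bytes [old, endOffset)
        boolean done = idxLines.size() > 1 && idxLines.get(1).trim().equals("DONE");
        return done ? FINISHED : newBytes;
    }

    public static void main(String[] args) {
        CdcIdxTailer t = new CdcIdxTailer();
        System.out.println(t.poll(Arrays.asList("1024")));         // 1024 new bytes to parse
        System.out.println(t.poll(Arrays.asList("1024")));         // 0: nothing new fsynced
        System.out.println(t.poll(Arrays.asList("2048", "DONE"))); // -1: segment done
    }
}
```

Deltaing the local parsed offset against the idx end offset like this is what lets a consumer see data as soon as it is fsynced, rather than waiting for segment discard.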
[jira] [Commented] (CASSANDRA-12148) Improve determinism of CDC data availability
[ https://issues.apache.org/jira/browse/CASSANDRA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094740#comment-16094740 ] Joshua McKenzie commented on CASSANDRA-12148: - Updated with some minor changes: * Rebased to trunk * Added replay logic tester * Tweaked signature in CommitLogReader and added missing handling in one of the replay signatures in CommitLogReplayer * Added CDCWriteException instead of using WriteTimeoutException I'm ambivalent about that last bullet-point. The coordinator will still roll things up in a WriteFailureException, so I'm not convinced it's worth the effort in the drivers to support a new exception type if we don't expose it to the client; however, it irritated me, so I added that. [cassandra branch|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:12148_rebase_trunk] [python driver branch|https://github.com/datastax/python-driver/compare/master...josh-mckenzie:12148] [dtest branch|https://github.com/riptano/cassandra-dtest/compare/master...josh-mckenzie:12148_style] testall, cdc-testing, and dtests look good on this branch. Only 1 failure on dtest and it's a known issue. > Improve determinism of CDC data availability > > > Key: CASSANDRA-12148 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12148 > Project: Cassandra > Issue Type: Improvement >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie > Fix For: 4.x > > > The latency with which CDC data becomes available has a known limitation due > to our reliance on CommitLogSegments being discarded to have the data > available in cdc_raw: if a slowly written table co-habitates a > CommitLogSegment with CDC data, the CommitLogSegment won't be flushed until > we hit either memory pressure on memtables or CommitLog limit pressure. > Ultimately, this leaves a non-deterministic element to when data becomes > available for CDC consumption unless a consumer parses live CommitLogSegments.
> To work around this limitation and make semi-realtime CDC consumption more > friendly to end-users, I propose we extend CDC as follows: > h6. High level: > * Consumers parse hard links of active CommitLogSegments in cdc_raw instead > of waiting for flush/discard and file move > * C* stores an offset of the highest seen CDC mutation in a separate idx file > per commit log segment in cdc_raw. Clients tail this index file, delta their > local last parsed offset on change, and parse the corresponding commit log > segment using their last parsed offset as min > * C* flags that index file with an offset and DONE when the file is flushed > so clients know when they can clean up > h6. Details: > * On creation of a CommitLogSegment, also hard-link the file in cdc_raw > * On first write of a CDC-enabled mutation to a segment, we: > ** Flag it as {{CDCState.CONTAINS}} > ** Set a long tracking the {{CommitLogPosition}} of the 1st CDC-enabled > mutation in the log > ** Set a long in the CommitLogSegment tracking the offset of the end of the > last written CDC mutation in the segment if higher than the previously known > highest CDC offset > * On subsequent writes to the segment, we update the offset of the highest > known CDC data > * On CommitLogSegment fsync, we write a file in cdc_raw as > _cdc.idx containing the min offset and end offset fsynced to > disk per file > * On segment discard, if CDCState == {{CDCState.PERMITTED}}, delete both the > segment in commitlog and in cdc_raw > * On segment discard, if CDCState == {{CDCState.CONTAINS}}, delete the > segment in commitlog and update the _cdc.idx file w/end offset > and a DONE marker > * On segment replay, store the highest end offset of seen CDC-enabled > mutations from a segment and write that to _cdc.idx on > completion of segment replay. This should bridge the potential correctness > gap of a node writing to a segment and then dying before it can write the > _cdc.idx file. 
> This should allow clients to skip the beginning of a file to the 1st CDC > mutation, track an offset of how far they've parsed, delta against the > _cdc.idx file end offset, and use that as a determinant on when to parse new > CDC data. Any existing clients written to the initial implementation of CDC > need only add the _cdc.idx logic and checking for DONE marker > to their code, so the burden on users to update to support this should be > quite small for the benefit of having data available as soon as it's fsynced > instead of at a non-deterministic time when potentially unrelated tables are > flushed. > Finally, we should look into extending the interface on CommitLogReader to be > more friendly for realtime parsing, perhaps supporting taking a > CommitLogDescriptor and RandomAccessReader and resuming readSection calls, > assuming the reader is at
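The client-side protocol proposed above (tail the per-segment `_cdc.idx` file, delta against the last parsed offset, stop when the DONE marker appears) can be sketched as follows. This is an illustrative sketch only: the class name `CdcIdxState` and the exact on-disk format of the idx file (first line the highest fsynced end offset, optional second line a `DONE` marker) are assumptions for the example, not the actual implementation.

```java
// Hypothetical sketch of a CDC consumer's view of a <segment>_cdc.idx file.
// Format assumed: line 1 = highest CDC end offset fsynced to disk,
// optional line 2 = "DONE" once the segment is flushed and finalized.
final class CdcIdxState {
    final long endOffset;   // highest CDC data offset known to be durable
    final boolean done;     // segment flushed; consumer may finish and clean up

    CdcIdxState(long endOffset, boolean done) {
        this.endOffset = endOffset;
        this.done = done;
    }

    /** Parse idx content such as "16384" or "32768\nDONE". */
    static CdcIdxState parse(String content) {
        String[] lines = content.trim().split("\\R");
        long offset = Long.parseLong(lines[0].trim());
        boolean done = lines.length > 1 && "DONE".equals(lines[1].trim());
        return new CdcIdxState(offset, done);
    }

    /** Re-parse the segment only when the durable end offset moved past what we've already consumed. */
    boolean hasNewData(long lastParsedOffset) {
        return endOffset > lastParsedOffset;
    }
}
```

A consumer would poll the idx file, call `hasNewData` with its last parsed offset, and read the commit log segment from that offset up to `endOffset`.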
[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241 ] ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 2:07 PM: ---

h3. Relation: base -> view

First of all, I think all of us should agree on the cases in which a view row should exist. IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and the view has no filter conditions, or only conditions on the base pk. (filter conditions are not a concern here, since there is no previous view data to be cleared) A view row exists if any of the following is true:
* a. the base row pk has a live livenessInfo (timestamp) and the base row pk satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp (via update) and the base row pk satisfies the view's filter conditions, if any. This is handled by the existing mechanism of liveness and tombstones, since all info is included in the view row.
* c. or one of the base row columns not selected in the view has a live timestamp (via update) and the base row pk satisfies the view's filter conditions, if any. Those unselected columns' timestamp/ttl/cell-deletion info is currently not stored in the view row.

2. a base column is used in the view pk, or the view has filter conditions on a base non-key column, which can also lead to the entire view row being wiped. A view row exists if any of the following is true:
* a. the base row pk has a live livenessInfo (timestamp) && the base column used in the view pk is not null but has no timestamp && the conditions are satisfied. (the pk having a live livenessInfo means it is not deleted by a tombstone)
* b. or the base row column in the view pk has a timestamp (via update) && the conditions are satisfied. e.g. if the base column used in the view pk is TTLed, the entire view row should be wiped.

The next thing is to model a "shadowable tombstone or shadowable liveness" to maintain view data based on the above cases.

h3. Previous known issues: (I might miss some issues, feel free to ping me..) 
ttl
* the view row is not wiped when a TTL expires on the base column used in the view pk, or on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic

partial update on base columns not selected in the view
* it results in no view data: because of current update semantics, no view updates are generated
* the corresponding view row's liveness does not depend on the liveness of base columns

filter conditions, or a base column used in the view pk
* the view row is shadowed after a few modifications on the base column used in the view pk, if the base non-key column has a TS greater than the base pk's ts and the view key column's ts (as mentioned by Sylvain: we need to be able to re-insert an entry when a prior one had been deleted, and need to be careful to handle timestamp ties)

tombstone merging is not commutative
* in the current code, a shadowable tombstone doesn't co-exist with a regular tombstone

sstabledump does not support the current shadowable tombstone

h3. Model

I can think of two ways to ship all required base column info to the view:
* make base columns that are not selected in the view "virtual cells" and store their timestamp/ttl in the view without their actual values, so we can reuse the current ts/tb/ttl mechanism with additional validation logic to check if a view row is alive.
* or store that info in the view's livenessInfo/deletion with additional merge logic.

I will go ahead with the second way, since there is an existing shadowable tombstone mechanism.

View PrimaryKey LivenessInfo, its timestamp, payloads, merging:
{code}
ColumnInfo: // generated from a base column
  0. timestamp
  1. ttl
  2. localDeletionTime: can represent a tombstone or a TTL expiry, depending on whether there is a ttl

  supersedes(): if the timestamps differ, the greater timestamp supersedes; if the timestamps are equal, the greater localDeletionTime supersedes.

ViewLivenessInfo // corresponding to the base pk livenessInfo
  0. timestamp
  1. ttl / localDeletionTime
  // base columns that are used in the view pk or have filter conditions.
  // if any such column is not live or doesn't exist, the entire view row is wiped.
  // if a base column is filtered but not selected, it's stored here.
  2. Map keyOrConditions;
  // if any column is live
  3. Map unselected;

  // to determine if a row is live
  isRowAlive(Deletion deletion):
    get the timestamp or ColumnInfo that is greater than those in Deletion
    if any column in {{keyOrConditions}} is TTLed, a tombstone (dead), or does not exist: false
    if {{timestamp or ttl}} are alive: true
    if any column in {{unselected}} is alive: true
    otherwise, check whether any columns in the view row are alive
    // cannot use supersedes, because timestamps can tie and we cannot compare keyOrConditions.
{code}
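The supersedes() rule stated in the pseudocode above (greater timestamp wins; on a timestamp tie, greater localDeletionTime wins) can be written as a small Java comparator. The class and field names mirror the pseudocode and are illustrative, not the actual Cassandra API:

```java
// Minimal sketch of the supersedes() reconciliation rule described above.
final class ColumnInfo {
    final long timestamp;
    final int ttl;
    final int localDeletionTime; // tombstone time, or TTL expiry time when ttl > 0

    ColumnInfo(long timestamp, int ttl, int localDeletionTime) {
        this.timestamp = timestamp;
        this.ttl = ttl;
        this.localDeletionTime = localDeletionTime;
    }

    /** Greater timestamp supersedes; on a timestamp tie, greater localDeletionTime supersedes. */
    boolean supersedes(ColumnInfo other) {
        if (timestamp != other.timestamp)
            return timestamp > other.timestamp;
        return localDeletionTime > other.localDeletionTime;
    }
}
```

The tie-breaking on localDeletionTime is exactly what makes the merge deterministic when two cells carry the same timestamp.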
[jira] [Updated] (CASSANDRA-13702) Error on keyspace create/alter if referencing non-existing DC in cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johnny Miller updated CASSANDRA-13702: -- Priority: Minor (was: Major) > Error on keyspace create/alter if referencing non-existing DC in cluster > > > Key: CASSANDRA-13702 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13702 > Project: Cassandra > Issue Type: Improvement >Reporter: Johnny Miller >Priority: Minor > > It is possible to create/alter a keyspace using NetworkTopologyStrategy and a > DC that does not exist. It would be great if this was validated to prevent > accidents. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13702) Error on keyspace create/alter if referencing non-existing DC in cluster
Johnny Miller created CASSANDRA-13702: - Summary: Error on keyspace create/alter if referencing non-existing DC in cluster Key: CASSANDRA-13702 URL: https://issues.apache.org/jira/browse/CASSANDRA-13702 Project: Cassandra Issue Type: Improvement Reporter: Johnny Miller It is possible to create/alter a keyspace using NetworkTopologyStrategy and a DC that does not exist. It would be great if this was validated to prevent accidents. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
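The validation the ticket asks for amounts to checking each datacenter named in the NetworkTopologyStrategy replication options against the datacenters the snitch actually knows about. A minimal sketch, assuming illustrative names (`NtsOptionsValidator` is hypothetical, not the actual Cassandra class):

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: reject replication options that name a DC
// not present in the cluster.
final class NtsOptionsValidator {
    static void validate(Map<String, String> replicationOptions, Set<String> knownDcs) {
        for (String dc : replicationOptions.keySet()) {
            // "class" (and, in newer versions, "replication_factor") are not DC names.
            if (dc.equals("class") || dc.equals("replication_factor"))
                continue;
            if (!knownDcs.contains(dc))
                throw new IllegalArgumentException("Unknown datacenter: " + dc);
        }
    }
}
```

Hooking such a check into CREATE/ALTER KEYSPACE statement validation would turn the silent misconfiguration into an immediate error.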
[jira] [Comment Edited] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094602#comment-16094602 ] Alex Petrov edited comment on CASSANDRA-13696 at 7/20/17 12:50 PM: --- I think we should also return a correct version from the hints service [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0], the same as we do in the commit log descriptor. This would also make the issue for same-version go away, and since it would make the service pick a different code path, I'd say it's also necessary to include it. WRT the patch itself, maybe it's better to just call {{resetCrc}} explicitly and still return null, like I did [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0#diff-cf15f9cac67d8b2f3e581129d617df16R242]? {{hint}} is a local variable, and setting it and carrying on makes the logic a bit harder to understand. For example, for me it was non-obvious that this boolean method would also do some buffer rewinding / state resetting under the hood. was (Author: ifesdjeen): I think we should also return a correct version from the hints service [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. 
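The "reset the CRC explicitly and return null" pattern suggested in the comment can be illustrated with a small sketch. This is not the actual HintsReader code; `HintFrameReader`, `readFrame`, and `resetCrc` here are hypothetical names chosen to show the shape of the control flow: on an unknown table the reader discards its checksum state and returns null to signal "skip this hint", instead of mutating a local variable inside a boolean method that also rewinds buffers under the hood.

```java
import java.util.zip.CRC32;

// Illustrative sketch of explicit CRC reset + null return for skipped frames.
final class HintFrameReader {
    private final CRC32 crc = new CRC32();

    /** Returns the decoded payload, or null when the table is unknown and the frame is skipped. */
    byte[] readFrame(byte[] payload, boolean tableKnown) {
        if (!tableKnown) {
            resetCrc();  // discard checksum state accumulated for the skipped frame
            return null; // caller treats null as "no hint here, keep going"
        }
        crc.update(payload);
        return payload;
    }

    private void resetCrc() {
        crc.reset();
    }
}
```

Making the reset an explicit call at the skip site keeps the state change visible to the reader of the code, which is the point of the suggestion.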
> Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > 
[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111] > Caused by: java.io.IOException: Digest mismatch exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) > ~[main/:na] > ... 16 common frames omitted > {noformat} > It causes multiple cassandra nodes stop [by > default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188]. >
[jira] [Resolved] (CASSANDRA-12685) Add retry to hints dispatcher
[ https://issues.apache.org/jira/browse/CASSANDRA-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski resolved CASSANDRA-12685. Resolution: Not A Problem Please reopen if you think this still is an issue. > Add retry to hints dispatcher > - > > Key: CASSANDRA-12685 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12685 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Dikang Gu >Assignee: Dikang Gu >Priority: Minor > Fix For: 4.x > > > Problem: I often see timeout in hints replay, I find there is no retry for > hints replay, I think it would be great to add some retry logic for timeout > exception. > {code} > 2016-09-20_07:32:01.16610 INFO 07:32:01 [HintedHandoff:3]: Started hinted > handoff for host: 859af100-5d45-42bd-92f5-2bc78822158b with IP: > /2401:db00:12:30d7:face:0:39:0 > 2016-09-20_07:58:49.29983 INFO 07:58:49 [HintedHandoff:3]: Timed out > replaying hints to /2401:db00:12:30d7:face:0:39:0; aborting (55040 delivered) > 2016-09-20_07:58:49.29984 INFO 07:58:49 [HintedHandoff:3]: Enqueuing flush > of hints: 15962349 (0%) on-heap, 2049808 (0%) off-heap > 2016-09-20_08:02:17.55072 INFO 08:02:17 [HintedHandoff:1]: Started hinted > handoff for host: 859af100-5d45-42bd-92f5-2bc78822158b with IP: > /2401:db00:12:30d7:face:0:39:0 > 2016-09-20_08:05:45.25723 INFO 08:05:45 [HintedHandoff:1]: Timed out > replaying hints to /2401:db00:12:30d7:face:0:39:0; aborting (7936 delivered) > 2016-09-20_08:05:45.25725 INFO 08:05:45 [HintedHandoff:1]: Enqueuing flush > of hints: 2301605 (0%) on-heap, 259744 (0%) off-heap > 2016-09-20_08:12:19.92910 INFO 08:12:19 [HintedHandoff:2]: Started hinted > handoff for host: 859af100-5d45-42bd-92f5-2bc78822158b with IP: > /2401:db00:12:30d7:face:0:39:0 > 2016-09-20_08:51:44.72191 INFO 08:51:44 [HintedHandoff:2]: Timed out > replaying hints to /2401:db00:12:30d7:face:0:39:0; aborting (83456 delivered) > {code} -- This message was sent by Atlassian JIRA 
(v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13016) log messages should include human readable sizes
[ https://issues.apache.org/jira/browse/CASSANDRA-13016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13016: --- Labels: lhf (was: ) > log messages should include human readable sizes > > > Key: CASSANDRA-13016 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13016 > Project: Cassandra > Issue Type: Improvement > Components: Observability >Reporter: Jon Haddad > Labels: lhf > > displaying bytes by itself is difficult to read when going through log > messages. we should add a human readable version in parens (10MB) after > displaying bytes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily
[ https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094602#comment-16094602 ] Alex Petrov commented on CASSANDRA-13696: - I think we should also return a correct version from the hints service [here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13696-3.0] same as we do in commit log descriptor. This also would make the issue for same-version go away, and since it would make the service to pick a different code path I'd say it's also necessary to include it. > Digest mismatch Exception if hints file has UnknownColumnFamily > --- > > Key: CASSANDRA-13696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13696 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Blocker > Fix For: 3.0.x, 3.11.x, 4.x > > > {noformat} > WARN [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - > Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - > table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints > ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 > HintsDispatchExecutor.java:234 - Failed to dispatch hints file > a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch > exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) > ~[main/:na] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) > ~[main/:na] > at > 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) > [main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) > [main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_111] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_111] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > [main/:na] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111] > Caused by: java.io.IOException: Digest mismatch exception > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) > ~[main/:na] > at > org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) > ~[main/:na] > ... 16 common frames omitted > {noformat} > It causes multiple cassandra nodes stop [by > default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188]. > Here is the reproduce steps on a 3 nodes cluster, RF=3: > 1. stop node1 > 2. send some data with quorum (or one), it will generate hints file on > node2/node3 > 3. drop the table > 4. start node1 > node2/node3 will report "corrupted hints file" and stop. 
The impact is very > bad for a large cluster, when it happens, almost all the nodes are down at > the same time and we have to remove all the hints files (which contain the > dropped table) to bring the node back. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
[ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094562#comment-16094562 ] Jeremiah Jordan commented on CASSANDRA-11223: - Sounds like we may want to revert this until it can be fixed? > Queries with LIMIT filtering on clustering columns can return less rows than > expected > - > > Key: CASSANDRA-11223 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11223 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0 > > > A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can > return less row than expected if the table has some static columns and some > of the partition have no rows matching b = 1. > The problem can be reproduced with the following unit test: > {code} > public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() > throws Throwable > { > createTable("CREATE TABLE %s (a int, b int, s int static, c int, > primary key (a, b))"); > for (int i = 0; i < 3; i++) > { > execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i); > for (int j = 0; j < 3; j++) > if (!(i == 0 && j == 1)) > execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", > i, j, i + j); > } > assertRows(execute("SELECT * FROM %s"), > row(1, 0, 1, 1), > row(1, 1, 1, 2), > row(1, 2, 1, 3), > row(0, 0, 0, 0), > row(0, 2, 0, 2), > row(2, 0, 2, 2), > row(2, 1, 2, 3), > row(2, 2, 2, 4)); > assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"), > row(1, 1, 1, 2), > row(2, 1, 2, 3)); > assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW > FILTERING"), > row(1, 1, 1, 2), > row(2, 1, 2, 3)); // < FAIL It returns only one > row because the static row of partition 0 is counted and filtered out in > SELECT statement > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional 
commands, e-mail: commits-h...@cassandra.apache.org
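The counting mistake behind the failing assertion can be sketched outside Cassandra. The following Python toy (names and structure are illustrative only, not the actual SliceQueryFilter/DataLimits code) counts a partition's lone static row toward the LIMIT even though the post-filter on b = 1 later discards it:

```python
# Toy model of the LIMIT bug: when a partition has no row matching
# b = 1, its static row is still counted against the limit, then
# filtered out afterwards, so fewer rows than LIMIT are returned.

def replica_scan(partitions, limit, count_static):
    """Scan partitions, returning rows matching b = 1; stop once
    `limit` rows have been *counted* (optionally counting lone
    static rows, which models the buggy behaviour)."""
    out, counted = [], 0
    for static_row, rows in partitions:
        matched = [r for r in rows if r["b"] == 1]
        if not matched and static_row is not None and count_static:
            counted += 1  # buggy: static row counted, then filtered out
        for r in matched:
            out.append(r)
            counted += 1
        if counted >= limit:
            break
    return out[:limit]

partitions = [
    ({"s": 0}, [{"b": 0}, {"b": 2}]),  # partition with no row matching b = 1
    ({"s": 1}, [{"b": 1}]),
    ({"s": 2}, [{"b": 1}]),
]

buggy = replica_scan(partitions, limit=2, count_static=True)
fixed = replica_scan(partitions, limit=2, count_static=False)
print(len(buggy), len(fixed))  # 1 2
```

With the static row counted, the scan stops one row early, mirroring the failing `LIMIT 2` assertion in the unit test above.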
[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete
[ https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925 ] ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 10:36 AM: | branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] | | [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | [pass|https://circleci.com/gh/jasonstack/cassandra/182] | bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test known | | [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]| [running|https://circleci.com/gh/jasonstack/cassandra/186] | running | | [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]| [pass|https://circleci.com/gh/jasonstack/cassandra/181] | running | | [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]| [pass|https://circleci.com/gh/jasonstack/cassandra/185] | running | when no local range && node has joined token ring, clean up will remove all base local sstables. was (Author: jasonstack): | branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] | | [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | running | running | | [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]| running | running | | [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]| running | running | | [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]| running | running | when no local range && node has joined token ring, clean up will remove all base local sstables. 
> nodetool cleanup on KS with no replicas should remove old data, not silently > complete > - > > Key: CASSANDRA-13526 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13526 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jeff Jirsa >Assignee: ZhaoYang > Labels: usability > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x > > > From the user list: > https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E > If you have a multi-dc cluster, but some keyspaces not replicated to a given > DC, you'll be unable to run cleanup on those keyspaces in that DC, because > [the cleanup code will see no ranges and exit > early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441] -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
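The behaviour change under discussion can be sketched as follows; this is a hedged toy model under assumed names, not Cassandra's actual CompactionManager API:

```python
# Sketch of the proposed cleanup change: previously, a node owning no
# ranges for the keyspace returned early and silently kept stale data;
# the fix instead removes all of the table's local SSTables (provided
# the node has joined the token ring).

def cleanup(local_ranges, sstables, has_joined_ring, fixed=True):
    if not local_ranges:
        if fixed and has_joined_ring:
            return []  # node owns nothing for this keyspace: drop everything
        return list(sstables)  # old behaviour: exit early, keep old data
    # normal path: keep only data intersecting owned ranges (elided)
    return list(sstables)

sstables = ["mb-1-big-Data.db", "mb-2-big-Data.db"]
old = cleanup([], sstables, has_joined_ring=True, fixed=False)
new = cleanup([], sstables, has_joined_ring=True, fixed=True)
print(old, new)
```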
[jira] [Commented] (CASSANDRA-12373) 3.0 breaks CQL compatibility with super columns families
[ https://issues.apache.org/jira/browse/CASSANDRA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094434#comment-16094434 ] Alex Petrov commented on CASSANDRA-12373: - Rebased on top of {{3.0}} and {{3.11}}. Since we're doing this patch in preparation for 4.0, where there'll be no thrift, super column families or compact tables, we do not need a trunk patch (only removing the last bits of super column families and compact tables from the code if there are any). |[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...ifesdjeen:12373-3.0]|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...ifesdjeen:12373-3.11]|[dtest|https://github.com/riptano/cassandra-dtest/compare/master...ifesdjeen:12373]| The {{3.0}} and {{3.11}} patches are quite similar but not exactly the same. In 3.0 there are fewer tests (due to the missing features) and there was a difference in {{SelectStatement}}, since {{processPartitions}} is called from two places there. Not sure if we needed to abstract/hide it. CI results, including upgrade tests, look good. > 3.0 breaks CQL compatibility with super columns families > > > Key: CASSANDRA-12373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12373 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Sylvain Lebresne >Assignee: Alex Petrov > Fix For: 3.0.x, 3.11.x > > > This is a follow-up to CASSANDRA-12335 to fix the CQL side of super column > compatibility. > The details and a proposed solution can be found in the comments of > CASSANDRA-12335, but the crux of the issue is that super column families show > up differently in CQL in 3.0.x/3.x compared to 2.x, hence breaking backward > compatibility.
[jira] [Updated] (CASSANDRA-12373) 3.0 breaks CQL compatibility with super columns families
[ https://issues.apache.org/jira/browse/CASSANDRA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-12373: Fix Version/s: 3.11.x Status: Patch Available (was: Open)
[jira] [Comment Edited] (CASSANDRA-12373) 3.0 breaks CQL compatibility with super columns families
[ https://issues.apache.org/jira/browse/CASSANDRA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090121#comment-16090121 ] Alex Petrov edited comment on CASSANDRA-12373 at 7/20/17 9:32 AM: -- I'll rebase on top of the latest branches as it seems like the patch has gotten a bit out of date. was (Author: ifesdjeen): I'll rebase on top of the latest trunk as it seems like the patch has gotten a bit out of date.
[jira] [Commented] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
[ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094408#comment-16094408 ] Stefania commented on CASSANDRA-11223: -- I don't think it's correct to always return false in [ClusteringIndexNamesFilter.selectsAllPartition()|https://github.com/stef1927/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java#L75]. It's existing code, but with this patch applied we are no longer able to count rows for tables of the form {{CREATE TABLE %s (k int, v int, PRIMARY KEY (k) ) WITH COMPACT STORAGE}}. We don't notice in the tests because we trim the results in {{SelectStatement}}, but it does mean that we return too much data replica side in this cases. I noticed because of timeouts with large range queries on tables created by cassandra-stress. Here is a [test|https://github.com/apache/cassandra/compare/trunk...stef1927:11223-3.0] for 3.0 that reproduces the problem: {code} @Test public void testLimitInStaticTable() throws Throwable { createTable("CREATE TABLE %s (k int, v int, PRIMARY KEY (k) ) WITH COMPACT STORAGE "); for (int i = 0; i < 10; i++) execute("INSERT INTO %s(k, v) VALUES (?, ?)", i, i); assertRows(execute("SELECT * FROM %s LIMIT 5"), row(0, 0), row(1, 1), row(2, 2), row(3, 3), row(4, 4)); } {code} If we temporarily comment out {{cqlRows.trim(userLimit);}} in {{SelectStatement.process()}}, then the test only passes if we return {{clusterings.isEmpty()}} from {{ClusteringIndexNamesFilter.selectsAllPartition}}. However, note that I am not 100% sure this approach is correct. Once you are back from holiday, could you take a look [~blerer]? 
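The `selectsAllPartition()` question above can be modelled with a toy sketch (illustrative Python, not the Java `ClusteringIndexNamesFilter`): a names filter with no clustering values restricts nothing, so it effectively selects the whole partition, and answering {{false}} unconditionally stops the replica from applying the limit for static compact tables.

```python
# Toy model: whether the replica applies the limit depends on the
# filter reporting that it selects the entire partition.

class NamesFilter:
    def __init__(self, clusterings):
        self.clusterings = clusterings

    def selects_all_partition(self):
        # proposed behaviour; the pre-patch code always returned False here
        return len(self.clusterings) == 0

def replica_rows(filter_, rows, limit):
    if filter_.selects_all_partition():
        return rows[:limit]  # limit honoured replica-side
    return rows              # everything shipped; trimmed on the coordinator

rows = [(k, k) for k in range(10)]
shipped = replica_rows(NamesFilter([]), rows, 5)
print(len(shipped))  # 5 with the proposed fix; 10 with an always-False answer
```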
[jira] [Reopened] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
[ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania reopened CASSANDRA-11223: --
[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-13699: Attachment: 13699-trunk.txt Added CHANGES.txt entry and updated commit message > Allow to set batch_size_warn_threshold_in_kb via JMX > > > Key: CASSANDRA-13699 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13699 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Romain Hardouin >Priority: Minor > Fix For: 4.x > > Attachments: 13699-trunk.txt > > > We can set {{batch_size_fail_threshold_in_kb}} via JMX but not > {{batch_size_warn_threshold_in_kb}}. > The patch allows setting it dynamically and adds an INFO log for both > thresholds.
[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-13699: Attachment: (was: 13699-trunk.txt)
[jira] [Commented] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094389#comment-16094389 ] Tim Reeves commented on CASSANDRA-13694: Thanks for the prompt responses - looks good! > sstabledump does not show full precision of timestamp columns > - > > Key: CASSANDRA-13694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13694 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: Ubuntu 16.04 LTS >Reporter: Tim Reeves > Labels: patch-available > Fix For: 3.7 > > Attachments: CASSANDRA-13694-after-review.patch, CASSANDRA-13694.patch > > > Create a table: > CREATE TABLE test_table ( > unit_no bigint, > event_code text, > active_time timestamp, > ack_time timestamp, > PRIMARY KEY ((unit_no, event_code), active_time) > ) WITH CLUSTERING ORDER BY (active_time DESC) > Insert a row: > INSERT INTO test_table (unit_no, event_code, active_time, ack_time) > VALUES (1234, 'TEST EVENT', toTimestamp(now()), > toTimestamp(now())); > Verify that it is in the database with a full timestamp: > cqlsh:pentaho> select * from test_table; > unit_no | event_code | active_time | ack_time > -++-+- > 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+ | 2017-07-14 > 14:52:39.919000+ > (1 rows) > Write file: > nodetool flush > nodetool compact pentaho > Use sstabledump: > treeves@ubuntu:~$ sstabledump > /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db > [ > { > "partition" : { > "key" : [ "1234", "TEST EVENT" ], > "position" : 0 > }, > "rows" : [ > { > "type" : "row", > "position" : 38, > "clustering" : [ "2017-07-14 15:52+0100" ], > "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" }, > "cells" : [ > { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" } > ] > } > ] > } > ] > treeves@ubuntu:~$ > The timestamp in the cluster key, and the regular column, are both truncated > to the minute. 
[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-13699: Attachment: (was: 13699-trunk.txt)
[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-13699: Attachment: 13699-trunk.txt
[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Barala updated CASSANDRA-13694: - Attachment: CASSANDRA-13694-after-review.patch
[jira] [Commented] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094372#comment-16094372 ] Varun Barala commented on CASSANDRA-13694: -- [~jjirsa] Thanks for the review. I totally agree with you. In the second patch, I exposed a new function, {{AbstractType#getStringHandlesTimestamp}}, which will only be used by {{JsonTransformer}}. Please have a look. Thanks!
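The contrast between the two renderings in the bug report can be illustrated with standard-library Python (this is not the Cassandra code path, just the two output formats side by side):

```python
# sstabledump was rendering timestamps in a minute-precision,
# human-readable form; the fix preserves sub-second precision.

from datetime import datetime, timezone

ts = datetime(2017, 7, 14, 14, 52, 39, 919000, tzinfo=timezone.utc)

truncated = ts.strftime("%Y-%m-%d %H:%M%z")       # minute precision, as reported
full = ts.isoformat(timespec="microseconds")       # full precision

print(truncated)  # 2017-07-14 14:52+0000
print(full)       # 2017-07-14T14:52:39.919000+00:00
```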
[jira] [Updated] (CASSANDRA-13648) Upgrade metrics to 3.1.5
[ https://issues.apache.org/jira/browse/CASSANDRA-13648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13648: --- Attachment: metrics-logback-3.1.5.jar.asc metrics-jvm-3.1.5.jar.asc metrics-core-3.1.5.jar.asc > Upgrade metrics to 3.1.5 > > > Key: CASSANDRA-13648 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13648 > Project: Cassandra > Issue Type: Bug > Components: Libraries >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 4.x > > Attachments: metrics-core-3.1.5.jar.asc, metrics-jvm-3.1.5.jar.asc, > metrics-logback-3.1.5.jar.asc > > > GH PR #123 indicates that metrics 3.1.5 will fix a reconnect bug: > https://github.com/apache/cassandra/pull/123 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13648) Upgrade metrics to 3.1.5
[ https://issues.apache.org/jira/browse/CASSANDRA-13648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094324#comment-16094324 ] Stefan Podkowinski commented on CASSANDRA-13648: Looks like dtest needs a rebuild. I also think you forgot to move the corresponding lib/license files. Did you omit the CHANGES.txt update intentionally?
[jira] [Resolved] (CASSANDRA-13656) Change default start_native_transport configuration option
[ https://issues.apache.org/jira/browse/CASSANDRA-13656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski resolved CASSANDRA-13656. Resolution: Fixed Fix Version/s: (was: 4.x) 4.0 Merged as 12d4e2f189fb22825 with added CHANGES.txt entry. > Change default start_native_transport configuration option > -- > > Key: CASSANDRA-13656 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13656 > Project: Cassandra > Issue Type: Wish > Components: Configuration >Reporter: Tomas Repik >Assignee: Tomas Repik >Priority: Trivial > Fix For: 4.0 > > Attachments: update_default_config.patch, update_default_config.patch > > > When you don't specify the start_native_transport option in the > cassandra.yaml config file the default value is set to false. So far I did > not find any good reason for setting it this way so I'm proposing to set it > to true as default. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13656) Change default start_native_transport configuration option
[ https://issues.apache.org/jira/browse/CASSANDRA-13656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091539#comment-16091539 ] Stefan Podkowinski edited comment on CASSANDRA-13656 at 7/20/17 7:44 AM: - * [branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13656] * [testall|https://circleci.com/gh/spodkowinski/cassandra/80] * [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/137/] was (Author: spo...@gmail.com): * [branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13656] * [testall|https://circleci.com/gh/spodkowinski/cassandra/80] * [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/132/]
cassandra git commit: Change default start_native_transport to true and remove from jvm.options
Repository: cassandra Updated Branches: refs/heads/trunk fd0657140 -> 12d4e2f18 Change default start_native_transport to true and remove from jvm.options patch by Tomas Repik; reviewed by Stefan Podkowinski for CASSANDRA-13656 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/12d4e2f1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/12d4e2f1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/12d4e2f1 Branch: refs/heads/trunk Commit: 12d4e2f189fb228250edc876963d0c74b5ab0d4f Parents: fd06571 Author: Tomas RepikAuthored: Tue Jul 18 14:12:19 2017 +0200 Committer: Stefan Podkowinski Committed: Thu Jul 20 09:43:02 2017 +0200 -- CHANGES.txt | 1 + conf/jvm.options | 3 --- src/java/org/apache/cassandra/config/Config.java | 2 +- 3 files changed, 2 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/12d4e2f1/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 3fb2716..80d82dd 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Default for start_native_transport now true if not set in config (CASSANDRA-13656) * Don't add localhost to the graph when calculating where to stream from (CASSANDRA-13583) * Allow skipping equality-restricted clustering columns in ORDER BY clause (CASSANDRA-10271) * Use common nowInSec for validation compactions (CASSANDRA-13671) http://git-wip-us.apache.org/repos/asf/cassandra/blob/12d4e2f1/conf/jvm.options -- diff --git a/conf/jvm.options b/conf/jvm.options index 398b52f..17376d6 100644 --- a/conf/jvm.options +++ b/conf/jvm.options @@ -56,9 +56,6 @@ # Set the SSL port for encrypted communication. (Default: 7001) #-Dcassandra.ssl_storage_port=port -# Enable or disable the native transport server. See start_native_transport in cassandra.yaml. -# cassandra.start_native_transport=true|false - # Set the port for inter-node communication. 
(Default: 7000) #-Dcassandra.storage_port=port http://git-wip-us.apache.org/repos/asf/cassandra/blob/12d4e2f1/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index b7bacde..22f3551 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -142,7 +142,7 @@ public class Config public int internode_send_buff_size_in_bytes = 0; public int internode_recv_buff_size_in_bytes = 0; -public boolean start_native_transport = false; +public boolean start_native_transport = true; public int native_transport_port = 9042; public Integer native_transport_port_ssl = null; public int native_transport_max_threads = 128; - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
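The net effect of the commit above is just the in-code default used when cassandra.yaml omits the key. A minimal sketch (a toy config loader, not Cassandra's actual YAML machinery):

```python
# After the commit, a config that never mentions start_native_transport
# gets the native transport enabled; explicit settings still win.

DEFAULTS = {"start_native_transport": True}  # was False before CASSANDRA-13656

def load_config(user_settings):
    cfg = dict(DEFAULTS)
    cfg.update(user_settings)
    return cfg

print(load_config({})["start_native_transport"])                            # True
print(load_config({"start_native_transport": False})["start_native_transport"])  # False
```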
[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094272#comment-16094272 ] ZhaoYang commented on CASSANDRA-11500: -- Every livenessInfo or row deletion in the MV will be a ViewLivenessInfo or ViewDeletion carrying some extra details to check whether the view row is still alive. The shadowable mechanism is not used (a single flag is not sufficient and, in the proposal, we don't need to bring back the columns shadowed by a shadowable tombstone). > Obsolete MV entry may not be properly deleted > - > > Key: CASSANDRA-11500 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11500 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views >Reporter: Sylvain Lebresne >Assignee: ZhaoYang > > When a Materialized View uses a non-PK base table column in its PK, if an > update changes that column value, we add the new view entry and remove the > old one. When doing that removal, the current code uses the same timestamp > as for the liveness info of the new entry, which is the max timestamp for > any columns participating in the view PK.
> This is not correct for the deletion, as the old view entry could have
> other columns with a higher timestamp which won't be deleted, as is easily
> shown by the failure of the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries: the old (invalid) one and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp
> in the old view entry (which we know since we read the pre-existing base
> row), and that is what CASSANDRA-11475 does (the test above thus doesn't
> fail on that branch).
> Unfortunately, even then we can still have problems if further updates
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not
> general enough: we need to be able to re-insert an entry when a prior one
> has been deleted before, but we can't rely on timestamps being strictly
> bigger on the re-insert. In that sense, this can be thought of as a
> similar problem to CASSANDRA-10965, though the solution there of a single
> flag is not enough since we can have to replace more than once.
> I think the proper solution would be to ship enough information to always
> be able to decide when a view deletion is shadowed. This means that both
> the liveness info (for updates) and the shadowable deletion would need to
> ship the timestamp of any base table column that is part of the view PK
> (so {{a}} in the examples above). It's doable (and not that hard, really),
> but it does require a change to the sstable and intra-node protocol, which
> makes this a bit painful right now.
> But I'll also note that, as CASSANDRA-1096 shows, the timestamp is not
> even enough, since on equal timestamps the value can be the deciding
> factor. So in theory we'd have to ship the value of those columns (in the
> case of a deletion at least, since we have it in the view PK for updates).
> That said, on that last problem, my preference would be that we start
> prioritizing CASSANDRA-6123 seriously so we don't have to care about
> conflicting timestamps anymore, which would make this problem go away.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
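The first {noformat} example above can be simulated outside Cassandra with a toy last-write-wins model (a hypothetical sketch, not Cassandra internals): each view cell keeps its own timestamp, and a row deletion only shadows cells whose timestamps are less than or equal to the deletion timestamp. Using the new entry's liveness timestamp (2) to delete the old entry leaves the b=2 cell (timestamp 4) alive, which is why the stale entry resurfaces:

```python
# Toy last-write-wins model of the MV timestamp bug (hypothetical sketch,
# not Cassandra code). A cell is (value, timestamp); a row deletion
# shadows only cells whose timestamp is <= the deletion timestamp.

def apply_deletion(row, deletion_ts):
    """Drop every cell the deletion shadows; keep strictly newer cells alive."""
    return {col: (val, ts) for col, (val, ts) in row.items() if ts > deletion_ts}

# View row for PK (k=1, a=1), built from the base-table updates:
#   INSERT ... VALUES (1, 1, 1) USING TIMESTAMP 0  -> b=1 @ ts 0
#   UPDATE ... USING TIMESTAMP 4 SET b = 2         -> b=2 @ ts 4
old_entry = {"b": (2, 4)}

# UPDATE ... USING TIMESTAMP 2 SET a = 2 moves the row to a new view PK.
# Buggy behaviour: delete the old entry with the NEW entry's liveness
# timestamp (2, the max timestamp of columns participating in the view PK).
assert apply_deletion(old_entry, deletion_ts=2) == {"b": (2, 4)}  # b survives: stale entry

# Fix (CASSANDRA-11475): delete with the biggest timestamp in the old
# entry, known because the pre-existing base row was read.
assert apply_deletion(old_entry, deletion_ts=4) == {}  # old entry fully removed
```

This is only the first half of the problem; the re-insert scenario in the second {noformat} block is what defeats per-entry timestamps entirely.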
[jira] [Commented] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094253#comment-16094253 ]

Kurt Greaves commented on CASSANDRA-13142:
--

I was afraid someone would say that.

> Upgradesstables cancels compactions unnecessarily
> -
>
> Key: CASSANDRA-13142
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13142
> Project: Cassandra
> Issue Type: Bug
> Reporter: Kurt Greaves
> Assignee: Kurt Greaves
> Attachments: 13142-v1.patch
>
> Since at least 1.2, upgradesstables will cancel any compactions bar
> validations when run. This was originally deemed a non-issue in
> CASSANDRA-3430, but it can be quite annoying (especially with STCS), as a
> compaction will output the new version anyway. Furthermore, as per
> CASSANDRA-12243, it also stops things like view builds, and I assume
> secondary index builds as well, which is not ideal.
> We should avoid cancelling compactions unnecessarily.
[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094252#comment-16094252 ]

Kurt Greaves commented on CASSANDRA-11500:
--

Yep, have been. I was just hoping there was some code I could pin it to to make things clearer; as I'm sure you're aware, it's hard to figure out all the edge cases unless you actively try them. Not a big deal, I'll keep an eye out for when you have a branch ready. Your proposal looks good and seems to cover all the cases I can think of (but there are so many I'm sure I've forgotten some). With this change in place, would all deletions in views be represented as a ViewTombstone? My understanding is that you're essentially combining normal tombstones and shadowables to create the ViewTombstone, with a few extra details to catch the edge cases. Is that right?

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
> Issue Type: Bug
> Components: Materialized Views
> Reporter: Sylvain Lebresne
> Assignee: ZhaoYang
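The single-flag limitation discussed in this thread can also be illustrated with a toy model (again a hypothetical sketch, not Cassandra internals). Under plain last-write-wins reconciliation, a re-insert survives a deletion only if its timestamp is strictly bigger, which the a=1 -> a=2 -> a=1 flips in the second {noformat} example violate:

```python
# Toy model of why a single "shadowable" flag is not enough (hypothetical
# sketch, not Cassandra internals). Under plain last-write-wins, a row
# deletion shadows any insert with a timestamp <= its own.

def alive_plain(insert_ts, deletion_ts):
    """Plain LWW: the insert survives only with a strictly bigger timestamp."""
    return insert_ts > deletion_ts

# Flipping a from 1 to 2 deletes a=1's view entry with ts 10 (the max
# timestamp of the old entry); flipping back to a=1 re-inserts at base ts 3.
assert not alive_plain(insert_ts=3, deletion_ts=10)  # re-insert wrongly shadowed

# A shadowable deletion fixes one round: any later live re-insert of the
# row supersedes it regardless of timestamp. But flipping a again (to 2 at
# ts 4, back to 1 at ts 5) means superseding a SECOND deletion, so a single
# "was superseded" flag cannot encode the history. Hence the proposal to
# ship the timestamps of the base columns in the view PK with each deletion
# and liveness info, so every round can be decided independently.
assert alive_plain(insert_ts=11, deletion_ts=10)  # LWW only works when ts strictly grows
```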