[jira] [Resolved] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
[ https://issues.apache.org/jira/browse/CASSANDRA-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-11222. -- Resolution: Not A Problem This is really for the datanucleus project to fix so closing. > datanucleus-cassandra won't work with cassandra 3.0 system.* metadata. > -- > > Key: CASSANDRA-11222 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11222 > Project: Cassandra > Issue Type: Wish > Components: CQL > Environment: Java JDO >Reporter: Rafael Sanches >Priority: Minor > Labels: newbie > Fix For: 3.0.x > > > Hi, > I'm starting a new project and was hoping to upgrade directly to cassandra > 3.0, so it would save us a migration from 2.2 later on. > Unfortunately, the datanucleus-cassandra-5.0.0-m1 (latest) don't support the > 3.0 data model. > Errors like these will appear because of JDO: > https://issues.apache.org/jira/browse/CASSANDRA-10996 > To be more specific, this class does things like: > StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM > system.schema_keyspaces WHERE keyspace_name=?;"); > https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java > It doesn't seem like the Datanucleus guys are looking to fix this, since the > last update on datanucleus-cassandra was on 2014. I will open an issue there > too. Hope can reach contributors from both places. > I guess opening an issue here is more a "heads up", because more developers > will waste time on this soon. > thanks > rafa -- This message was sent by Atlassian JIRA (v6.3.4#6332)
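For context on the query quoted above: Cassandra 3.0 moved schema metadata out of system.schema_keyspaces into the system_schema keyspace, so a client that hard-codes the old table fails against 3.0. A minimal sketch of how a client could choose the query by server version follows. The class and method names are hypothetical and are not taken from datanucleus-cassandra; the version string is assumed to look like the release_version column of system.local (for example "3.0.3").

```java
// Illustrative sketch only: SchemaQueries and keyspaceExistsQuery are
// hypothetical names, not part of datanucleus-cassandra or any driver.
final class SchemaQueries
{
    // Cassandra 2.x keeps keyspace metadata in system.schema_keyspaces.
    static final String LEGACY_QUERY =
        "SELECT keyspace_name FROM system.schema_keyspaces WHERE keyspace_name=?;";

    // Cassandra 3.0 and later keep it in system_schema.keyspaces.
    static final String V3_QUERY =
        "SELECT keyspace_name FROM system_schema.keyspaces WHERE keyspace_name=?;";

    // releaseVersion is assumed to be a dotted version such as "2.2.5" or "3.0.3".
    static String keyspaceExistsQuery(String releaseVersion)
    {
        int major = Integer.parseInt(releaseVersion.split("\\.")[0]);
        return major >= 3 ? V3_QUERY : LEGACY_QUERY;
    }
}
```

Calling SchemaQueries.keyspaceExistsQuery("3.0.3") selects the system_schema.keyspaces form, while "2.2.5" keeps the legacy table.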
[jira] [Updated] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
[ https://issues.apache.org/jira/browse/CASSANDRA-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-11222: - Priority: Minor (was: Blocker) > datanucleus-cassandra won't work with cassandra 3.0 system.* metadata. > -- > > Key: CASSANDRA-11222 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11222 > Project: Cassandra > Issue Type: Wish > Components: CQL > Environment: Java JDO >Reporter: Rafael Sanches >Priority: Minor > Labels: newbie > Fix For: 3.0.x > > > Hi, > I'm starting a new project and was hoping to upgrade directly to cassandra > 3.0, so it would save us a migration from 2.2 later on. > Unfortunately, the datanucleus-cassandra-5.0.0-m1 (latest) don't support the > 3.0 data model. > Errors like these will appear because of JDO: > https://issues.apache.org/jira/browse/CASSANDRA-10996 > To be more specific, this class does things like: > StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM > system.schema_keyspaces WHERE keyspace_name=?;"); > https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java > It doesn't seem like the Datanucleus guys are looking to fix this, since the > last update on datanucleus-cassandra was on 2014. I will open an issue there > too. Hope can reach contributors from both places. > I guess opening an issue here is more a "heads up", because more developers > will waste time on this soon. > thanks > rafa -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160323#comment-15160323 ] Jeff Jirsa edited comment on CASSANDRA-11209 at 2/24/16 7:44 AM: - Similar to CASSANDRA-10510 as well ? was (Author: jjirsa): Similar to CASSANDRA-10510 as well > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez >Assignee: Marcus Eriksson > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160323#comment-15160323 ] Jeff Jirsa commented on CASSANDRA-11209: Similar to CASSANDRA-10510 as well > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez >Assignee: Marcus Eriksson > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM
[ https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9303: Labels: doc-impacting (was: ) > Match cassandra-loader options in COPY FROM > --- > > Key: CASSANDRA-9303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9303 > Project: Cassandra > Issue Type: New Feature > Components: Tools >Reporter: Jonathan Ellis >Assignee: Stefania >Priority: Critical > Labels: doc-impacting > Fix For: 2.1.13, 2.2.5, 3.0.3, 3.2 > > Attachments: dtest.out > > > https://github.com/brianmhess/cassandra-loader added a bunch of options to > handle real world requirements, we should match those. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reopened CASSANDRA-11209: - Assignee: Marcus Eriksson or maybe not... need more tests > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez >Assignee: Marcus Eriksson > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9043) Improve COPY command to work with Counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9043: Labels: doc-impacting lhf (was: lhf) > Improve COPY command to work with Counter columns > - > > Key: CASSANDRA-9043 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9043 > Project: Cassandra > Issue Type: Improvement >Reporter: Sebastian Estevez >Assignee: ZhaoYang >Priority: Minor > Labels: doc-impacting, lhf > Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1 > > Attachments: CASSANDRA-9043-2.1.8.patch, CASSANDRA-9043-trunk.patch > > > Noticed today that the copy command doesn't work with counter column tables. > This makes sense given that we need to use UPDATE instead of INSERT with > counters. > Given that we're making improvements in the COPY command in 3.0 with > CASSANDRA-7405, can we also tweak it to work with counters? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
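The point in the description, that counter tables need UPDATE rather than INSERT, can be sketched as a statement builder. cqlsh itself is written in Python, so this Java fragment only illustrates the shape of the two statements; CopyStatements and importStatement are hypothetical names and are not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: hypothetical helper, not the cqlsh COPY implementation.
final class CopyStatements
{
    // Builds the prepared-statement text used to import one CSV row.
    static String importStatement(String table,
                                  List<String> keyColumns,
                                  List<String> valueColumns,
                                  boolean counterTable)
    {
        if (counterTable)
        {
            // Counters cannot be INSERTed; each value is applied as an increment.
            String sets = valueColumns.stream()
                                      .map(c -> c + " = " + c + " + ?")
                                      .collect(Collectors.joining(", "));
            String where = keyColumns.stream()
                                     .map(c -> c + " = ?")
                                     .collect(Collectors.joining(" AND "));
            return "UPDATE " + table + " SET " + sets + " WHERE " + where;
        }

        // Regular tables take a plain INSERT with one bind marker per column.
        List<String> all = new ArrayList<>(keyColumns);
        all.addAll(valueColumns);
        String marks = all.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + String.join(", ", all) + ") VALUES (" + marks + ")";
    }
}
```

Note that the bind order differs between the two forms: the UPDATE binds the counter deltas first and the key columns last.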
[jira] [Resolved] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-11209. - Resolution: Duplicate yep, this is a duplicate of CASSANDRA-11215 - we can leak references if we throw exceptions in doValidationCompaction > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
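The failure mode named in the resolution (a reference that is never released when an exception is thrown, so the count never reaches 0 and the deletion task is never queued) can be shown with a toy reference counter. This is a simplified sketch under stated assumptions: RefCountedFile is a hypothetical class, not Cassandra's SSTableReader or its Ref machinery, and the "deleted" flag stands in for scheduling SSTableDeletingTask.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a toy stand-in for a reference-counted sstable.
final class RefCountedFile
{
    // Starts at 1: the reference held while the file is "live".
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean deleted = false;

    void acquire()
    {
        refs.incrementAndGet();
    }

    void release()
    {
        // Cleanup only happens when the last reference is dropped.
        if (refs.decrementAndGet() == 0)
            deleted = true;
    }

    boolean isDeleted()
    {
        return deleted;
    }

    // Leaky pattern: if work throws, release() is skipped.
    static void leakyUse(RefCountedFile file, Runnable work)
    {
        file.acquire();
        work.run();
        file.release();
    }

    // Safe pattern: the reference is released on every exit path.
    static void safeUse(RefCountedFile file, Runnable work)
    {
        file.acquire();
        try
        {
            work.run();
        }
        finally
        {
            file.release();
        }
    }
}
```

With leakyUse, a single exception leaves the count one too high for the lifetime of the process, which matches the report that a restart brings the disk usage numbers back in sync.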
[jira] [Comment Edited] (CASSANDRA-11124) Change default cqlsh encoding to utf-8
[ https://issues.apache.org/jira/browse/CASSANDRA-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156310#comment-15156310 ] Stefania edited comment on CASSANDRA-11124 at 2/24/16 3:06 AM: --- Looking good, +1. dtests on 2.2. are not working at the moment and there are 2 cqlsh failures on trunk but neither are related to this ticket. was (Author: stefania): Looking good, +1. dtests on 2.2. are not working at the moment and there are 2 cqlsh failures on trunk but neither are not related to this ticket. > Change default cqlsh encoding to utf-8 > -- > > Key: CASSANDRA-11124 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11124 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Trivial > Labels: cqlsh > > Strange things can happen when utf-8 is not the default cqlsh encoding (see > CASSANDRA-11030). This ticket proposes changing the default cqlsh encoding to > utf-8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11212) cqlsh python version checking is out of date
[ https://issues.apache.org/jira/browse/CASSANDRA-11212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160049#comment-15160049 ] Stefania commented on CASSANDRA-11212: -- Latest round of dtests was good, this can be committed. > cqlsh python version checking is out of date > > > Key: CASSANDRA-11212 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11212 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > Fix For: 2.2.x, 3.0.x, 3.x > > > cqlsh.py has python version checking code at the top, but it still says > python 2.5 is a valid version, which we then error out on a few lines down in > the file. We should fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
Rafael Sanches created CASSANDRA-11222:
--

Summary: datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
Key: CASSANDRA-11222
URL: https://issues.apache.org/jira/browse/CASSANDRA-11222
Project: Cassandra
Issue Type: Wish
Components: CQL
Environment: Java JDO
Reporter: Rafael Sanches
Priority: Blocker
Fix For: 3.0.x

Hi,

I'm starting a new project and was hoping to upgrade directly to cassandra 3.0, so it would save us a migration from 2.2 later on.

Unfortunately, the datanucleus-cassandra-5.0.0-m1 (latest) don't support the 3.0 data model.

Errors like these will appear because of JDO:
https://issues.apache.org/jira/browse/CASSANDRA-10996

To be more specific, this class does things like:

StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM system.schema_keyspaces WHERE keyspace_name=?;");

https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java

It doesn't seem like the Datanucleus guys are looking to fix this, since the last update on datanucleus-cassandra was on 2014. I will open an issue there too. Hope can reach contributors from both places.

I guess opening an issue here is more a "heads up", because more developers will waste time on this soon.

thanks
rafa

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11164) Order and filter cipher suites correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160023#comment-15160023 ]

Stefania commented on CASSANDRA-11164:
--

I understood that the ordering would be dealt with by CASSANDRA-10508 but if we want to fix it here then that's probably better since this can then go into 2.2 and CASSANDRA-10508 be limited to 3.x only.

{{testServerSocketCiphers}} is failing locally on my machine because the 256 cipher_suites are not returned by {{socket.getEnabledCipherSuites()}}, so I think we should remove it from this patch? Incidentally it also doesn't need {{UnknownHostException}} in the throws declaration since {{IOException}} is more generic.

{{TestTupleType}} has failed on jenkins but it is passing locally and the failure doesn't seem related.

Let's rebase, squash the two commits and repeat the cassci tests on *2.2, 3.0* and *trunk*. If that's clear then we are good to go. If {{TestTupleType}} is still failing then my best guess is that for some reason we've uncovered an existing problem in {{CQLTester}} and we'll deal with it.

We'll also need to add a line to CHANGES.txt.

> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tom Petracca
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch,
> 11164-2.2_2_call_filterCipherSuites_everywhere.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508,
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired
> ciphers in cassandra.yaml.
> Also the fix that occurred for
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs
> to be applied to all locations where we create an SSLSocket so that JCE is
> not required out of the box or with additional configuration.
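The ordering problem described in the ticket can be illustrated with a small filter that keeps only supported suites while preserving the configured preference order. This is a sketch of the general technique, not the attached patch; CipherOrder and filterPreservingOrder are hypothetical names, and the inputs are assumed to be plain arrays such as the cipher_suites list from cassandra.yaml and the suites reported by the socket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hypothetical helper, not SSLFactory.filterCipherSuites().
final class CipherOrder
{
    // Returns the desired suites that are also supported, in the order
    // they were listed in 'desired' (the operator's preference order).
    static String[] filterPreservingOrder(String[] desired, String[] supported)
    {
        List<String> result = new ArrayList<>(Arrays.asList(desired));
        // retainAll on a List keeps the surviving elements in their original order.
        result.retainAll(Arrays.asList(supported));
        return result.toArray(new String[0]);
    }
}
```

Filtering through an unordered set (for example a HashSet intersection) would produce the same members but could reorder them, which is the behaviour the ticket objects to.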
[jira] [Commented] (CASSANDRA-11212) cqlsh python version checking is out of date
[ https://issues.apache.org/jira/browse/CASSANDRA-11212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159972#comment-15159972 ] Stefania commented on CASSANDRA-11212: -- It seems all dtest jobs failed for infrastructure reasons, trying again with the 2.2 patch and a different cassci job: http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11212-2.2-dtest/ > cqlsh python version checking is out of date > > > Key: CASSANDRA-11212 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11212 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Jeremiah Jordan >Assignee: Jeremiah Jordan > Fix For: 2.2.x, 3.0.x, 3.x > > > cqlsh.py has python version checking code at the top, but it still says > python 2.5 is a valid version, which we then error out on a few lines down in > the file. We should fix that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps
[ https://issues.apache.org/jira/browse/CASSANDRA-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159970#comment-15159970 ] Philip Thompson commented on CASSANDRA-11221: - I know what's wrong. I'll take care of it. I didn't realize these were failing again. > replication_test.ReplicationTest.network_topology_test flaps > > > Key: CASSANDRA-11221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11221 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > Test intermittently failing with set comparison errors that differ from one > failure to the next. Looks a bit more stable recently since #203 failed, but > probably worth keeping an eye on, and check if there's a problem with the > test code. > most recent failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps
[ https://issues.apache.org/jira/browse/CASSANDRA-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159965#comment-15159965 ] Russ Hatch commented on CASSANDRA-11221: errors are typically some variant of set comparison fails, like most recently "Items in the second set but not the first: u'127.0.0.4'" > replication_test.ReplicationTest.network_topology_test flaps > > > Key: CASSANDRA-11221 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11221 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng > Labels: dtest > > Test intermittently failing with set comparison errors that differ from one > failure to the next. Looks a bit more stable recently since #203 failed, but > probably worth keeping an eye on, and check if there's a problem with the > test code. > most recent failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps
Russ Hatch created CASSANDRA-11221: -- Summary: replication_test.ReplicationTest.network_topology_test flaps Key: CASSANDRA-11221 URL: https://issues.apache.org/jira/browse/CASSANDRA-11221 Project: Cassandra Issue Type: Test Reporter: Russ Hatch Assignee: DS Test Eng Test intermittently failing with set comparison errors that differ from one failure to the next. Looks a bit more stable recently since #203 failed, but probably worth keeping an eye on, and check if there's a problem with the test code. most recent failure: http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159795#comment-15159795 ] Jason Brown commented on CASSANDRA-10371: - Committed to 2.1, 2.2, 3.0, and trunk, as sha 7877d6f85f1a84d9f9de4d81339730d9df3667a1 > Decommissioned nodes can remain in gossip > - > > Key: CASSANDRA-10371 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10371 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Brandon Williams >Assignee: Joel Knighton >Priority: Minor > Fix For: 2.1.14, 2.2.6, 3.0.4, 3.4 > > > This may apply to other dead states as well. Dead states should be expired > after 3 days. In the case of decom we attach a timestamp to let the other > nodes know when it should be expired. It has been observed that sometimes a > subset of nodes in the cluster never expire the state, and through heap > analysis of these nodes it is revealed that the epstate.isAlive check returns > true when it should return false, which would allow the state to be evicted. > This may have been affected by CASSANDRA-8336. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
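The expiry rule described in the ticket (a dead state is evicted once its expire time has passed, but only if the endpoint is no longer considered alive) can be sketched roughly as below. This is a simplified model under stated assumptions: DeadStateExpiry and shouldEvict are hypothetical names, and the real Gossiper check involves additional conditions that are omitted here.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a simplified model, not the Gossiper implementation.
final class DeadStateExpiry
{
    // The ticket states that dead states should be expired after 3 days.
    static final long EXPIRE_AFTER_MS = TimeUnit.DAYS.toMillis(3);

    // Expire time for a node that entered a dead state at the given moment.
    static long expireTimeFor(long deadSinceMs)
    {
        return deadSinceMs + EXPIRE_AFTER_MS;
    }

    // An endpoint is evicted only when it is not alive AND its expire time
    // has passed. If isAlive wrongly reports true, eviction never happens.
    static boolean shouldEvict(boolean isAlive, long nowMs, long expireAtMs)
    {
        return !isAlive && nowMs > expireAtMs;
    }
}
```

In this model, a stale isAlive value of true blocks eviction indefinitely regardless of the timestamp, which is the symptom reported from the heap analysis.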
[09/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c4bd6d25
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c4bd6d25
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c4bd6d25

Branch: refs/heads/trunk
Commit: c4bd6d2549bd794b81cf0c9a9ac73b97c5a0b686
Parents: e9abaab 77ff794
Author: Jason Brown
Authored: Tue Feb 23 14:36:15 2016 -0800
Committer: Jason Brown
Committed: Tue Feb 23 14:37:56 2016 -0800

--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java      | 85
 3 files changed, 87 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/CHANGES.txt
--
diff --cc CHANGES.txt
index cd2a930,e989e7f..9ca2f80
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -34,7 -17,9 +34,8 @@@ Merged from 2.2
   * Fix paging on DISTINCT queries repeats result when first row in partition changes
     (CASSANDRA-10010)
  Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
 - * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)
   * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714)
   * Fix incorrect warning in 'nodetool status' (CASSANDRA-10176)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/src/java/org/apache/cassandra/gms/Gossiper.java
--
[06/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947

Branch: refs/heads/cassandra-2.2
Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7
Parents: 5009594 7877d6f
Author: Jason Brown
Authored: Tue Feb 23 14:31:38 2016 -0800
Committer: Jason Brown
Committed: Tue Feb 23 14:35:03 2016 -0800

--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java      | 85
 3 files changed, 87 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt
--
diff --cc CHANGES.txt
index 01e7b3d,82ee99e..e989e7f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -2.1.14
 +2.2.6
 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
 + * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840)
 + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
 + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)
 + * Protect from keyspace dropped during repair (CASSANDRA-11065)
 + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146)
 + * Better error message for cleanup (CASSANDRA-10991)
 + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123)
 + * Use cloned TokenMetadata in size estimates to avoid race against membership check
 +   (CASSANDRA-10736)
 + * Always persist upsampled index summaries (CASSANDRA-10512)
 + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733)
 + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048)
 + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order (CASSANDRA-7281)
 + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030)
 + * Fix paging on DISTINCT queries repeats result when first row in partition changes
 +   (CASSANDRA-10010)
 +Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
   * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java
--
[03/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint
Don't remove FailureDetector history on removeEndpoint

patch by jkni, reviewed by jasobrown for CASSANDRA-10371

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8

Branch: refs/heads/cassandra-3.0
Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1
Parents: 67637d1
Author: Joel Knighton
Authored: Fri Feb 19 15:19:33 2016 -0600
Committer: Jason Brown
Committed: Tue Feb 23 14:30:28 2016 -0800

--
 CHANGES.txt                                     |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java      | 85
 3 files changed, 87 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 52bdcce..82ee99e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.14
+ * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
  * Only notify if repair status changed (CASSANDRA-11172)
  * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888)
  * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java
--
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java b/src/java/org/apache/cassandra/gms/Gossiper.java
index ae99829..889806c 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -386,6 +386,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean
         unreachableEndpoints.remove(endpoint);
         endpointStateMap.remove(endpoint);
         expireTimeEndpointMap.remove(endpoint);
+        FailureDetector.instance.remove(endpoint);
         quarantineEndpoint(endpoint);
         if (logger.isDebugEnabled())
             logger.debug("evicting {} from gossip", endpoint);
@@ -409,8 +410,6 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean
 
         liveEndpoints.remove(endpoint);
         unreachableEndpoints.remove(endpoint);
-        // do not remove endpointState until the quarantine expires
-        FailureDetector.instance.remove(endpoint);
         MessagingService.instance().resetVersion(endpoint);
         quarantineEndpoint(endpoint);
         MessagingService.instance().destroyConnectionPool(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
new file mode 100644
index 000..9325922
--- /dev/null
+++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.Util;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.RandomPartitioner;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.locator.TokenMetadata;
+import org.apache.cassandra.service.StorageService;
+
+import static org.junit.Assert.assertFalse;
+
+public class FailureDetectorTest
+{
+    @BeforeClass
+    public static
[07/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947 Branch: refs/heads/trunk Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7 Parents: 5009594 7877d6f Author: Jason Brown Authored: Tue Feb 23 14:31:38 2016 -0800 Committer: Jason Brown Committed: Tue Feb 23 14:35:03 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt -- diff --cc CHANGES.txt index 01e7b3d,82ee99e..e989e7f --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,22 -1,5 +1,23 @@@ -2.1.14 +2.2.6 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) + * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) + * Protect from keyspace dropped during repair (CASSANDRA-11065) + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146) + * Better error message for cleanup (CASSANDRA-10991) + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123) + * Use cloned TokenMetadata in size estimates to avoid race against membership check + (CASSANDRA-10736) + * Always persist upsampled index summaries (CASSANDRA-10512) + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733) + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048) + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order (CASSANDRA-7281) + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030) + * Fix paging on DISTINCT 
queries repeats result when first row in partition changes + (CASSANDRA-10010) +Merged from 2.1: + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java --
[04/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint
Don't remove FailureDetector history on removeEndpoint patch by jkni, reviewed by jasobrown for CASSANDRA-10371 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8 Branch: refs/heads/trunk Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1 Parents: 67637d1 Author: Joel Knighton Authored: Fri Feb 19 15:19:33 2016 -0600 Committer: Jason Brown Committed: Tue Feb 23 14:30:28 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 52bdcce..82ee99e 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java -- diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java b/src/java/org/apache/cassandra/gms/Gossiper.java index ae99829..889806c 100644 --- a/src/java/org/apache/cassandra/gms/Gossiper.java +++ b/src/java/org/apache/cassandra/gms/Gossiper.java @@ -386,6 +386,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean unreachableEndpoints.remove(endpoint); endpointStateMap.remove(endpoint); expireTimeEndpointMap.remove(endpoint); +FailureDetector.instance.remove(endpoint); quarantineEndpoint(endpoint); if (logger.isDebugEnabled()) logger.debug("evicting {} from gossip", 
endpoint); @@ -409,8 +410,6 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean liveEndpoints.remove(endpoint); unreachableEndpoints.remove(endpoint); -// do not remove endpointState until the quarantine expires -FailureDetector.instance.remove(endpoint); MessagingService.instance().resetVersion(endpoint); quarantineEndpoint(endpoint); MessagingService.instance().destroyConnectionPool(endpoint); http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java -- diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java new file mode 100644 index 000..9325922 --- /dev/null +++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.cassandra.gms; + +import java.net.InetAddress; +import java.net.UnknownHostException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.UUID; + +import org.junit.BeforeClass; +import org.junit.Test; + +import org.apache.cassandra.Util; +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.dht.IPartitioner; +import org.apache.cassandra.dht.RandomPartitioner; +import org.apache.cassandra.dht.Token; +import org.apache.cassandra.locator.TokenMetadata; +import org.apache.cassandra.service.StorageService; + +import static org.junit.Assert.assertFalse; + +public class FailureDetectorTest +{ +@BeforeClass +public static void
[10/10] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ac8c8b21 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ac8c8b21 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ac8c8b21 Branch: refs/heads/trunk Commit: ac8c8b213c6a02de1b547cd537bf9058e851bfdc Parents: babf30d c4bd6d2 Author: Jason Brown Authored: Tue Feb 23 14:38:18 2016 -0800 Committer: Jason Brown Committed: Tue Feb 23 14:39:53 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ac8c8b21/CHANGES.txt -- diff --cc CHANGES.txt index 6ad7e1f,9ca2f80..361eedc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -61,8 -33,8 +61,9 @@@ Merged from 2.2 * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030) * Fix paging on DISTINCT queries repeats result when first row in partition changes (CASSANDRA-10010) + * (cqlsh) Support timezone conversion using pytz (CASSANDRA-10397) Merged from 2.1: + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714)
[08/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c4bd6d25 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c4bd6d25 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c4bd6d25 Branch: refs/heads/cassandra-3.0 Commit: c4bd6d2549bd794b81cf0c9a9ac73b97c5a0b686 Parents: e9abaab 77ff794 Author: Jason Brown Authored: Tue Feb 23 14:36:15 2016 -0800 Committer: Jason Brown Committed: Tue Feb 23 14:37:56 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/CHANGES.txt -- diff --cc CHANGES.txt index cd2a930,e989e7f..9ca2f80 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -34,7 -17,9 +34,8 @@@ Merged from 2.2 * Fix paging on DISTINCT queries repeats result when first row in partition changes (CASSANDRA-10010) Merged from 2.1: + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) - * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714) * Fix incorrect warning in 'nodetool status' (CASSANDRA-10176) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/src/java/org/apache/cassandra/gms/Gossiper.java --
[01/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 67637d1bb -> 7877d6f85 refs/heads/cassandra-2.2 50095947e -> 77ff79473 refs/heads/cassandra-3.0 e9abaabfe -> c4bd6d254 refs/heads/trunk babf30dd1 -> ac8c8b213 Don't remove FailureDetector history on removeEndpoint patch by jkni, reviewed by jasobrown for CASSANDRA-10371 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8 Branch: refs/heads/cassandra-2.1 Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1 Parents: 67637d1 Author: Joel Knighton Authored: Fri Feb 19 15:19:33 2016 -0600 Committer: Jason Brown Committed: Tue Feb 23 14:30:28 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 52bdcce..82ee99e 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java -- diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java b/src/java/org/apache/cassandra/gms/Gossiper.java index ae99829..889806c 100644 --- a/src/java/org/apache/cassandra/gms/Gossiper.java +++ b/src/java/org/apache/cassandra/gms/Gossiper.java @@ -386,6 +386,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean 
unreachableEndpoints.remove(endpoint); endpointStateMap.remove(endpoint); expireTimeEndpointMap.remove(endpoint); +FailureDetector.instance.remove(endpoint); quarantineEndpoint(endpoint); if (logger.isDebugEnabled()) logger.debug("evicting {} from gossip", endpoint); @@ -409,8 +410,6 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean liveEndpoints.remove(endpoint); unreachableEndpoints.remove(endpoint); -// do not remove endpointState until the quarantine expires -FailureDetector.instance.remove(endpoint); MessagingService.instance().resetVersion(endpoint); quarantineEndpoint(endpoint); MessagingService.instance().destroyConnectionPool(endpoint); http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java -- diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java new file mode 100644 index 000..9325922 --- /dev/null +++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.cassandra.gms; + +import java.net.InetAddress; +import java.net.UnknownHostException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.UUID; + +import org.junit.BeforeClass; +import org.junit.Test; + +import org.apache.cassandra.Util; +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.dht.IPartitioner; +import org.apache.cassandra.dht.RandomPartitioner; +import
[02/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint
Don't remove FailureDetector history on removeEndpoint patch by jkni, reviewed by jasobrown for CASSANDRA-10371 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8 Branch: refs/heads/cassandra-2.2 Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1 Parents: 67637d1 Author: Joel Knighton Authored: Fri Feb 19 15:19:33 2016 -0600 Committer: Jason Brown Committed: Tue Feb 23 14:30:28 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 52bdcce..82ee99e 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java -- diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java b/src/java/org/apache/cassandra/gms/Gossiper.java index ae99829..889806c 100644 --- a/src/java/org/apache/cassandra/gms/Gossiper.java +++ b/src/java/org/apache/cassandra/gms/Gossiper.java @@ -386,6 +386,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean unreachableEndpoints.remove(endpoint); endpointStateMap.remove(endpoint); expireTimeEndpointMap.remove(endpoint); +FailureDetector.instance.remove(endpoint); quarantineEndpoint(endpoint); if (logger.isDebugEnabled()) logger.debug("evicting {} from 
gossip", endpoint); @@ -409,8 +410,6 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean liveEndpoints.remove(endpoint); unreachableEndpoints.remove(endpoint); -// do not remove endpointState until the quarantine expires -FailureDetector.instance.remove(endpoint); MessagingService.instance().resetVersion(endpoint); quarantineEndpoint(endpoint); MessagingService.instance().destroyConnectionPool(endpoint); http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java -- diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java new file mode 100644 index 000..9325922 --- /dev/null +++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.cassandra.gms; + +import java.net.InetAddress; +import java.net.UnknownHostException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.UUID; + +import org.junit.BeforeClass; +import org.junit.Test; + +import org.apache.cassandra.Util; +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.dht.IPartitioner; +import org.apache.cassandra.dht.RandomPartitioner; +import org.apache.cassandra.dht.Token; +import org.apache.cassandra.locator.TokenMetadata; +import org.apache.cassandra.service.StorageService; + +import static org.junit.Assert.assertFalse; + +public class FailureDetectorTest +{ +@BeforeClass +public static
[05/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947 Branch: refs/heads/cassandra-3.0 Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7 Parents: 5009594 7877d6f Author: Jason Brown Authored: Tue Feb 23 14:31:38 2016 -0800 Committer: Jason Brown Committed: Tue Feb 23 14:35:03 2016 -0800 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/gms/Gossiper.java | 3 +- .../cassandra/gms/FailureDetectorTest.java | 85 3 files changed, 87 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt -- diff --cc CHANGES.txt index 01e7b3d,82ee99e..e989e7f --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,22 -1,5 +1,23 @@@ -2.1.14 +2.2.6 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) + * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) + * Protect from keyspace dropped during repair (CASSANDRA-11065) + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146) + * Better error message for cleanup (CASSANDRA-10991) + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123) + * Use cloned TokenMetadata in size estimates to avoid race against membership check + (CASSANDRA-10736) + * Always persist upsampled index summaries (CASSANDRA-10512) + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733) + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048) + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order (CASSANDRA-7281) + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030) + * Fix paging on 
DISTINCT queries repeats result when first row in partition changes + (CASSANDRA-10010) +Merged from 2.1: + * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371) * Only notify if repair status changed (CASSANDRA-11172) * Add partition key to TombstoneOverwhelmingException error message (CASSANDRA-10888) * Use logback setting for 'cassandra -v' command (CASSANDRA-10767) http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java --
[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159767#comment-15159767 ] Jason Brown commented on CASSANDRA-10371: - +1. Nice detective, Joel. Will commit this afternoon. > Decommissioned nodes can remain in gossip > - > > Key: CASSANDRA-10371 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10371 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Brandon Williams >Assignee: Joel Knighton >Priority: Minor > > This may apply to other dead states as well. Dead states should be expired > after 3 days. In the case of decom we attach a timestamp to let the other > nodes know when it should be expired. It has been observed that sometimes a > subset of nodes in the cluster never expire the state, and through heap > analysis of these nodes it is revealed that the epstate.isAlive check returns > true when it should return false, which would allow the state to be evicted. > This may have been affected by CASSANDRA-8336. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
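The expiry rule in the CASSANDRA-10371 description above (a dead state becomes evictable once its expiry timestamp has passed, but only if the endpoint is no longer considered alive) can be sketched roughly as below. This is a simplified illustration, not Cassandra's Gossiper code: the class `DeadStateExpiry` and the method `shouldEvict` are hypothetical names, and only the 3-day figure comes from the ticket.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified model of the expiry decision; not Cassandra's Gossiper code.
final class DeadStateExpiry
{
    // Lifetime of a dead state as stated in the ticket description ("3 days").
    static final long DEFAULT_EXPIRE_MILLIS = TimeUnit.DAYS.toMillis(3);

    // A state may be evicted only when the endpoint is not alive AND its expiry time
    // has passed. If isAlive wrongly stays true, the state is never evicted, which is
    // the symptom reported in the ticket.
    static boolean shouldEvict(boolean isAlive, long nowMillis, long expireTimeMillis)
    {
        return !isAlive && nowMillis > expireTimeMillis;
    }

    public static void main(String[] args)
    {
        long decommissionedAt = 0L;
        long expireTime = decommissionedAt + DEFAULT_EXPIRE_MILLIS;
        long fourDaysLater = TimeUnit.DAYS.toMillis(4);

        System.out.println(shouldEvict(false, fourDaysLater, expireTime));
        System.out.println(shouldEvict(true, fourDaysLater, expireTime));
    }
}
```

The second call in `main` models the reported symptom: with `isAlive` stuck at true, the check never passes, however old the state is.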
[jira] [Commented] (CASSANDRA-10637) Extract LoaderOptions and refactor BulkLoader to be able to be used from within existing Java code instead of just through main()
[ https://issues.apache.org/jira/browse/CASSANDRA-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159683#comment-15159683 ] Yuki Morishita commented on CASSANDRA-10637: Since your patch, there have been a couple of changes in LoaderOptions, so I added them and rebased on top of the latest head. In particular, AuthProvider and internodeStreamingThrottle were added, so I added both to your Builder. ||branch||testall||dtest|| |[10637|https://github.com/yukim/cassandra/tree/10637]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10637-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10637-dtest/lastCompletedBuild/testReport/]| > Extract LoaderOptions and refactor BulkLoader to be able to be used from > within existing Java code instead of just through main() > - > > Key: CASSANDRA-10637 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10637 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Eric Fenderbosch >Priority: Minor > Fix For: 3.x > > > We are writing a service to migrate data from various RDBMS tables into > Cassandra. We write out a CSV from the source system, use CQLSSTableWriter to > write sstables to disk, then call sstableloader to stream to the Cassandra > cluster. > Right now, we either have to: > * return a CSV location from one Java process to a wrapper script which then > kicks off sstableloader > * or call sstableloader via Runtime.getRuntime().exec > * or call BulkLoader.main from within our Java code, using a custom > SecurityManager to trap the System.exit calls > * or subclass BulkLoader putting the subclass in the > org.apache.cassandra.tools package in order to access the package scoped > inner classes > None of these solutions are ideal. Ideally, we should be able to use the > functionality of BulkLoader.main directly. 
I've extracted LoaderOptions to a > top level class that uses the builder pattern so that it can be used as part > of a Java migration service directly. > Creating the builder can now be performed with a fluent builder interface: > LoaderOptions options = LoaderOptions.builder(). // > connectionsPerHost(2). // > directory(directory). // > hosts(hosts). // > build(); > Or used to parse command line arguments: > LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); > A new load method takes a LoaderOptions parameter and throws > BulkLoadException instead of System.exit(1). > Fork on github can be found here: > https://github.com/efenderbosch/cassandra -- This message was sent by Atlassian JIRA (v6.3.4#6332)
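The two ideas in the CASSANDRA-10637 description above, a fluent builder for the options and a load method that throws instead of calling System.exit(1), can be sketched in isolation as below. This is a minimal sketch and not the patch itself: `SketchLoaderOptions`, `SketchBulkLoadException` and `SketchLoader` are hypothetical stand-ins, only three options are modelled, and no streaming is performed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the ticket's BulkLoadException.
class SketchBulkLoadException extends Exception
{
    SketchBulkLoadException(String message)
    {
        super(message);
    }
}

// Hypothetical stand-in for the extracted LoaderOptions; only three options are modelled.
final class SketchLoaderOptions
{
    final int connectionsPerHost;
    final File directory;
    final List<String> hosts;

    private SketchLoaderOptions(Builder builder)
    {
        connectionsPerHost = builder.connectionsPerHost;
        directory = builder.directory;
        hosts = Collections.unmodifiableList(new ArrayList<>(builder.hosts));
    }

    static Builder builder()
    {
        return new Builder();
    }

    static final class Builder
    {
        private int connectionsPerHost = 1;
        private File directory;
        private final List<String> hosts = new ArrayList<>();

        // Each setter returns the builder so that calls can be chained.
        Builder connectionsPerHost(int value) { connectionsPerHost = value; return this; }
        Builder directory(File value) { directory = value; return this; }
        Builder hosts(List<String> value) { hosts.addAll(value); return this; }

        SketchLoaderOptions build()
        {
            return new SketchLoaderOptions(this);
        }
    }
}

final class SketchLoader
{
    // Reports bad input by throwing, so an embedding service can catch the failure
    // instead of having its JVM terminated by System.exit.
    static void load(SketchLoaderOptions options) throws SketchBulkLoadException
    {
        if (options.directory == null)
            throw new SketchBulkLoadException("no sstable directory given");
        if (options.hosts.isEmpty())
            throw new SketchBulkLoadException("no initial hosts given");
        // Streaming to the cluster would happen here; it is omitted in this sketch.
    }
}
```

Throwing a checked exception leaves the decision about exiting to the caller: a command-line main() can still translate it into an exit code, while a long-running migration service can log it and continue.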
[jira] [Assigned] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson reassigned CASSANDRA-11220: --- Assignee: Philip Thompson (was: DS Test Eng) > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing on 2.1 > -- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159633#comment-15159633 ] Russ Hatch commented on CASSANDRA-11220: The problem appears to be an assertion that no longer holds: {noformat} 1 not greater than or equal to 2 {noformat} > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing on 2.1 > -- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1
Russ Hatch created CASSANDRA-11220: -- Summary: repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1 Key: CASSANDRA-11220 URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 Project: Cassandra Issue Type: Test Reporter: Russ Hatch Assignee: DS Test Eng recent occurrence: http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ last 2 runs failed: http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11219) Some Paxos issues
Andy Chen created CASSANDRA-11219: - Summary: Some Paxos issues Key: CASSANDRA-11219 URL: https://issues.apache.org/jira/browse/CASSANDRA-11219 Project: Cassandra Issue Type: Bug Components: Core Reporter: Andy Chen Two issues: 1. 'Raw Paxos' without a single leader may fail to make progress. Waiting and retrying may solve part of the problem, but it cannot be proven that progress will be made. 2. Learning issue: mostRecentCommit may not be sufficient for a learner to catch up. For example, with nodes A, B and C, suppose C is down while A and B complete two commits, c1 and c2. When C comes back up, only c2 can be learned by C. Since updates are based on row changes (according to my understanding), data may become inconsistent as a result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159424#comment-15159424 ] Yuki Morishita commented on CASSANDRA-10990: Did you try running sstableupgrade from 3.3 against 2.1 SSTables? > Support streaming of older version sstables in 3.0 > -- > > Key: CASSANDRA-10990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10990 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeremy Hanna >Assignee: Paulo Motta > > In 2.0 we introduced support for streaming older versioned sstables > (CASSANDRA-5772). In 3.0, because of the rewrite of the storage layer, this > became no longer supported. So currently, while 3.0 can read sstables in the > 2.1/2.2 format, it cannot stream the older versioned sstables. We should do > some work to make this still possible to be consistent with what > CASSANDRA-5772 provided. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159412#comment-15159412 ] xiaodong wang commented on CASSANDRA-10990: --- Hi, I learnt that this issue is being actively worked on. Could you please share any updates or an ETA for the fix release? I am currently running into the same issue and am blocked when trying to migrate data from C* 2.1.x to C* 3.3. I tried a few approaches: * Running "sstableupgrade" and "sstableloader" on the 2.1.x snapshots won't work because of this issue; * Due to CASSANDRA-8110, running "rebuild" on a 3.3 DC within the same 2.1.x cluster won't work. > Support streaming of older version sstables in 3.0 > -- > > Key: CASSANDRA-10990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10990 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeremy Hanna >Assignee: Paulo Motta > > In 2.0 we introduced support for streaming older versioned sstables > (CASSANDRA-5772). In 3.0, because of the rewrite of the storage layer, this > became no longer supported. So currently, while 3.0 can read sstables in the > 2.1/2.2 format, it cannot stream the older versioned sstables. We should do > some work to make this still possible to be consistent with what > CASSANDRA-5772 provided. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159403#comment-15159403 ] sankalp kohli commented on CASSANDRA-11218: --- cc [~krummas] I think we should also prioritize user defined compactions over others. What do you think? > Prioritize Secondary Index rebuild > -- > > Key: CASSANDRA-11218 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11218 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Priority: Minor > > We have seen secondary index rebuilds get stuck behind other compactions > during bootstrap and other operations, which prevents those operations from > finishing. We should prioritize index rebuilds via a separate thread pool or > a priority queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
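The priority-queue option mentioned in the description could look roughly like the sketch below. It is only an illustration: the class `PrioritizedTaskQueue`, the `Kind` enum, and the chosen ordering (index builds first, then user defined compactions, then background compactions) are assumptions made for this example and are not taken from Cassandra's CompactionManager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names exist in Cassandra.
final class PrioritizedTaskQueue {
    // Declaration order doubles as priority order (earlier = more urgent).
    enum Kind { INDEX_BUILD, USER_DEFINED, BACKGROUND }

    static final class Task implements Comparable<Task> {
        private static final AtomicLong SEQ = new AtomicLong();
        final Kind kind;
        final String name;
        // Sequence number keeps FIFO order among tasks of the same kind,
        // which PriorityBlockingQueue does not guarantee on its own.
        final long seq = SEQ.getAndIncrement();

        Task(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        @Override
        public int compareTo(Task other) {
            int byKind = kind.compareTo(other.kind);
            return byKind != 0 ? byKind : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Task> queue = new PriorityBlockingQueue<>();

    void submit(Kind kind, String name) {
        queue.add(new Task(kind, name));
    }

    // Returns task names in the order a single worker would pick them up.
    List<String> drain() {
        List<String> order = new ArrayList<>();
        Task next;
        while ((next = queue.poll()) != null) {
            order.add(next.name);
        }
        return order;
    }
}
```

A queue like this only reorders pending work; a task that is already running still occupies its thread, which is one argument for the separate thread pool that the description also suggests.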
[jira] [Created] (CASSANDRA-11218) Prioritize Secondary Index rebuild
sankalp kohli created CASSANDRA-11218: - Summary: Prioritize Secondary Index rebuild Key: CASSANDRA-11218 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor We have seen secondary index rebuilds get stuck behind other compactions during bootstrap and other operations, which prevents those operations from finishing. We should prioritize index rebuilds via a separate thread pool or a priority queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling
[ https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159286#comment-15159286 ] Marcus Olsson commented on CASSANDRA-10070: --- bq. Sounds good! We could ask the user to pause, but I think doing that automatically via "system interrupts" is better. It just occurred to me that either "the pause" or "system interrupts" will prevent new repairs from starting, but what about already running repairs? We will probably want to interrupt already running repairs as well in some situations. For this reason CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of this ticket). +1 bq. Then I think we should either have timeout, or add an ability to cancel/interrupt a running scheduled repair in the initial version, to avoid hanging repairs to render the automatic repair scheduling useless. I think the timeout would be good enough in the initial version. I guess the interruption of repairs would be handled by CASSANDRA-3486? Perhaps it would be possible to extend that feature later to be able to cancel a scheduled repair? Here I'm thinking that an interruption stops the running repair and allows the scheduled job to retry it immediately, while a cancellation would prevent the scheduled job from retrying it immediately. bq. WDYT? Feel free to update or break-up into smaller or larger subtasks, and then create the actual subtasks to start work on them. Sounds good, I'll have a closer look at the subtasks tomorrow! I guess we will have a sort of dependency tree for some of the tasks. 
> Automatic repair scheduling > --- > > Key: CASSANDRA-10070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10070 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > Fix For: 3.x > > Attachments: Distributed Repair Scheduling.doc > > > Scheduling and running repairs in a Cassandra cluster is most often a > required task, but this can both be hard for new users and it also requires a > bit of manual configuration. There are good tools out there that can be used > to simplify things, but wouldn't this be a good feature to have inside of > Cassandra? To automatically schedule and run repairs, so that when you start > up your cluster it basically maintains itself in terms of normal > anti-entropy, with the possibility for manual configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10099) Improve concurrency in CompactionStrategyManager
[ https://issues.apache.org/jira/browse/CASSANDRA-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159240#comment-15159240 ] Yuki Morishita commented on CASSANDRA-10099: bq. I'll push an updated branch with both approaches unless you disagree? Sure, go ahead. > Improve concurrency in CompactionStrategyManager > > > Key: CASSANDRA-10099 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10099 > Project: Cassandra > Issue Type: Improvement >Reporter: Yuki Morishita >Assignee: Marcus Eriksson > Fix For: 2.1.x, 2.2.x, 3.x > > > Continue discussion from CASSANDRA-9882. > CompactionStrategyManager(WrappingCompactionStrategy for <3.0) tracks SSTable > changes mainly for separating repaired / unrepaired SSTables (+ LCS manages > level). > This is blocking operation, and can lead to block of flush etc. when > determining next background task takes longer. > Explore the way to mitigate this concurrency issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/babf30dd Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/babf30dd Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/babf30dd Branch: refs/heads/trunk Commit: babf30dd13dfdd49398d4067edff874f64051ad2 Parents: fc9c6fa e9abaab Author: Tyler Hobbs Authored: Tue Feb 23 11:29:13 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:29:13 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/babf30dd/CHANGES.txt --
[2/3] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9abaabf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9abaabf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9abaabf Branch: refs/heads/trunk Commit: e9abaabfe83f74b1ef7c0273bdd7738402fb0ebc Parents: 037d24e 5009594 Author: Tyler HobbsAuthored: Tue Feb 23 11:29:04 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:29:04 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/CHANGES.txt -- diff --cc CHANGES.txt index a675016,01e7b3d..cd2a930 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,21 -1,5 +1,22 @@@ -2.2.6 +3.0.4 + * Introduce backpressure for hints (CASSANDRA-10972) + * Fix ClusteringPrefix not being able to read tombstone range boundaries (CASSANDRA-11158) + * Prevent logging in sandboxed state (CASSANDRA-11033) + * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721) + * Add query time validation method on Index (CASSANDRA-11043) + * Avoid potential AssertionError in mixed version cluster (CASSANDRA-11128) + * Properly handle hinted handoff after topology changes (CASSANDRA-5902) + * AssertionError when listing sstable files on inconsistent disk state (CASSANDRA-11156) + * Fix wrong rack counting and invalid conditions check for TokenAllocation + (CASSANDRA-11139) + * Avoid creating empty hint files (CASSANDRA-11090) + * Fix leak detection strong reference loop using weak reference (CASSANDRA-11120) + * Configurie BatchlogManager to stop delayed tasks on shutdown (CASSANDRA-11062) + * Hadoop integration is incompatible with Cassandra Driver 3.0.0 (CASSANDRA-11001) + * Add dropped_columns to the list of schema table so it 
gets handled + properly (CASSANDRA-11050) +Merged from 2.2: + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java --
[1/3] cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg
Repository: cassandra Updated Branches: refs/heads/trunk fc9c6faa2 -> babf30dd1 Avoid NPE when serializing ErrorMessage with null msg Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947 Branch: refs/heads/trunk Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183 Parents: c8c8cf6 Author: Tyler HobbsAuthored: Tue Feb 23 11:28:17 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:28:17 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 767eb8a..01e7b3d 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.6 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java -- diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java index 222e833..021db5a 100644 --- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java +++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java @@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response { final TransportException err = getBackwardsCompatibleException(msg, version); 
dest.writeInt(err.code().value); -CBUtil.writeString(err.getMessage(), dest); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +CBUtil.writeString(errorString, dest); switch (err.code()) { @@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response public int encodedSize(ErrorMessage msg, int version) { final TransportException err = getBackwardsCompatibleException(msg, version); -int size = 4 + CBUtil.sizeOfString(err.getMessage()); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +int size = 4 + CBUtil.sizeOfString(errorString); switch (err.code()) { case UNAVAILABLE: http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java -- diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java index 11b0ebd..fc8c41c 100644 --- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java +++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java @@ -113,4 +113,22 @@ public class ProtocolErrorTest { Assert.assertTrue(e.getMessage().contains("Request is too big")); } } + +@Test +public void testErrorMessageWithNullString() throws Exception +{ +// test for CASSANDRA-11167 +ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) null)); +assert msg.toString().endsWith("null") : msg.toString(); +int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION); +ByteBuf buf = Unpooled.buffer(size); +ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION); + +ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{ +0x00, 0x00, 0x00, 0x00, // int error code +0x00, 0x00 // short message length +}); + +Assert.assertEquals(expected, buf); +} }
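The null guard in the patch above can be tried in isolation with the sketch below. It is a simplified stand-in, not Cassandra code: `NullSafeStringCodec` and its methods are hypothetical names, it uses `java.nio.ByteBuffer` instead of Netty's `ByteBuf`, and it assumes the same layout the test expects (a 4-byte int error code followed by a 2-byte length and the UTF-8 bytes of the message).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the CBUtil string helpers touched by the patch.
final class NullSafeStringCodec {
    // Size of a string on the wire: 2-byte length prefix plus UTF-8 payload.
    // A null message is treated as the empty string, as in the patch.
    static int sizeOfString(String s) {
        String safe = (s == null) ? "" : s;
        return 2 + safe.getBytes(StandardCharsets.UTF_8).length;
    }

    // Writes the length prefix and payload; never dereferences a null message.
    // Assumes the encoded message fits in an unsigned 16-bit length.
    static void writeString(String s, ByteBuffer dest) {
        String safe = (s == null) ? "" : s;
        byte[] bytes = safe.getBytes(StandardCharsets.UTF_8);
        dest.putShort((short) bytes.length);
        dest.put(bytes);
    }

    // Encodes an error as: int code, then the length-prefixed message.
    static byte[] encodeError(int code, String message) {
        ByteBuffer buf = ByteBuffer.allocate(4 + sizeOfString(message));
        buf.putInt(code);
        writeString(message, buf);
        return buf.array();
    }
}
```

Applying the same substitution in both the size calculation and the write keeps the two in agreement, which mirrors why the patch changes `encodedSize` and `encode` together.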
[2/2] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9abaabf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9abaabf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9abaabf Branch: refs/heads/cassandra-3.0 Commit: e9abaabfe83f74b1ef7c0273bdd7738402fb0ebc Parents: 037d24e 5009594 Author: Tyler HobbsAuthored: Tue Feb 23 11:29:04 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:29:04 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/CHANGES.txt -- diff --cc CHANGES.txt index a675016,01e7b3d..cd2a930 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,21 -1,5 +1,22 @@@ -2.2.6 +3.0.4 + * Introduce backpressure for hints (CASSANDRA-10972) + * Fix ClusteringPrefix not being able to read tombstone range boundaries (CASSANDRA-11158) + * Prevent logging in sandboxed state (CASSANDRA-11033) + * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721) + * Add query time validation method on Index (CASSANDRA-11043) + * Avoid potential AssertionError in mixed version cluster (CASSANDRA-11128) + * Properly handle hinted handoff after topology changes (CASSANDRA-5902) + * AssertionError when listing sstable files on inconsistent disk state (CASSANDRA-11156) + * Fix wrong rack counting and invalid conditions check for TokenAllocation + (CASSANDRA-11139) + * Avoid creating empty hint files (CASSANDRA-11090) + * Fix leak detection strong reference loop using weak reference (CASSANDRA-11120) + * Configurie BatchlogManager to stop delayed tasks on shutdown (CASSANDRA-11062) + * Hadoop integration is incompatible with Cassandra Driver 3.0.0 (CASSANDRA-11001) + * Add dropped_columns to the list of schema 
table so it gets handled + properly (CASSANDRA-11050) +Merged from 2.2: + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java --
cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 c8c8cf679 -> 50095947e Avoid NPE when serializing ErrorMessage with null msg Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947 Branch: refs/heads/cassandra-2.2 Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183 Parents: c8c8cf6 Author: Tyler HobbsAuthored: Tue Feb 23 11:28:17 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:28:17 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 767eb8a..01e7b3d 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.6 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java -- diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java index 222e833..021db5a 100644 --- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java +++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java @@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response { final TransportException err = getBackwardsCompatibleException(msg, 
version); dest.writeInt(err.code().value); -CBUtil.writeString(err.getMessage(), dest); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +CBUtil.writeString(errorString, dest); switch (err.code()) { @@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response public int encodedSize(ErrorMessage msg, int version) { final TransportException err = getBackwardsCompatibleException(msg, version); -int size = 4 + CBUtil.sizeOfString(err.getMessage()); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +int size = 4 + CBUtil.sizeOfString(errorString); switch (err.code()) { case UNAVAILABLE: http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java -- diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java index 11b0ebd..fc8c41c 100644 --- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java +++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java @@ -113,4 +113,22 @@ public class ProtocolErrorTest { Assert.assertTrue(e.getMessage().contains("Request is too big")); } } + +@Test +public void testErrorMessageWithNullString() throws Exception +{ +// test for CASSANDRA-11167 +ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) null)); +assert msg.toString().endsWith("null") : msg.toString(); +int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION); +ByteBuf buf = Unpooled.buffer(size); +ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION); + +ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{ +0x00, 0x00, 0x00, 0x00, // int error code +0x00, 0x00 // short message length +}); + +Assert.assertEquals(expected, buf); +} }
[1/2] cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 037d24efd -> e9abaabfe Avoid NPE when serializing ErrorMessage with null msg Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947 Branch: refs/heads/cassandra-3.0 Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183 Parents: c8c8cf6 Author: Tyler HobbsAuthored: Tue Feb 23 11:28:17 2016 -0600 Committer: Tyler Hobbs Committed: Tue Feb 23 11:28:17 2016 -0600 -- CHANGES.txt | 1 + .../transport/messages/ErrorMessage.java | 6 -- .../cassandra/transport/ProtocolErrorTest.java| 18 ++ 3 files changed, 23 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 767eb8a..01e7b3d 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.6 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167) * Replacing an aggregate with a new version doesn't reset INITCOND (CASSANDRA-10840) * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037) * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793) http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java -- diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java index 222e833..021db5a 100644 --- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java +++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java @@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response { final TransportException err = getBackwardsCompatibleException(msg, 
version); dest.writeInt(err.code().value); -CBUtil.writeString(err.getMessage(), dest); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +CBUtil.writeString(errorString, dest); switch (err.code()) { @@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response public int encodedSize(ErrorMessage msg, int version) { final TransportException err = getBackwardsCompatibleException(msg, version); -int size = 4 + CBUtil.sizeOfString(err.getMessage()); +String errorString = err.getMessage() == null ? "" : err.getMessage(); +int size = 4 + CBUtil.sizeOfString(errorString); switch (err.code()) { case UNAVAILABLE: http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java -- diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java index 11b0ebd..fc8c41c 100644 --- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java +++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java @@ -113,4 +113,22 @@ public class ProtocolErrorTest { Assert.assertTrue(e.getMessage().contains("Request is too big")); } } + +@Test +public void testErrorMessageWithNullString() throws Exception +{ +// test for CASSANDRA-11167 +ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) null)); +assert msg.toString().endsWith("null") : msg.toString(); +int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION); +ByteBuf buf = Unpooled.buffer(size); +ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION); + +ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{ +0x00, 0x00, 0x00, 0x00, // int error code +0x00, 0x00 // short message length +}); + +Assert.assertEquals(expected, buf); +} }
[jira] [Commented] (CASSANDRA-7464) Replace sstable2json and json2sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159235#comment-15159235 ] Yuki Morishita commented on CASSANDRA-7464: --- Fixed one more bug (handle case sensitive column name) and backported to 3.0 as well. ||branch||testall||dtest|| |[7464-3.0|https://github.com/yukim/cassandra/tree/7464-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-3.0-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-3.0-dtest/lastCompletedBuild/testReport/]| |[7464|https://github.com/yukim/cassandra/tree/7464]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-dtest/lastCompletedBuild/testReport/]| Tests are running. > Replace sstable2json and json2sstable > - > > Key: CASSANDRA-7464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7464 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Chris Lohfink >Priority: Minor > Fix For: 3.0.x, 3.x > > Attachments: sstable-only.patch, sstabledump.patch > > > Both tools are pretty awful. They are primarily meant for debugging (there is > much more efficient and convenient ways to do import/export data), but their > output manage to be hard to handle both for humans and for tools (especially > as soon as you have modern stuff like composites). > There is value to having tools to export sstable contents into a format that > is easy to manipulate by human and tools for debugging, small hacks and > general tinkering, but sstable2json and json2sstable are not that. > So I propose that we deprecate those tools and consider writing better > replacements. It shouldn't be too hard to come up with an output format that > is more aware of modern concepts like composites, UDTs, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7464) Replace sstable2json and json2sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-7464: -- Fix Version/s: 3.0.x > Replace sstable2json and json2sstable > - > > Key: CASSANDRA-7464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7464 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Chris Lohfink >Priority: Minor > Fix For: 3.0.x, 3.x > > Attachments: sstable-only.patch, sstabledump.patch > > > Both tools are pretty awful. They are primarily meant for debugging (there is > much more efficient and convenient ways to do import/export data), but their > output manage to be hard to handle both for humans and for tools (especially > as soon as you have modern stuff like composites). > There is value to having tools to export sstable contents into a format that > is easy to manipulate by human and tools for debugging, small hacks and > general tinkering, but sstable2json and json2sstable are not that. > So I propose that we deprecate those tools and consider writing better > replacements. It shouldn't be too hard to come up with an output format that > is more aware of modern concepts like composites, UDTs, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159202#comment-15159202 ] Marcus Olsson commented on CASSANDRA-11215: --- After looking around a bit in the dtests I think that self.ignore_log_patterns could handle that, although our expected error is causing more errors than "Cannot start multiple repairs". There are errors logged from nodetool and the repair sessions as well. But for this test I guess the important part is that there are no "LEAK DETECTED" error messages, right? Assuming that is the case, could we simply ignore the other repair errors? > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible
[ https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159153#comment-15159153 ] xiaodong wang commented on CASSANDRA-8110: -- Thanks for your prompt response, Paulo. Some background: I currently have a C* 2.1.x cluster and am trying to migrate the data to C* 3.3. I have already tried a few approaches: * Set up another 3.3 DC and ran "rebuild"; * Set up a new 3.3 cluster and ran "sstableupgrade" and "sstableloader". Neither worked. > Make streaming backwards compatible > --- > > Key: CASSANDRA-8110 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8110 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Marcus Eriksson > Labels: gsoc2016, mentor > Fix For: 3.x > > > To be able to seamlessly upgrade clusters we need to make it possible to > stream files between nodes with different StreamMessage.CURRENT_VERSION -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-11164: --- Attachment: 11164-2.2_2_call_filterCipherSuites_everywhere.patch > Order and filter cipher suites correctly > > > Key: CASSANDRA-11164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11164 > Project: Cassandra > Issue Type: Bug >Reporter: Tom Petracca >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 2.2.x > > Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, > 11164-2.2_2_call_filterCipherSuites_everywhere.patch > > > As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, > SSLFactory.filterCipherSuites() doesn't respect the ordering of desired > ciphers in cassandra.yaml. > Also the fix that occurred for > https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs > to be applied to all locations where we create an SSLSocket so that JCE is > not required out of the box or with additional configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-11164: --- Attachment: (was: 11164-on-10508-2.2.patch) > Order and filter cipher suites correctly > > > Key: CASSANDRA-11164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11164 > Project: Cassandra > Issue Type: Bug >Reporter: Tom Petracca >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 2.2.x > > Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, > 11164-2.2_2_call_filterCipherSuites_everywhere.patch > > > As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, > SSLFactory.filterCipherSuites() doesn't respect the ordering of desired > ciphers in cassandra.yaml. > Also the fix that occurred for > https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs > to be applied to all locations where we create an SSLSocket so that JCE is > not required out of the box or with additional configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-11164: --- Attachment: 11164-2.2_1_preserve_cipher_order.patch > Order and filter cipher suites correctly > > > Key: CASSANDRA-11164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11164 > Project: Cassandra > Issue Type: Bug >Reporter: Tom Petracca >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 2.2.x > > Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, > 11164-2.2_2_call_filterCipherSuites_everywhere.patch > > > As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, > SSLFactory.filterCipherSuites() doesn't respect the ordering of desired > ciphers in cassandra.yaml. > Also the fix that occurred for > https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs > to be applied to all locations where we create an SSLSocket so that JCE is > not required out of the box or with additional configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11164) Order and filter cipher suites correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159131#comment-15159131 ] Stefan Podkowinski commented on CASSANDRA-11164: Scope of this ticket as reported would be: - respect ordering of enabled ciphers - apply cipher filtering wherever SSL is used I've now created two patches for that: - {{11164-2.2_1_preserve_cipher_order.patch}} - cherry picked {{filterCipherSuites}} implementation and unit test from CASSANDRA-10508 with some of your suggested changes - {{11164-2.2_2_call_filterCipherSuites_everywhere.patch}} - this is {{11164-2.2.txt}} from Tom minus the {{filterCipherSuites}} implementation ||2.2|| |[Branch|https://github.com/spodkowinski/cassandra/commits/CASSANDRA-11164]| |[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-11164-testall/]| |[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-11164-dtest/]| > Order and filter cipher suites correctly > > > Key: CASSANDRA-11164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11164 > Project: Cassandra > Issue Type: Bug >Reporter: Tom Petracca >Assignee: Stefan Podkowinski >Priority: Minor > Fix For: 2.2.x > > Attachments: 11164-2.2.txt, 11164-on-10508-2.2.patch > > > As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, > SSLFactory.filterCipherSuites() doesn't respect the ordering of desired > ciphers in cassandra.yaml. > Also the fix that occurred for > https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs > to be applied to all locations where we create an SSLSocket so that JCE is > not required out of the box or with additional configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
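The first scope item in the comment above (keep the configured cipher order while dropping suites the JVM does not support) might look roughly like the following. This is a minimal sketch, not the code from either attached patch: the class name `CipherFilterSketch` and the method signature are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not the SSLFactory from the patches.
final class CipherFilterSketch
{
    // Returns the desired suites that are also supported,
    // in the order in which they were configured.
    static String[] filterCipherSuites(String[] supported, String[] desired)
    {
        Set<String> supportedSet = new HashSet<>(Arrays.asList(supported));
        List<String> result = new ArrayList<>();
        for (String suite : desired)
        {
            // Iterate over the desired list so its ordering is preserved.
            if (supportedSet.contains(suite))
                result.add(suite);
        }
        return result.toArray(new String[0]);
    }
}
```

For the second scope item, the filtered array would presumably be handed to `SSLSocket.setEnabledCipherSuites(...)`, with `getSupportedCipherSuites()` as the `supported` input, at every place a socket is created.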
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159110#comment-15159110 ] Marcus Eriksson commented on CASSANDRA-11209: - ok, with incremental repair an sstable can only be involved in a single repair session (since we are going to anticompact it after, the sstable would be gone after the second repair finished) It should of course not mess up the live size, I will try to reproduce. (hmm.. maybe it is CASSANDRA-11215) > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
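The reasoning in the quoted description (cleanup, and with it the metric decrement, only runs once the reference count reaches 0) can be illustrated with a toy reference counter. Everything here, including `RefCountedSketch` and `onLastRelease`, is a made-up stand-in and not Cassandra's own reference tracking or `SSTableDeletingTask`; it only shows why a single unreleased reference would keep the cleanup from ever running.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted resource.
final class RefCountedSketch
{
    // Starts at 1: the owner's initial reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private final Runnable onLastRelease;

    RefCountedSketch(Runnable onLastRelease)
    {
        this.onLastRelease = onLastRelease;
    }

    // Takes an extra reference; fails once the resource is fully released.
    boolean tryRef()
    {
        while (true)
        {
            int current = refs.get();
            if (current <= 0)
                return false;
            if (refs.compareAndSet(current, current + 1))
                return true;
        }
    }

    // Drops one reference; the cleanup runs only when the count reaches 0.
    void release()
    {
        int remaining = refs.decrementAndGet();
        if (remaining == 0)
            onLastRelease.run();
        else if (remaining < 0)
            throw new IllegalStateException("released more often than referenced");
    }
}
```

If some code path calls `tryRef()` without a matching `release()`, the count never drops to 0 and `onLastRelease` never fires, which would match the symptom of metrics and files only being reconciled after a restart.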
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159101#comment-15159101 ] Jose Fernandez commented on CASSANDRA-11209: This is the error on 10.1.29.31
ERROR 22:08:05 Cannot start multiple repair sessions over the same sstables
ERROR 22:08:05 Failed creating a merkle tree for [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]], /10.1.28.32 (see log for details)
ERROR 22:08:05 Exception in thread Thread[ValidationExecutor:8,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1043) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:89) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:692) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
> SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159099#comment-15159099 ] Marcus Eriksson commented on CASSANDRA-11209: - and on 10.1.29.31 ? > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159096#comment-15159096 ] Jose Fernandez edited comment on CASSANDRA-11209 at 2/23/16 4:06 PM: - Actually, I just spotted an error during repair: ``` ERROR 22:08:05 [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee] session completed with the following error org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] ERROR 22:08:05 Repair session a85c9760-d9b0-11e5-9b9c-c12de94ec9ee for range (7686143364045646505,-6148914691236517207] failed with error org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 
at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.8.0_66] at java.util.concurrent.FutureTask.get(FutureTask.java:192) [na:1.8.0_66] at org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:3048) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) [apache-cassandra-2.1.13.jar:2.1.13] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_66] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_66] ... 
1 common frames omitted Caused by: org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-2.1.13.jar:2.1.13] ... 3 common frames omitted ERROR 22:08:05 Exception in thread Thread[AntiEntropySessions:1,5,jolokia] java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on timeslice_store/minute_timeslice_blobs, (7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31 at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at
[jira] [Updated] (CASSANDRA-11176) SSTableRewriter.InvalidateKeys should have a weak reference to cache
[ https://issues.apache.org/jira/browse/CASSANDRA-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11176: Assignee: Marcus Eriksson > SSTableRewriter.InvalidateKeys should have a weak reference to cache > > > Key: CASSANDRA-11176 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11176 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeremiah Jordan >Assignee: Marcus Eriksson > Fix For: 3.0.x > > > From [~aweisberg] > bq. The SSTableReader.DropPageCache runnable references > SSTableRewriter.InvalidateKeys which references the cache. The cache > reference should be a WeakReference. > {noformat} > ERROR [Strong-Reference-Leak-Detector:1] 2016-02-17 14:51:52,111 > NoSpamLogger.java:97 - Strong self-ref loop detected > [/var/lib/cassandra/data/keyspace1/standard1-990bc741d56411e591d5590d7a7ad312/ma-20-big, > private java.lang.Runnable > org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.runOnClose-org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache, > final java.lang.Runnable > org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache.andThen-org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys, > final org.apache.cassandra.cache.InstrumentingCache > org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys.cache-org.apache.cassandra.cache.AutoSavingCache, > protected volatile java.util.concurrent.ScheduledFuture > org.apache.cassandra.cache.AutoSavingCache.saveTask-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask, > final java.util.concurrent.ScheduledThreadPoolExecutor > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.this$0-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor, > private final java.util.concurrent.BlockingQueue > java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue, > private final 
java.util.concurrent.BlockingQueue > java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask, > private java.util.concurrent.Callable > java.util.concurrent.FutureTask.callable-java.util.concurrent.Executors$RunnableAdapter, > final java.lang.Runnable > java.util.concurrent.Executors$RunnableAdapter.task-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable, > private final java.lang.Runnable > org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.runnable-org.apache.cassandra.db.ColumnFamilyStore$3, > final org.apache.cassandra.db.ColumnFamilyStore > org.apache.cassandra.db.ColumnFamilyStore$3.this$0-org.apache.cassandra.db.ColumnFamilyStore, > public final org.apache.cassandra.db.Keyspace > org.apache.cassandra.db.ColumnFamilyStore.keyspace-org.apache.cassandra.db.Keyspace, > private final java.util.concurrent.ConcurrentMap > org.apache.cassandra.db.Keyspace.columnFamilyStores-java.util.concurrent.ConcurrentHashMap, > private final java.util.concurrent.ConcurrentMap > org.apache.cassandra.db.Keyspace.columnFamilyStores-org.apache.cassandra.db.ColumnFamilyStore, > private final org.apache.cassandra.db.lifecycle.Tracker > org.apache.cassandra.db.ColumnFamilyStore.data-org.apache.cassandra.db.lifecycle.Tracker, > final java.util.concurrent.atomic.AtomicReference > org.apache.cassandra.db.lifecycle.Tracker.view-java.util.concurrent.atomic.AtomicReference, > private volatile java.lang.Object > java.util.concurrent.atomic.AtomicReference.value-org.apache.cassandra.db.lifecycle.View, > public final java.util.List > org.apache.cassandra.db.lifecycle.View.liveMemtables-com.google.common.collect.SingletonImmutableList, > final transient java.lang.Object > com.google.common.collect.SingletonImmutableList.element-org.apache.cassandra.db.Memtable, > private final org.apache.cassandra.utils.memory.MemtableAllocator > 
org.apache.cassandra.db.Memtable.allocator-org.apache.cassandra.utils.memory.SlabAllocator, > private final > org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator > org.apache.cassandra.utils.memory.MemtableAllocator.onHeap-org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator, > private final org.apache.cassandra.utils.memory.MemtablePool$SubPool > org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.parent-org.apache.cassandra.utils.memory.MemtablePool$SubPool, > final org.apache.cassandra.utils.memory.MemtablePool > org.apache.cassandra.utils.memory.MemtablePool$SubPool.this$0-org.apache.cassandra.utils.memory.SlabPool, > final org.apache.cassandra.utils.memory.MemtableCleanerThread >
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159089#comment-15159089 ] Marcus Eriksson commented on CASSANDRA-11209: - Very strange. I assume there are no error messages/exceptions in the logs on that node? > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159085#comment-15159085 ] Jose Fernandez commented on CASSANDRA-11209: Yes, out of the 4-node cluster, only one node is showing this behavior (it's always the same one). Repairs run on all of them. They all have the same version of Cassandra and the exact same settings (we've Dockerized it). > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159053#comment-15159053 ] Marcus Olsson commented on CASSANDRA-11215: --- I could try to do it. I guess it more or less would be to go through the reproduction steps and grep in the logs for reference leak, right? I'll put it in the repair_test.py dtest. :) > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159072#comment-15159072 ] Marcus Eriksson commented on CASSANDRA-11215: - Yes, only tricky part is to handle the expected "Cannot start multiple repairs" error message > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
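The check discussed in this thread (grep the node logs for leak reports while tolerating the expected repair-collision error) could be sketched as below. It is written in Java only to stay self-contained, whereas the real test would live in the Python dtest `repair_test.py`. The class name, method names and the exact substrings matched are assumptions, taken from the log excerpts quoted in these tickets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical log checker; the markers are assumed from the quoted log lines.
final class LeakLogCheckSketch
{
    static final String LEAK_MARKER = "LEAK DETECTED";
    static final String EXPECTED_ERROR = "Cannot start multiple repair sessions";

    // Lines that report a leaked reference; a passing test expects none.
    static List<String> leakLines(List<String> logLines)
    {
        List<String> leaks = new ArrayList<>();
        for (String line : logLines)
        {
            if (line.contains(LEAK_MARKER))
                leaks.add(line);
        }
        return leaks;
    }

    // ERROR lines other than the expected "multiple repair sessions" message.
    static List<String> unexpectedErrors(List<String> logLines)
    {
        List<String> errors = new ArrayList<>();
        for (String line : logLines)
        {
            if (line.startsWith("ERROR") && !line.contains(EXPECTED_ERROR))
                errors.add(line);
        }
        return errors;
    }
}
```

A real test would likely need a longer allow-list, since the expected collision also produces follow-up ERROR lines (for example the failed merkle tree message) that do not contain the same substring.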
[jira] [Resolved] (CASSANDRA-11108) Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test on 2.1 and 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-11108. -- Resolution: Fixed Sure, if it passes it means we have updated the python driver somehow, we're good. > Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test > on 2.1 and 2.2 > --- > > Key: CASSANDRA-11108 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11108 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne > > The aforementioned test fails on 2.1 and 2.2 (the only branches on which it is > run actually) due to https://datastax-oss.atlassian.net/browse/PYTHON-459. > That ticket has been fixed but I don't think the version incorporating it has > been released yet. This ticket is so we don't forget to act once said > version is released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159081#comment-15159081 ] Marcus Eriksson commented on CASSANDRA-11209: - is there only one node showing this? not all nodes involved in the repairs? > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159074#comment-15159074 ] Jan Urbański commented on CASSANDRA-11209: -- BTW: the initial bump in Live and Total disk space was caused by a Cassandra restart, similar to the previous graph. The slowly diverging Live vs Total disk space afterwards is what's worrying. > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11108) Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test on 2.1 and 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159075#comment-15159075 ] Jim Witschey commented on CASSANDRA-11108: -- This seems to be passing now: http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/cql_tests/MiscellaneousCQLTester/large_collection_errors_test/history/ http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/junit/cql_tests/MiscellaneousCQLTester/large_collection_errors_test/history/ Shall we close this? From the description of the issue, I'm not sure whether more needs to be done. > Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test > on 2.1 and 2.2 > --- > > Key: CASSANDRA-11108 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11108 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne > > The aforementioned test fails on 2.1 and 2.2 (the only branches on which it is > actually run) due to https://datastax-oss.atlassian.net/browse/PYTHON-459. > That ticket has been fixed but I don't think the version incorporating it has > been released yet. This ticket is so we don't forget to act once said > version is released. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jose Fernandez updated CASSANDRA-11209: --- Attachment: screenshot-2.png > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159061#comment-15159061 ] Jose Fernandez commented on CASSANDRA-11209: [~krummas] we currently have a cluster that's showing this issue in staging. You can see in the screenshot attached the growing divergence in Live vs Total. We haven't restarted the node yet in hopes you could point us at some things we could look at to debug this. !screenshot-2.png! > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jose Fernandez updated CASSANDRA-11209: --- Attachment: (was: screenshot-2.png) > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png, screenshot-2.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11203: Priority: Trivial (was: Major) > Improve nothing to repair message when RF=1 > --- > > Key: CASSANDRA-11203 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11203 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: debian jesse up to date content >Reporter: Jason Kania >Priority: Trivial > Labels: lhf > > When nodetool repair is run, it indicates that no repair is needed on some > keyspaces but on others it attempts repair. However, when run multiple times, > the output seems to indicate that the same triggering conditions still > persists that indicate a problem. Alternatively, the output could indicate > that the underlying condition has not been resolved. > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:12,144] Repair completed successfully > [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:33,324] Repair completed 
successfully > [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second -- This message was sent by Atlassian JIRA (v6.3.4#6332)
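With a replication factor of 1 there is no second replica to compare data against, so "Nothing to repair" is expected on every run. The kind of message improvement the summary asks for might look like the sketch below. It is hypothetical: RepairMessages, nothingToRepair, and the RF=1 wording are assumptions for illustration, not the actual nodetool code or the committed text.

```java
// Hypothetical helper that picks a more explicit message when RF=1.
final class RepairMessages
{
    static String nothingToRepair(String keyspace, int replicationFactor)
    {
        if (replicationFactor <= 1)
        {
            // Assumed wording: name the reason so repeated runs are not read as a fault.
            return String.format("Replication factor is 1. No repair is needed for keyspace '%s'", keyspace);
        }
        // The message shown in the report above.
        return String.format("Nothing to repair for keyspace '%s'", keyspace);
    }

    public static void main(String[] args)
    {
        System.out.println(nothingToRepair("sensordb", 1));
        System.out.println(nothingToRepair("sensordb", 3));
    }
}
```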
[jira] [Commented] (CASSANDRA-10972) File based hints don't implement backpressure and can OOM
[ https://issues.apache.org/jira/browse/CASSANDRA-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159050#comment-15159050 ] Aleksey Yeschenko commented on CASSANDRA-10972: --- LGTM from me as well. Committed as [037d24efdf83bd2736556f9880c5e1f6be48fa77|https://github.com/apache/cassandra/commit/037d24efdf83bd2736556f9880c5e1f6be48fa77] to 3.0 and merged with trunk, thanks. > File based hints don't implement backpressure and can OOM > - > > Key: CASSANDRA-10972 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10972 > Project: Cassandra > Issue Type: Bug >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg >Priority: Minor > Fix For: 3.0.x, 3.x > > > This is something I reproduced in practice. I have what I think is a > reasonable implementation of backpressure, but still need to put together a > unit test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11053: Reviewer: Adam Holmberg (was: Paulo Motta) > COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > Running COPY FROM on a large dataset (20G divided into 20M records) revealed > two issues: > * The progress report is incorrect: it is very slow until almost the end of > the test, at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > making it roughly 1.4 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11203: Summary: Improve nothing to repair message when RF=1 (was: nodetool repair not performing repair or being incorrectly triggered in 3.0.3) > Improve nothing to repair message when RF=1 > --- > > Key: CASSANDRA-11203 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11203 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: debian jesse up to date content >Reporter: Jason Kania > Labels: lhf > > When nodetool repair is run, it indicates that no repair is needed on some > keyspaces but on others it attempts repair. However, when run multiple times, > the output seems to indicate that the same triggering conditions still > persists that indicate a problem. Alternatively, the output could indicate > that the underlying condition has not been resolved. > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:12,144] Repair completed successfully > [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], 
dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:33,324] Repair completed successfully > [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1
[ https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11203: Labels: lhf (was: ) > Improve nothing to repair message when RF=1 > --- > > Key: CASSANDRA-11203 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11203 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: debian jesse up to date content >Reporter: Jason Kania >Priority: Trivial > Labels: lhf > > When nodetool repair is run, it indicates that no repair is needed on some > keyspaces but on others it attempts repair. However, when run multiple times, > the output seems to indicate that the same triggering conditions still > persists that indicate a problem. Alternatively, the output could indicate > that the underlying condition has not been resolved. > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:12,144] Repair completed successfully > [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:33,324] Repair completed successfully > 
[2016-02-21 23:33:33,334] Repair command #2 finished in 1 second -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] cassandra git commit: Introduce backpressure for hints
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 fe37e0644 -> 037d24efd
  refs/heads/trunk 4b27287cd -> fc9c6faa2

Introduce backpressure for hints

patch by Ariel Weisberg; reviewed by Benedict Elliott Smith for CASSANDRA-10972

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/037d24ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/037d24ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/037d24ef

Branch: refs/heads/cassandra-3.0
Commit: 037d24efdf83bd2736556f9880c5e1f6be48fa77
Parents: fe37e06
Author: Ariel Weisberg
Authored: Mon Dec 28 16:32:05 2015 -0500
Committer: Aleksey Yeschenko
Committed: Tue Feb 23 15:28:41 2016 +

--
 CHANGES.txt                                     |  1 +
 build.xml                                       | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java     |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java    | 75
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index da91594..a675016 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.4
+ * Introduce backpressure for hints (CASSANDRA-10972)
  * Fix ClusteringPrefix not being able to read tombstone range boundaries (CASSANDRA-11158)
  * Prevent logging in sandboxed state (CASSANDRA-11033)
  * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/build.xml
--
diff --git a/build.xml b/build.xml
index d27b77a..6ef99fd 100644
--- a/build.xml
+++ b/build.xml
@@ -111,6 +111,8 @@ + +
@@ -382,6 +384,11 @@ + + + + +
@@ -479,7 +486,10 @@ artifactId="cassandra-parent" version="${version}"/> - + + + + +
@@ -1701,6 +1712,7 @@ + ]]>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/src/java/org/apache/cassandra/hints/HintsBufferPool.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsBufferPool.java b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
index 83b155a..25f9bc1 100644
--- a/src/java/org/apache/cassandra/hints/HintsBufferPool.java
+++ b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
@@ -17,10 +17,11 @@
  */
 package org.apache.cassandra.hints;
 
-import java.util.Queue;
 import java.util.UUID;
-import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
 
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.net.MessagingService;
 
 /**
@@ -34,15 +35,16 @@ final class HintsBufferPool
         void flush(HintsBuffer buffer, HintsBufferPool pool);
     }
 
+    static final int MAX_ALLOCATED_BUFFERS = Integer.getInteger(Config.PROPERTY_PREFIX + "MAX_HINT_BUFFERS", 3);
     private volatile HintsBuffer currentBuffer;
-    private final Queue<HintsBuffer> reserveBuffers;
+    private final BlockingQueue<HintsBuffer> reserveBuffers;
     private final int bufferSize;
     private final FlushCallback flushCallback;
+    private int allocatedBuffers = 0;
 
     HintsBufferPool(int bufferSize, FlushCallback flushCallback)
     {
-        reserveBuffers = new ConcurrentLinkedQueue<>();
-
+        reserveBuffers = new LinkedBlockingQueue<>();
         this.bufferSize = bufferSize;
         this.flushCallback = flushCallback;
     }
@@ -78,13 +80,10 @@ final class HintsBufferPool
         }
     }
 
-    boolean offer(HintsBuffer buffer)
+    void offer(HintsBuffer buffer)
     {
-        if (!reserveBuffers.isEmpty())
-            return false;
-
-        reserveBuffers.offer(buffer);
-        return true;
+        if (!reserveBuffers.offer(buffer))
+            throw new RuntimeException("Failed to store buffer");
     }
 
     // A wrapper to ensure a non-null currentBuffer value on the first call.
@@ -108,6 +107,18 @@ final class HintsBufferPool
             return false;
 
         HintsBuffer buffer = reserveBuffers.poll();
+        if (buffer == null && allocatedBuffers >= MAX_ALLOCATED_BUFFERS)
+        {
+            try
+
[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fc9c6faa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fc9c6faa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fc9c6faa

Branch: refs/heads/trunk
Commit: fc9c6faa23f5662ce8d7caedb5d7f1ac3d1fcea6
Parents: 4b27287 037d24e
Author: Aleksey Yeschenko
Authored: Tue Feb 23 15:30:35 2016 +
Committer: Aleksey Yeschenko
Committed: Tue Feb 23 15:30:35 2016 +

--
 CHANGES.txt                                     |  1 +
 build.xml                                       | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java     |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java    | 75
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc9c6faa/CHANGES.txt
--
diff --cc CHANGES.txt
index 01f0e84,a675016..dd67598
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,32 -1,5 +1,33 @@@
-3.0.4
+3.4
+ * fix OnDiskIndexTest to properly treat empty ranges (CASSANDRA-11205)
+ * fix TrackerTest to handle new notifications (CASSANDRA-11178)
+ * add SASI validation for partitioner and complex columns (CASSANDRA-11169)
+ * Add caching of encrypted credentials in PasswordAuthenticator (CASSANDRA-7715)
+ * fix SASI memtable switching on flush (CASSANDRA-11159)
+ * Remove duplicate offline compaction tracking (CASSANDRA-11148)
+ * fix EQ semantics of analyzed SASI indexes (CASSANDRA-11130)
+ * Support long name output for nodetool commands (CASSANDRA-7950)
+ * Encrypted hints (CASSANDRA-11040)
+ * SASI index options validation (CASSANDRA-11136)
+ * Optimize disk seek using min/max column name meta data when the LIMIT clause is used
+   (CASSANDRA-8180)
+ * Add LIKE support to CQL3 (CASSANDRA-11067)
+ * Generic Java UDF types (CASSANDRA-10819)
+ * cqlsh: Include sub-second precision in timestamps by default (CASSANDRA-10428)
+ * Set javac encoding to utf-8 (CASSANDRA-11077)
+ * Integrate SASI index into Cassandra (CASSANDRA-10661)
+ * Add --skip-flush option to nodetool snapshot
+ * Skip values for non-queried columns (CASSANDRA-10657)
+ * Add support for secondary indexes on static columns (CASSANDRA-8103)
+ * CommitLogUpgradeTestMaker creates broken commit logs (CASSANDRA-11051)
+ * Add metric for number of dropped mutations (CASSANDRA-10866)
+ * Simplify row cache invalidation code (CASSANDRA-10396)
+ * Support user-defined compaction through nodetool (CASSANDRA-10660)
+ * Stripe view locks by key and table ID to reduce contention (CASSANDRA-10981)
+ * Add nodetool gettimeout and settimeout commands (CASSANDRA-10953)
+ * Add 3.0 metadata to sstablemetadata output (CASSANDRA-10838)
+Merged from 3.0:
+ * Introduce backpressure for hints (CASSANDRA-10972)
  * Fix ClusteringPrefix not being able to read tombstone range boundaries (CASSANDRA-11158)
  * Prevent logging in sandboxed state (CASSANDRA-11033)
  * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc9c6faa/build.xml
--
[2/3] cassandra git commit: Introduce backpressure for hints
Introduce backpressure for hints

patch by Ariel Weisberg; reviewed by Benedict Elliott Smith for CASSANDRA-10972

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/037d24ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/037d24ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/037d24ef

Branch: refs/heads/trunk
Commit: 037d24efdf83bd2736556f9880c5e1f6be48fa77
Parents: fe37e06
Author: Ariel Weisberg
Authored: Mon Dec 28 16:32:05 2015 -0500
Committer: Aleksey Yeschenko
Committed: Tue Feb 23 15:28:41 2016 +

--
 CHANGES.txt                                     |  1 +
 build.xml                                       | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java     |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java    | 75
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index da91594..a675016 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.4
+ * Introduce backpressure for hints (CASSANDRA-10972)
  * Fix ClusteringPrefix not being able to read tombstone range boundaries (CASSANDRA-11158)
  * Prevent logging in sandboxed state (CASSANDRA-11033)
  * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/build.xml
--
diff --git a/build.xml b/build.xml
index d27b77a..6ef99fd 100644
--- a/build.xml
+++ b/build.xml
@@ -111,6 +111,8 @@ + +
@@ -382,6 +384,11 @@ + + + + +
@@ -479,7 +486,10 @@ artifactId="cassandra-parent" version="${version}"/> - + + + + +
@@ -1701,6 +1712,7 @@ + ]]>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/src/java/org/apache/cassandra/hints/HintsBufferPool.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsBufferPool.java b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
index 83b155a..25f9bc1 100644
--- a/src/java/org/apache/cassandra/hints/HintsBufferPool.java
+++ b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
@@ -17,10 +17,11 @@
  */
 package org.apache.cassandra.hints;
 
-import java.util.Queue;
 import java.util.UUID;
-import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
 
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.net.MessagingService;
 
 /**
@@ -34,15 +35,16 @@ final class HintsBufferPool
         void flush(HintsBuffer buffer, HintsBufferPool pool);
     }
 
+    static final int MAX_ALLOCATED_BUFFERS = Integer.getInteger(Config.PROPERTY_PREFIX + "MAX_HINT_BUFFERS", 3);
     private volatile HintsBuffer currentBuffer;
-    private final Queue<HintsBuffer> reserveBuffers;
+    private final BlockingQueue<HintsBuffer> reserveBuffers;
     private final int bufferSize;
     private final FlushCallback flushCallback;
+    private int allocatedBuffers = 0;
 
     HintsBufferPool(int bufferSize, FlushCallback flushCallback)
     {
-        reserveBuffers = new ConcurrentLinkedQueue<>();
-
+        reserveBuffers = new LinkedBlockingQueue<>();
         this.bufferSize = bufferSize;
         this.flushCallback = flushCallback;
     }
@@ -78,13 +80,10 @@ final class HintsBufferPool
         }
     }
 
-    boolean offer(HintsBuffer buffer)
+    void offer(HintsBuffer buffer)
     {
-        if (!reserveBuffers.isEmpty())
-            return false;
-
-        reserveBuffers.offer(buffer);
-        return true;
+        if (!reserveBuffers.offer(buffer))
+            throw new RuntimeException("Failed to store buffer");
     }
 
     // A wrapper to ensure a non-null currentBuffer value on the first call.
@@ -108,6 +107,18 @@ final class HintsBufferPool
             return false;
 
         HintsBuffer buffer = reserveBuffers.poll();
+        if (buffer == null && allocatedBuffers >= MAX_ALLOCATED_BUFFERS)
+        {
+            try
+            {
+                //This BlockingQueue.take is a target for byteman in HintsBufferPoolTest
+                buffer = reserveBuffers.take();
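The diff above is cut off in this archive, so the complete patch cannot be read here. The general technique it points at (cap the number of allocated buffers, and block on a BlockingQueue once the cap is reached) can be sketched in isolation. The sketch below is a simplified assumption, not the committed code: BoundedBufferPool, BackpressureDemo, and the use of byte[] in place of HintsBuffer are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified pool: at most maxBuffers buffers are ever allocated.
final class BoundedBufferPool
{
    private final BlockingQueue<byte[]> reserve = new LinkedBlockingQueue<>();
    private final int bufferSize;
    private final int maxBuffers;
    private int allocated = 0; // guarded by "this"

    BoundedBufferPool(int bufferSize, int maxBuffers)
    {
        this.bufferSize = bufferSize;
        this.maxBuffers = maxBuffers;
    }

    // Reuses a returned buffer if one is available, allocates while under the cap,
    // and otherwise blocks until release() hands a buffer back.
    byte[] acquire() throws InterruptedException
    {
        byte[] buffer = reserve.poll();
        if (buffer != null)
            return buffer;
        synchronized (this)
        {
            if (allocated < maxBuffers)
            {
                allocated++;
                return new byte[bufferSize];
            }
        }
        return reserve.take(); // backpressure: the caller waits instead of allocating more
    }

    void release(byte[] buffer)
    {
        if (!reserve.offer(buffer))
            throw new RuntimeException("Failed to store buffer");
    }

    synchronized int allocatedCount()
    {
        return allocated;
    }
}

final class BackpressureDemo
{
    // Exhausts the pool, then acquires once more while a "flusher" thread returns a buffer.
    // Returns how many buffers were allocated in total (expects maxBuffers >= 1).
    static int allocatedAfterContention(int maxBuffers)
    {
        try
        {
            BoundedBufferPool pool = new BoundedBufferPool(16, maxBuffers);
            byte[] first = pool.acquire();
            for (int i = 1; i < maxBuffers; i++)
                pool.acquire();

            Thread flusher = new Thread(() -> {
                try
                {
                    Thread.sleep(50); // simulate a slow flush to disk
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
                pool.release(first);
            });
            flusher.start();

            pool.acquire(); // blocks until the flusher releases a buffer
            flusher.join();
            return pool.allocatedCount();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("allocated: " + allocatedAfterContention(3));
    }
}
```

The point of the design is that a writer producing buffers faster than they can be flushed is slowed down rather than allowed to allocate without bound, which is the OOM scenario named in the issue title.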
[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible
[ https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159044#comment-15159044 ] Paulo Motta commented on CASSANDRA-8110: What operation are you trying to execute during the upgrade? Repair, bootstrap, or rebuild? Please note that these operations are not supported during upgrades, so you must complete the upgrade before running any of them. > Make streaming backwards compatible > --- > > Key: CASSANDRA-8110 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8110 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Marcus Eriksson > Labels: gsoc2016, mentor > Fix For: 3.x > > > To be able to seamlessly upgrade clusters we need to make it possible to > stream files between nodes with different StreamMessage.CURRENT_VERSION -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[Cassandra Wiki] Update of "Committers" by AlekseyYeschenko
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Committers" page has been changed by AlekseyYeschenko: https://wiki.apache.org/cassandra/Committers?action=diff=56=57 ||Marcus Eriksson ||Apr 2013 ||Datastax || || ||Mikhail Stepura ||Jan 2014 ||Apple || || ||Tyler Hobbs ||Mar 2014 ||Datastax || || - ||Benedict Elliott Smith ||May 2014 ||Datastax || || + ||Benedict Elliott Smith ||May 2014 ||Vast || || ||Josh Mckenzie ||Jul 2014 ||Datastax || || ||Robert Stupp ||Jan 2015 ||Datastax || || ||Sam Tunnicliffe ||May 2015 ||Datastax || ||
[jira] [Commented] (CASSANDRA-11213) Improve ClusteringPrefix hierarchy
[ https://issues.apache.org/jira/browse/CASSANDRA-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159034#comment-15159034 ] Branimir Lambov commented on CASSANDRA-11213: - I did it slightly differently: range tombstones do make an explicit distinction between bound and boundary, so it isn't that valuable for them to have a shared bound class to use; it made more sense for me to isolate the bound concept from its uses and avoid conversion between the slice ends and the corresponding range markers: - Moved the bound concept outside of {{Slice}} and {{RangeTombstone}}. What used to be a {{Slice.Bound}} is now a {{ClusteringBound}}. - Made a {{ClusteringBoundary}} type and changed the markers to use bound/boundary directly. - Added a shared {{AbstactClusteringBound}} ancestor for the few bits of code that need to be able to work with both. - Had to name the types {{ClusteringX}} to avoid naming conflict between {{ClusteringBound}} and cql3 statements {{Bound}}. |[code|https://github.com/blambov/cassandra/tree/11213]|[utest|http://cassci.datastax.com/job/blambov-11213-testall/]|[dtest|http://cassci.datastax.com/job/blambov-11213-dtest/]| > Improve ClusteringPrefix hierarchy > -- > > Key: CASSANDRA-11213 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11213 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Branimir Lambov > Fix For: 3.x > > > As noted by [~blambov] on CASSANDRA-11158, having {{RangeTombstone.Bound}} be > a subclass of {{Slice.Bound}} is somewhat inconsistent. I'd argue in fact > that conceptually neither should really be a subclass of the other as none is > a special case of the other and they are use in strictly non-overlapping > places ({{Slice.Bound}} is for slices which are used for selecting data while > {{RangeTombstone.Bound}} is for range tombstone which actually represent some > type of data). 
> We should figure out a cleaner hierarchy of this, which probably mean > slightly changing the {{ClusteringPrefix}} hierarchy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11208) Paging is broken for IN queries
[ https://issues.apache.org/jira/browse/CASSANDRA-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11208: --- Attachment: 11083-2.2.txt {{AbstractQueryPager}} was not taking into account the fact that for tables with no clustering columns there is only one row per partition. When the next page was fetched, the pager believed that it still had to return some rows from the partition. ||utests||dtests|| |[3.0|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-3.0-testall/1/]|[3.0|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-3.0-dtest/1/]| |[trunk|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-trunk-testall/1/]|[trunk|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-trunk-dtest/1/]| The DTest PR is [here|https://github.com/riptano/cassandra-dtest/pull/820] > Paging is broken for IN queries > --- > > Key: CASSANDRA-11208 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11208 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Attachments: 11083-2.2.txt > > > If the number of selected rows is greater than the page size, C* will return > some duplicates. > The problem can be reproduced with the java driver using the following code: > {code} >session = cluster.connect(); >session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = > {'class' : 'SimpleStrategy', 'replication_factor' : '1'}"); >session.execute("USE test"); >session.execute("DROP TABLE IF EXISTS test"); >session.execute("CREATE TABLE test (rc int, pk int, PRIMARY KEY > (pk))"); >for (int i = 0; i < 5; i++) >session.execute("INSERT INTO test (pk, rc) VALUES (?, ?);", i, i); >ResultSet rs = session.execute(session.newSimpleStatement("SELECT * > FROM test WHERE pk IN (1, 2, 3)").setFetchSize(2)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
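The duplicate rows described above can be pictured with a deliberately simplified model. This is only a sketch, not the code in {{AbstractQueryPager}}: the class name, the helper {{keysForNextPage}} and its boolean parameter are hypothetical and exist purely for illustration. It assumes the pager resumes from the last partition it returned a row for, and shows why that partition must be skipped when the table has no clustering columns (one row per partition).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of resuming an IN query on the next page.
final class InQueryPagingSketch {
    // Returns the partition keys that still need to be read for the next page.
    // partitionMayHaveMoreRows is false for tables without clustering columns,
    // because such a partition holds exactly one row.
    static List<Integer> keysForNextPage(List<Integer> inKeys,
                                         int lastReturnedKey,
                                         boolean partitionMayHaveMoreRows) {
        int idx = inKeys.indexOf(lastReturnedKey);
        if (idx < 0)
            throw new IllegalArgumentException("key not part of the IN clause: " + lastReturnedKey);
        // Resume inside the last partition only if it can hold further rows.
        int from = partitionMayHaveMoreRows ? idx : idx + 1;
        return new ArrayList<>(inKeys.subList(from, inKeys.size()));
    }

    public static void main(String[] args) {
        List<Integer> inKeys = Arrays.asList(1, 2, 3);
        // Page 1 (fetch size 2) returned the rows for pk 1 and pk 2.
        // Treating pk 2 as unfinished re-reads it, which yields a duplicate.
        System.out.println(keysForNextPage(inKeys, 2, true));
        // With one row per partition, pk 2 is exhausted and only pk 3 remains.
        System.out.println(keysForNextPage(inKeys, 2, false));
    }
}
```

With the values from the reproduction ({{pk IN (1, 2, 3)}}, fetch size 2), the first call models the faulty behaviour and the second the expected one.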
[jira] [Commented] (CASSANDRA-11124) Change default cqlsh encoding to utf-8
[ https://issues.apache.org/jira/browse/CASSANDRA-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159011#comment-15159011 ] Paulo Motta commented on CASSANDRA-11124: - Thanks! > Change default cqlsh encoding to utf-8 > -- > > Key: CASSANDRA-11124 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11124 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Trivial > Labels: cqlsh > > Strange things can happen when utf-8 is not the default cqlsh encoding (see > CASSANDRA-11030). This ticket proposes changing the default cqlsh encoding to > utf-8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159003#comment-15159003 ] Marcus Eriksson commented on CASSANDRA-11215: - I'll try to write up a dtest for this, unless you want to do that [~molsson]? > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-11215: Reviewer: Marcus Eriksson > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11217) Only log yaml config once, at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158993#comment-15158993 ] Jason Brown commented on CASSANDRA-11217: - ||2.2||3.0||trunk|| |[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_2.2]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_3.x] |[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_2.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.x-testall/] |[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.x-dtest/] > Only log yaml config once, at startup > - > > Key: CASSANDRA-11217 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11217 > Project: Cassandra > Issue Type: Bug > Components: Configuration, Core >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Minor > > CASSANDRA-6456 introduced a feature where the yaml is dumped in the log. At > startup this is a nice feature, but I see that it’s actually triggered every > time it handshakes with a node and fails to connect and the node happens to > be a seed ([see > here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L435]). > Calling {{DD.getseeds()}} calls the {{SeedProvider}}, and if you happen to > use {{SimpleSeedProvider}} it will reload the yaml config, and once again > dump it out to the log. 
> It's debatable if {{DD.getseeds()}} should trigger a reload (which I added in > CASSANDRA-5459) or whether reloading the seeds should be a different method > (it probably should), but we shouldn't keep logging the yaml config on every > connection failure to a seed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11217) Only log yaml config once, at startup
Jason Brown created CASSANDRA-11217: --- Summary: Only log yaml config once, at startup Key: CASSANDRA-11217 URL: https://issues.apache.org/jira/browse/CASSANDRA-11217 Project: Cassandra Issue Type: Bug Components: Configuration, Core Reporter: Jason Brown Assignee: Jason Brown Priority: Minor CASSANDRA-6456 introduced a feature where the yaml is dumped in the log. At startup this is a nice feature, but I see that it’s actually triggered every time it handshakes with a node and fails to connect and the node happens to be a seed ([see here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L435]). Calling {{DD.getseeds()}} calls the {{SeedProvider}}, and if you happen to use {{SimpleSeedProvider}} it will reload the yaml config, and once again dump it out to the log. It's debatable if {{DD.getseeds()}} should trigger a reload (which I added in CASSANDRA-5459) or whether reloading the seeds should be a different method (it probably should), but we shouldn't keep logging the yaml config on every connection failure to a seed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
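One way to picture "log once, reload many times" is a guard flag around the logging call. The following is a minimal sketch under stated assumptions, not the patch on the linked branches: the class, the {{loadConfig}} method, the in-memory {{logLines}} list (a stand-in for a real logger) and the config string are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: reload the configuration freely, but dump it to the log only once.
final class LogConfigOnceSketch {
    private static final AtomicBoolean configLogged = new AtomicBoolean(false);
    // Stand-in for a logger so the behaviour can be observed.
    static final List<String> logLines = new ArrayList<>();

    // Pretends to re-read the yaml file; called on every seed lookup in this model.
    static String loadConfig() {
        String config = "seeds: 127.0.0.1"; // illustrative content only
        // compareAndSet succeeds for exactly one caller, so the dump happens once.
        if (configLogged.compareAndSet(false, true))
            logLines.add("Node configuration: [" + config + "]");
        return config;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++)
            loadConfig(); // e.g. three failed connections to a seed
        System.out.println("config dumps in log: " + logLines.size());
    }
}
```

The flag keeps the reload behaviour untouched, which matches the ticket's point that whether {{DD.getseeds()}} should reload at all is a separate question.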
[jira] [Comment Edited] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158981#comment-15158981 ] Marcus Olsson edited comment on CASSANDRA-11215 at 2/23/16 2:50 PM: Patch for 2.2 is available [here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab]. I think there will be some merge conflicts in 3.0/3.x. Should I provide separate patch sets for them directly, or wait for the review of the 2.2 version first? Edit: To make it clear what the fix is, the sstableCandidates are put in a try-with-resources block to make sure that they are released. I felt that this clarification might be needed since the patch also moves the SSTable referencing code to a separate method to reduce complexity in the doValidationCompaction method. was (Author: molsson): Patch for 2.2 is available [here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab]. I think there will be some merge conflicts in 3.0/3.x should I apply separate patch sets for them directly or wait for the review of the 2.2 version first? 
> Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
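The try-with-resources fix mentioned in the comment can be illustrated with a small stand-alone sketch. It is not the linked patch: {{TrackedRefs}}, {{validate}} and the exception message are hypothetical names chosen for illustration, and a simple counter stands in for the real SSTable reference counting. The point it shows is that references acquired before a rejected repair throws are still released.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: release acquired references even when validation is rejected.
final class RefReleaseSketch {
    // Counts references that were acquired but not yet released.
    static final AtomicInteger outstanding = new AtomicInteger();

    // Stand-in for a set of SSTable references; closing it releases them.
    static final class TrackedRefs implements AutoCloseable {
        TrackedRefs() { outstanding.incrementAndGet(); }
        @Override public void close() { outstanding.decrementAndGet(); }
    }

    static void validate(boolean conflictingRepairRunning) {
        // close() runs on normal exit and when the exception below propagates.
        try (TrackedRefs refs = new TrackedRefs()) {
            if (conflictingRepairRunning)
                throw new RuntimeException("conflicting repair session (illustrative message)");
            // ... validation work would use refs here ...
        }
    }

    public static void main(String[] args) {
        try {
            validate(true);
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("outstanding references: " + outstanding.get());
    }
}
```

Without the try-with-resources block, the exception path would skip the release and the counter would stay above zero, which is the kind of leak the Reference-Reaper reports.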
cassandra git commit: Add extension points in storage and streaming classes
Repository: cassandra Updated Branches: refs/heads/trunk 030c775ee -> 4b27287cd Add extension points in storage and streaming classes Patch by Blake Eggleston; reviewed by marcuse for CASSANDRA-11173 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4b27287c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4b27287c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4b27287c Branch: refs/heads/trunk Commit: 4b27287cd93088148d85d1a6ec9df34601f0c741 Parents: 030c775 Author: Blake Eggleston Authored: Tue Feb 16 15:06:00 2016 -0800 Committer: Marcus Eriksson Committed: Tue Feb 23 15:45:07 2016 +0100 -- .../apache/cassandra/db/ColumnFamilyStore.java | 1 + .../db/SinglePartitionReadCommand.java | 28 --- .../org/apache/cassandra/db/StorageHook.java| 86 .../apache/cassandra/streaming/StreamHook.java | 57 + .../cassandra/streaming/StreamReader.java | 4 +- .../cassandra/streaming/StreamSession.java | 3 +- .../cassandra/streaming/StreamTransferTask.java | 3 +- 7 files changed, 165 insertions(+), 17 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b27287c/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 9b113c4..fa95063 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -1217,6 +1217,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean DecoratedKey key = update.partitionKey(); invalidateCachedPartition(key); metric.samplers.get(Sampler.WRITES).addSample(key.getKey(), key.hashCode(), 1); +StorageHook.instance.reportWrite(metadata.cfId, update); metric.writeLatency.addNano(System.nanoTime() - start); if(timeDelta < Long.MAX_VALUE) metric.colUpdateTimeDeltaHistogram.update(timeDelta); 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b27287c/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java -- diff --git a/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java b/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java index 1a0b400..9712497 100644 --- a/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java +++ b/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java @@ -547,7 +547,7 @@ public class SinglePartitionReadCommand extends ReadCommand @SuppressWarnings("resource") // 'iter' is added to iterators which is closed on exception, // or through the closing of the final merged iterator -UnfilteredRowIteratorWithLowerBound iter = makeIterator(sstable, true); +UnfilteredRowIteratorWithLowerBound iter = makeIterator(cfs, sstable, true); if (!sstable.isRepaired()) oldestUnrepairedTombstone = Math.min(oldestUnrepairedTombstone, sstable.getMinLocalDeletionTime()); @@ -567,7 +567,7 @@ public class SinglePartitionReadCommand extends ReadCommand @SuppressWarnings("resource") // 'iter' is added to iterators which is close on exception, // or through the closing of the final merged iterator -UnfilteredRowIteratorWithLowerBound iter = makeIterator(sstable, false); +UnfilteredRowIteratorWithLowerBound iter = makeIterator(cfs, sstable, false); if (!sstable.isRepaired()) oldestUnrepairedTombstone = Math.min(oldestUnrepairedTombstone, sstable.getMinLocalDeletionTime()); @@ -582,6 +582,7 @@ public class SinglePartitionReadCommand extends ReadCommand if (iterators.isEmpty()) return EmptyIterators.unfilteredRow(cfs.metadata, partitionKey(), filter.isReversed()); +StorageHook.instance.reportRead(cfs.metadata.cfId, partitionKey()); return withStateTracking(withSSTablesIterated(iterators, cfs.metric)); } catch (RuntimeException | Error e) @@ -609,15 +610,17 @@ public class SinglePartitionReadCommand extends ReadCommand return clusteringIndexFilter().shouldInclude(sstable); } -private 
UnfilteredRowIteratorWithLowerBound makeIterator(final SSTableReader sstable, boolean applyThriftTransformation) +private UnfilteredRowIteratorWithLowerBound
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158981#comment-15158981 ] Marcus Olsson commented on CASSANDRA-11215: --- Patch for 2.2 is available [here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab]. I think there will be some merge conflicts in 3.0/3.x should I apply separate patch sets for them directly or wait for the review of the 2.2 version first? > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11216) Range.compareTo() violates the contract of Comparable
Jason Brown created CASSANDRA-11216: --- Summary: Range.compareTo() violates the contract of Comparable Key: CASSANDRA-11216 URL: https://issues.apache.org/jira/browse/CASSANDRA-11216 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jason Brown Assignee: Jason Brown Priority: Minor When running some quick-check style tests, I discovered that if both of the ranges being compared wrap around, then the result of the comparison depends on which range is evaluated first. For example, two ranges: A = { -1, 2 } B = { -2, 1 } and then compare them together: A.compareTo(B) == -1 B.compareTo(A) == -1 This is because the logic of the existing {{Range.compareTo()}} simply checks to see if the {{this}} range wraps around, and returns -1. This bug does not appear to affect c* until 3.0, and then only in one place ({{MerkleTrees.TokenRangeComparator#compare}}) that I could identify. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
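The asymmetry can be reproduced with a stripped-down model. This is a sketch, not the actual {{Range}} class: {{SimpleRange}} uses plain {{long}} tokens, treats a range as wrapping when left >= right (an assumption of this sketch), and uses illustrative endpoints of my own choosing rather than the A and B values written in the ticket. {{buggyCompareTo}} follows the behaviour the ticket describes; {{symmetricCompareTo}} is one possible consistent ordering, not necessarily the committed fix.

```java
// Hypothetical, simplified model of comparing wrap-around ranges.
final class RangeCompareSketch {
    static final class SimpleRange {
        final long left, right;
        SimpleRange(long left, long right) { this.left = left; this.right = right; }

        // Assumption of this sketch: a range wraps around when left >= right.
        boolean wraps() { return left >= right; }

        // Behaviour as described in the ticket: only 'this' is checked first.
        int buggyCompareTo(SimpleRange other) {
            if (wraps()) return -1;
            if (other.wraps()) return 1;
            return Long.compare(right, other.right);
        }

        // One consistent alternative: wrapping ranges sort first,
        // ranges of the same kind are ordered by their right bound.
        int symmetricCompareTo(SimpleRange other) {
            if (wraps() != other.wraps()) return wraps() ? -1 : 1;
            return Long.compare(right, other.right);
        }
    }

    public static void main(String[] args) {
        SimpleRange a = new SimpleRange(2, -1); // wraps
        SimpleRange b = new SimpleRange(1, -2); // wraps
        // Both results are negative, so sgn(a.compareTo(b)) != -sgn(b.compareTo(a)).
        System.out.println(a.buggyCompareTo(b) + " " + b.buggyCompareTo(a));
        // The symmetric variant returns opposite signs.
        System.out.println(a.symmetricCompareTo(b) + " " + b.symmetricCompareTo(a));
    }
}
```

The {{Comparable}} contract requires the sign of {{x.compareTo(y)}} to be the opposite of {{y.compareTo(x)}}; sorting code that relies on it can behave unpredictably when both calls report "less than".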
[jira] [Commented] (CASSANDRA-11216) Range.compareTo() violates the contract of Comparable
[ https://issues.apache.org/jira/browse/CASSANDRA-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158970#comment-15158970 ] Jason Brown commented on CASSANDRA-11216: - || 2.2 || 3.0 || trunk || |[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_2.2]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_3.x] |[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_2.2-testall]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.0-testall]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.x-testall] |[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_2.2-dtest]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.0-dtest]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.x-dtest] > Range.compareTo() violates the contract of Comparable > - > > Key: CASSANDRA-11216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11216 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Minor > > When running some quick-check style tests, I discovered that if both of the > ranges being compared wrap around, then the result of the comparison depends > on which range is evaluated first. For example, two ranges: > A = { -1, 2 } > B = { -2, 1 } > and then compare them together: > A.compareTo(B) == -1 > B.compareTo(A) == -1 > This is because the logic of the existing {{Range.compareTo()}} simply checks > to see if the {{this}} range wraps around, and returns -1. This bug does not > appear to affect c* until 3.0, and then only in one place > ({{MerkleTrees.TokenRangeComparator#compare}}) that I could identify. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling
[ https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158968#comment-15158968 ] Paulo Motta commented on CASSANDRA-10070: - bq. But in that case the pause/stop feature should be implemented as early as possible to avoid having an upgrade scenario that requires the user to upgrade to the version that introduces the pause feature before upgrading to the latest. Another way would be to have the "system interrupts" feature in place early, so that the repairs would be paused during an upgrade. Sounds good! We could ask the user to pause, but I think doing that automatically via "system interrupts" is better. It just occurred to me that both "the pause" and "system interrupts" will prevent new repairs from starting, but what about already running repairs? We will probably want to interrupt already running repairs as well in some situations. For this reason CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of this ticket). bq. I think the timeout might be good to have to prevent a hang from stopping the entire repair process. But I think it would only work if the repair would only hang occasionally, otherwise the same repair would be retried until it is marked as a "fail". +1. Then I think we should either have a timeout, or add the ability to cancel/interrupt a running scheduled repair in the initial version, so that hanging repairs do not render the automatic repair scheduling useless. bq. Another option is to have a "slow repair"-detector that would log a warning if a repair session is taking too long time, to avoid aborting it if it's actually repairing and leaving it up to the user to handle it. Either way I'd say it's out of the scope of the initial version. bq. We might also want to be able to detect if it would be impossible to repair the whole cluster within gc grace and report it to the user. 
This could happen for multiple reasons like too many tables, too many nodes, too few parallel repairs or simply overload. I guess it would be hard to make accurate predictions with all of these variables so it might be good enough to check through the history of the repairs, do an estimation of the time and compare it to gc grace? I think this is something out of scope for the first version, but I thought I'd just mention it here to remember it. Nice! These could probably live in a separate repair metrics and alert module in the future, allowing users to track statistics, issue alerts/warnings based on history and allow the scheduler to perform more advanced adaptive scheduling. Some metrics to track: * Repair time per session ** Break up of time per phase (validation, sync, anticompaction, etc) * Repair time per node * Validation mismatch % * Fail count bq. Should we maybe compile a list of "features that should be in the initial version" and also a "improvements" list for future work to make the scope clear? Sounds good! Below is a suggested list of subtasks: * Basic functionality ** Resource locking API and implementation ** Maintenance scheduling API and metadata ** Basic scheduling support ** Polling and monitoring module ** Pausing and aborting support ** Rejection policies (includes system interrupts and maintenance windows) ** Failure handling and retry ** Configuration support ** Frontend support (table options, management commands) * Optional/deferred functionality ** Parallel repair session support ** Subrange repair support ** Maintenance history ** Timeout ** Metrics ** Alerts WDYT? Feel free to update or break-up into smaller or larger subtasks, and then create the actual subtasks to start work on them. 
> Automatic repair scheduling > --- > > Key: CASSANDRA-10070 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10070 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > Fix For: 3.x > > Attachments: Distributed Repair Scheduling.doc > > > Scheduling and running repairs in a Cassandra cluster is most often a > required task, but this can both be hard for new users and it also requires a > bit of manual configuration. There are good tools out there that can be used > to simplify things, but wouldn't this be a good feature to have inside of > Cassandra? To automatically schedule and run repairs, so that when you start > up your cluster it basically maintains itself in terms of normal > anti-entropy, with the possibility for manual configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
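The idea quoted above, estimating from repair history whether a full cycle fits within gc grace, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class and method names, the use of a plain average over past session durations, and the example numbers. Only the 10-day figure corresponds to Cassandra's default {{gc_grace_seconds}} of 864000.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: compare an estimated full repair cycle against gc grace.
final class RepairFeasibilitySketch {
    // Estimates one full cycle as (average past session time) x (rounds needed),
    // where parallel sessions run side by side within a round.
    static Duration estimateFullCycle(List<Duration> pastSessions, int sessionsPerCycle, int parallelism) {
        if (pastSessions.isEmpty() || sessionsPerCycle <= 0 || parallelism <= 0)
            throw new IllegalArgumentException("need history and positive counts");
        long totalMillis = 0;
        for (Duration d : pastSessions)
            totalMillis += d.toMillis();
        long avgMillis = totalMillis / pastSessions.size();
        long rounds = (sessionsPerCycle + parallelism - 1) / parallelism; // ceiling division
        return Duration.ofMillis(avgMillis * rounds);
    }

    static boolean fitsWithinGcGrace(Duration estimate, Duration gcGrace) {
        return estimate.compareTo(gcGrace) < 0;
    }

    public static void main(String[] args) {
        List<Duration> history = Arrays.asList(
                Duration.ofMinutes(20), Duration.ofMinutes(30), Duration.ofMinutes(40));
        Duration gcGrace = Duration.ofSeconds(864000); // 10 days
        // 1000 sessions, one at a time: 1000 x 30 min, which is more than 10 days.
        System.out.println(fitsWithinGcGrace(estimateFullCycle(history, 1000, 1), gcGrace));
        // Four parallel sessions: 250 rounds x 30 min, which is less than 10 days.
        System.out.println(fitsWithinGcGrace(estimateFullCycle(history, 1000, 4), gcGrace));
    }
}
```

A real estimator would need to account for the variables the comment lists (number of tables and nodes, load), so a simple average is at best a first warning signal.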
[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
[ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158958#comment-15158958 ] Marcus Olsson commented on CASSANDRA-11215: --- It seems that this issue is caused by not accepting parallel repairs on the same sstables anymore, where it throws a RuntimeException if that happens and fails to release the previously acquired references. I'm currently working on a patch for this. > Reference leak with parallel repairs on the same table > -- > > Key: CASSANDRA-11215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Olsson >Assignee: Marcus Olsson > > When starting multiple repairs on the same table Cassandra starts to log > about reference leak as: > {noformat} > ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class > org.apache.cassandra.io.sstable.format.SSTableReader > $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big > was not released before the reference was garbage collected > {noformat} > Reproducible with: > {noformat} > ccm create repairtest -v 2.2.5 -n 3 > ccm start > ccm stress write n=100 -schema > replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair > # And then perform two repairs concurrently with: > ccm node1 nodetool repair testrepair > {noformat} > I know that starting multiple repairs in parallel on the same table isn't > very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
Marcus Olsson created CASSANDRA-11215: - Summary: Reference leak with parallel repairs on the same table Key: CASSANDRA-11215 URL: https://issues.apache.org/jira/browse/CASSANDRA-11215 Project: Cassandra Issue Type: Bug Reporter: Marcus Olsson Assignee: Marcus Olsson When starting multiple repairs on the same table Cassandra starts to log about reference leak as: {noformat} ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class org.apache.cassandra.io.sstable.format.SSTableReader $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big was not released before the reference was garbage collected {noformat} Reproducible with: {noformat} ccm create repairtest -v 2.2.5 -n 3 ccm start ccm stress write n=100 -schema replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair # And then perform two repairs concurrently with: ccm node1 nodetool repair testrepair {noformat} I know that starting multiple repairs in parallel on the same table isn't very wise, but this shouldn't result in reference leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference
[ https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158945#comment-15158945 ] Marcus Eriksson commented on CASSANDRA-11209: - I've been trying to reproduce this today with no luck (using STCS) [~jrfernandez] can you reproduce in a testing environment? Could you try with one of the built-in compaction strategies if so? > SSTable ancestor leaked reference > - > > Key: CASSANDRA-11209 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11209 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Jose Fernandez > Attachments: screenshot-1.png > > > We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy > from [~jjirsa]. We've been running 4 clusters without any issues for many > months until a few weeks ago we started scheduling incremental repairs every > 24 hours (previously we didn't run any repairs at all). > Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, > TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought > back in sync by restarting the node. We also noticed that when this bug > happens there are several ancestors that don't get cleaned up. A restart will > queue up a lot of compactions that slowly eat away the ancestors. > I looked at the code and noticed that we only decrease the LiveTotalDiskUsed > metric in the SSTableDeletingTask. Since we have no errors being logged, I'm > assuming that for some reason this task is not getting queued up. If I > understand correctly this only happens when the reference count for the > SStable reaches 0. So this is leading us to believe that something during > repairs and/or compactions is causing a reference leak to the ancestor table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11203) nodetool repair not performing repair or being incorrectly triggered in 3.0.3
[ https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157138#comment-15157138 ] Jason Kania edited comment on CASSANDRA-11203 at 2/23/16 2:07 PM: -- Is it possible to change the output text to reflect this insider knowledge? Also, if no repair is required, then it can be changed to indicate this. The messaging right now is confusing. i.e. "Replication factor is 1; nothing to repair for keyspace 'sensordb'" was (Author: longtimer): Is it possible to change the output text to reflect this insider knowledge? i.e. "Replication factor is 1; nothing to repair for keyspace 'sensordb'" > nodetool repair not performing repair or being incorrectly triggered in 3.0.3 > - > > Key: CASSANDRA-11203 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11203 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: debian jesse up to date content >Reporter: Jason Kania > > When nodetool repair is run, it indicates that no repair is needed on some > keyspaces but on others it attempts repair. However, when run multiple times, > the output seems to indicate that the same triggering conditions that > indicate a problem still persist. Alternatively, the output could indicate > that the underlying condition has not been resolved. 
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:12,144] Repair completed successfully > [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second > root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair > [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb' > [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth' > [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace > system_traces with repair options (parallelism: parallel, primary range: > false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: > [], hosts: [], # of ranges: 256) > [2016-02-21 23:33:33,324] Repair completed successfully > [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158897#comment-15158897 ] Joshua McKenzie commented on CASSANDRA-8844: bq. Does this mean if RF was say, three, that three CDC commit logs would be written to across the cluster (compared to say, one write at the coordinator)? That really was rather poorly phrased initially. I was originally trying to convey that DDL logic would be similar to RF on a KS but even that's not set in stone. I've pulled that from the design doc as the way it currently reads is redundant (anywhere data's written, as per replication strategy, will by definition have CDC). As for the de-duplication that will need to be done client-side. Whether or not we have a reference implementation for that now (as we will for the CDCConsumerDaemon) is currently up in the air. > Change Data Capture (CDC) > - > > Key: CASSANDRA-8844 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8844 > Project: Cassandra > Issue Type: New Feature > Components: Coordination, Local Write-Read Paths >Reporter: Tupshin Harper >Assignee: Joshua McKenzie >Priority: Critical > Fix For: 3.x > > > "In databases, change data capture (CDC) is a set of software design patterns > used to determine (and track) the data that has changed so that action can be > taken using the changed data. Also, Change data capture (CDC) is an approach > to data integration that is based on the identification, capture and delivery > of the changes made to enterprise data sources." > -Wikipedia > As Cassandra is increasingly being used as the Source of Record (SoR) for > mission critical data in large enterprises, it is increasingly being called > upon to act as the central hub of traffic and data flow to other systems. In > order to try to address the general need, we (cc [~brianmhess]), propose > implementing a simple data logging mechanism to enable per-table CDC patterns. > h2. 
The goals: > # Use CQL as the primary ingestion mechanism, in order to leverage its > Consistency Level semantics, and in order to treat it as the single > reliable/durable SoR for the data. > # To provide a mechanism for implementing good and reliable > (deliver-at-least-once with possible mechanisms for deliver-exactly-once) > continuous semi-realtime feeds of mutations going into a Cassandra cluster. > # To eliminate the developmental and operational burden of users so that they > don't have to do dual writes to other systems. > # For users that are currently doing batch export from a Cassandra system, > give them the opportunity to make that realtime with a minimum of coding. > h2. The mechanism: > We propose a durable logging mechanism that functions similarly to a commitlog, > with the following nuances: > - Takes place on every node, not just the coordinator, so RF number of copies > are logged. > - Separate log per table. > - Per-table configuration. Only tables that are specified as CDC_LOG would do > any logging. > - Per DC. We are trying to keep the complexity to a minimum to make this an > easy enhancement, but most likely use cases would prefer to only implement > CDC logging in one (or a subset) of the DCs that are being replicated to > - In the critical path of ConsistencyLevel acknowledgment. Just as with the > commitlog, failure to write to the CDC log should fail that node's write. If > that means the requested consistency level was not met, then clients *should* > experience UnavailableExceptions. > - Be written in a Row-centric manner such that it is easy for consumers to > reconstitute rows atomically. > - Written in a simple format designed to be consumed *directly* by daemons > written in non-JVM languages > h2. Nice-to-haves > I strongly suspect that the following features will be asked for, but I also > believe that they can be deferred for a subsequent release, and to gauge > actual interest. > - Multiple logs per table. 
This would make it easy to have multiple > "subscribers" to a single table's changes. A workaround would be to create a > forking daemon listener, but that's not a great answer. > - Log filtering. Being able to apply filters, including UDF-based filters > would make Cassandra a much more versatile feeder into other systems, and > again, reduce complexity that would otherwise need to be built into the > daemons. > h2. Format and Consumption > - Cassandra would only write to the CDC log, and never delete from it. > - Cleaning up consumed logfiles would be the client daemon's responsibility > - Logfile size should probably be configurable. > - Logfiles should be named with a predictable naming schema, making it > trivial to process them in order. > - Daemons should be able to checkpoint their work, and resume from
[jira] [Commented] (CASSANDRA-10445) Cassandra-stress throws max frame size error when SSL certification is enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158895#comment-15158895 ] Vara commented on CASSANDRA-10445: -- It worked fine this time Cornel. Thanks for resolving the issue. > Cassandra-stress throws max frame size error when SSL certification is enabled > -- > > Key: CASSANDRA-10445 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10445 > Project: Cassandra > Issue Type: Bug >Reporter: Sam Goldberg > Labels: stress > Fix For: 2.1.x > > > Running cassandra-stress when SSL is enabled gives the following error and > does not finish executing: > {quote} > cassandra-stress write n=100 > Exception in thread "main" java.lang.RuntimeException: > org.apache.thrift.transport.TTransportException: Frame size (352518912) > larger than max length (15728640)! > at > org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:144) > at > org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:110) > at > org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesThrift(SettingsSchema.java:111) > at > org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:59) > at > org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:205) > at org.apache.cassandra.stress.StressAction.run(StressAction.java:55) > at org.apache.cassandra.stress.Stress.main(Stress.java:109) > {quote} > I was able to reproduce this issue consistently via the following steps: > 1) Spin up 3 node cassandra cluster running 2.1.8 > 2) Perform cassandra-stress write n=100 > 3) Everything works! 
> 4) Generate keystore and truststore for each node in the cluster and > distribute appropriately > 5) Modify cassandra.yaml on each node to enable SSL: > client_encryption_options: > enabled: true > keystore: / > # require_client_auth: false > # Set trustore and truststore_password if require_client_auth is true > truststore: / > truststore_password: > # More advanced defaults below: > protocol: ssl > 6) Restart each node. > 7) Perform cassandra-stress write n=100 > 8) Get Frame Size error, cassandra-stress fails > This may be related to CASSANDRA-9325. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
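The frame size in the exception above is worth decoding. Read as a 4-byte big-endian length prefix, 352518912 is `0x15030300`, and the leading bytes `15 03 03` match the start of a TLS 1.2 alert record (content type 21, protocol version 3.3). One plausible reading, offered here as an assumption rather than a confirmed diagnosis, is that the Thrift client in cassandra-stress opened a plaintext connection to the now SSL-enabled port and interpreted the server's TLS reply as a frame length. The arithmetic can be checked as follows:

```python
frame_size = 352518912  # value from the TTransportException above

raw = frame_size.to_bytes(4, "big")  # how a 4-byte length prefix appears on the wire
print(raw.hex())  # 15030300

content_type = raw[0]          # 0x15 = 21, the TLS "alert" record type
version = (raw[1], raw[2])     # (3, 3) is the wire encoding of TLS 1.2
print(content_type, version)   # 21 (3, 3)
```

If that reading is right, the limit of 15728640 bytes (15 MiB) is not the real problem, and raising it would not help; the client would need to be told to use SSL.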