[jira] [Resolved] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.

2016-02-23 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-11222.
--
Resolution: Not A Problem

This is really for the datanucleus project to fix so closing.

> datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
> --
>
> Key: CASSANDRA-11222
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11222
> Project: Cassandra
>  Issue Type: Wish
>  Components: CQL
> Environment: Java JDO
>Reporter: Rafael Sanches
>Priority: Minor
>  Labels: newbie
> Fix For: 3.0.x
>
>
> Hi, 
> I'm starting a new project and was hoping to upgrade directly to cassandra 
> 3.0, so it would save us a migration from 2.2 later on. 
> Unfortunately, datanucleus-cassandra-5.0.0-m1 (the latest) doesn't support the 
> 3.0 data model. 
> Errors like these will appear because of JDO:
> https://issues.apache.org/jira/browse/CASSANDRA-10996
> To be more specific, this class does things like:
> StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM 
> system.schema_keyspaces WHERE keyspace_name=?;");
> https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java
> It doesn't seem like the Datanucleus guys are looking to fix this, since the 
> last update on datanucleus-cassandra was in 2014. I will open an issue there 
> too. Hope we can reach contributors from both places. 
> I guess opening an issue here is more of a "heads up", because more developers 
> will waste time on this soon.
> thanks
> rafa



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.

2016-02-23 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-11222:
-
Priority: Minor  (was: Blocker)

> datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.
> --
>
> Key: CASSANDRA-11222
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11222
> Project: Cassandra
>  Issue Type: Wish
>  Components: CQL
> Environment: Java JDO
>Reporter: Rafael Sanches
>Priority: Minor
>  Labels: newbie
> Fix For: 3.0.x
>
>
> Hi, 
> I'm starting a new project and was hoping to upgrade directly to cassandra 
> 3.0, so it would save us a migration from 2.2 later on. 
> Unfortunately, datanucleus-cassandra-5.0.0-m1 (the latest) doesn't support the 
> 3.0 data model. 
> Errors like these will appear because of JDO:
> https://issues.apache.org/jira/browse/CASSANDRA-10996
> To be more specific, this class does things like:
> StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM 
> system.schema_keyspaces WHERE keyspace_name=?;");
> https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java
> It doesn't seem like the Datanucleus guys are looking to fix this, since the 
> last update on datanucleus-cassandra was in 2014. I will open an issue there 
> too. Hope we can reach contributors from both places. 
> I guess opening an issue here is more of a "heads up", because more developers 
> will waste time on this soon.
> thanks
> rafa



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160323#comment-15160323
 ] 

Jeff Jirsa edited comment on CASSANDRA-11209 at 2/24/16 7:44 AM:
-

Similar to CASSANDRA-10510 as well ?


was (Author: jjirsa):
Similar to CASSANDRA-10510 as well 

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
>Assignee: Marcus Eriksson
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away at the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SSTable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160323#comment-15160323
 ] 

Jeff Jirsa commented on CASSANDRA-11209:


Similar to CASSANDRA-10510 as well 

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
>Assignee: Marcus Eriksson
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away at the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SSTable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2016-02-23 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9303:

Labels: doc-impacting  (was: )

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
>  Labels: doc-impacting
> Fix For: 2.1.13, 2.2.5, 3.0.3, 3.2
>
> Attachments: dtest.out
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reopened CASSANDRA-11209:
-
  Assignee: Marcus Eriksson

or maybe not... need more tests

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
>Assignee: Marcus Eriksson
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away at the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SSTable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2016-02-23 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9043:

Labels: doc-impacting lhf  (was: lhf)

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: doc-impacting, lhf
> Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1
>
> Attachments: CASSANDRA-9043-2.1.8.patch, CASSANDRA-9043-trunk.patch
>
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?
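
To illustrate the INSERT vs UPDATE difference the ticket refers to, a rough sketch 
with made-up table and column names, assuming a com.datastax.driver.core.Session:

    // A regular column can be loaded with INSERT ...
    session.execute("INSERT INTO ks.events (id, hits) VALUES (?, ?)", id, hits);

    // ... but a counter column can only be changed via UPDATE, so a counter-aware
    // COPY FROM has to generate increments of this shape instead:
    session.execute("UPDATE ks.event_counts SET hits = hits + ? WHERE id = ?", delta, id);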



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-11209.
-
Resolution: Duplicate

yep, this is a duplicate of CASSANDRA-11215 - we can leak references if we 
throw exceptions in doValidationCompaction
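
The general shape of that kind of leak, as an illustrative sketch only (hypothetical 
names, not the actual doValidationCompaction code): if the release is not done in a 
finally block, an exception thrown mid-validation leaves the reference count above 
zero, so the SSTable's deletion task never runs and its ancestors are never cleaned up.

    Ref<SSTableReader> ref = sstable.tryRef();   // reference count goes up
    try
    {
        validate(sstable);                       // may throw during validation compaction
    }
    finally
    {
        ref.release();                           // skipping this on exception leaks the ref
    }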

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away at the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SSTable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11124) Change default cqlsh encoding to utf-8

2016-02-23 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156310#comment-15156310
 ] 

Stefania edited comment on CASSANDRA-11124 at 2/24/16 3:06 AM:
---

Looking good, +1. 

dtests on 2.2 are not working at the moment and there are 2 cqlsh failures on 
trunk, but neither is related to this ticket.


was (Author: stefania):
Looking good, +1. 

dtests on 2.2. are not working at the moment and there are 2 cqlsh failures on 
trunk but neither are not related to this ticket.

> Change default cqlsh encoding to utf-8
> --
>
> Key: CASSANDRA-11124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11124
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Trivial
>  Labels: cqlsh
>
> Strange things can happen when utf-8 is not the default cqlsh encoding (see 
> CASSANDRA-11030). This ticket proposes changing the default cqlsh encoding to 
> utf-8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11212) cqlsh python version checking is out of date

2016-02-23 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160049#comment-15160049
 ] 

Stefania commented on CASSANDRA-11212:
--

Latest round of dtests was good, this can be committed.

> cqlsh python version checking is out of date
> 
>
> Key: CASSANDRA-11212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11212
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> cqlsh.py has python version checking code at the top, but it still says 
> python 2.5 is a valid version, which we then error out on a few lines down in 
> the file.  We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11222) datanucleus-cassandra won't work with cassandra 3.0 system.* metadata.

2016-02-23 Thread Rafael Sanches (JIRA)
Rafael Sanches created CASSANDRA-11222:
--

 Summary: datanucleus-cassandra won't work with cassandra 3.0 
system.* metadata.
 Key: CASSANDRA-11222
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11222
 Project: Cassandra
  Issue Type: Wish
  Components: CQL
 Environment: Java JDO
Reporter: Rafael Sanches
Priority: Blocker
 Fix For: 3.0.x


Hi, 

I'm starting a new project and was hoping to upgrade directly to cassandra 3.0, 
so it would save us a migration from 2.2 later on. 

Unfortunately, datanucleus-cassandra-5.0.0-m1 (the latest) doesn't support the 
3.0 data model. 

Errors like these will appear because of JDO:
https://issues.apache.org/jira/browse/CASSANDRA-10996

To be more specific, this class does things like:
StringBuilder stmtBuilder = new StringBuilder("SELECT keyspace_name FROM 
system.schema_keyspaces WHERE keyspace_name=?;");

https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java
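
For context, Cassandra 3.0 moved the schema metadata tables out of the system 
keyspace (system.schema_keyspaces, system.schema_columnfamilies, ...) and into the 
system_schema keyspace, which is why queries like the one above break. A minimal 
sketch of the equivalent 3.0 lookup, assuming a com.datastax.driver.core.Session 
(this is not the actual DataNucleus code):

    // Sketch only: query the 3.0 schema table instead of system.schema_keyspaces
    boolean keyspaceExists(Session session, String keyspaceName)
    {
        ResultSet rs = session.execute(
            "SELECT keyspace_name FROM system_schema.keyspaces WHERE keyspace_name = ?",
            keyspaceName);
        return rs.one() != null;
    }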

It doesn't seem like the Datanucleus guys are looking to fix this, since the 
last update on datanucleus-cassandra was in 2014. I will open an issue there 
too. Hope we can reach contributors from both places. 

I guess opening an issue here is more of a "heads up", because more developers 
will waste time on this soon.

thanks
rafa



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11164) Order and filter cipher suites correctly

2016-02-23 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160023#comment-15160023
 ] 

Stefania commented on CASSANDRA-11164:
--

I understood that the ordering would be dealt with by CASSANDRA-10508 but if we 
want to fix it here then that's probably better since this can then go into 2.2 
and CASSANDRA-10508 be limited to 3.x only. 

{{testServerSocketCiphers}} is failing locally on my machine because the 256 
cipher_suites are not returned by {{socket.getEnabledCipherSuites()}}, so I 
think we should remove it from this patch? Incidentally it also doesn't need 
{{UnknownHostException}} in the throws declaration since {{IOException}} is 
more generic.

{{TestTupleType}} has failed on jenkins but it is passing locally and the 
failure doesn't seem related. 

Let's rebase, squash the two commits and repeat the cassci tests on *2.2, 3.0* 
and *trunk*. If that's clear then we are good to go. If {{TestTupleType}} is 
still failing then my best guess is that for some reason we've uncovered an 
existing problem in {{CQLTester}} and we'll deal with it.

We'll also need to add a line to CHANGES.txt.  

> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom Petracca
>Assignee: Stefan Podkowinski
>Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, 
> 11164-2.2_2_call_filterCipherSuites_everywhere.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, 
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired 
> ciphers in cassandra.yaml.
> Also the fix that occurred for 
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs 
> to be applied to all locations where we create an SSLSocket so that JCE is 
> not required out of the box or with additional configuration.
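
The ordering issue can be sketched as follows (illustrative only, not the actual 
SSLFactory code): iterating over the desired list, rather than over the array of 
supported suites, is what preserves the cassandra.yaml order.

    // java.util imports (Arrays, HashSet, Set, List, ArrayList) assumed
    static String[] filterCipherSuites(String[] supported, String[] desired)
    {
        Set<String> supportedSet = new HashSet<>(Arrays.asList(supported));
        List<String> kept = new ArrayList<>();
        for (String cipher : desired)            // configured (yaml) order drives the result
            if (supportedSet.contains(cipher))   // drop anything the socket can't offer
                kept.add(cipher);
        return kept.toArray(new String[0]);
    }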



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11212) cqlsh python version checking is out of date

2016-02-23 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159972#comment-15159972
 ] 

Stefania commented on CASSANDRA-11212:
--

It seems all dtest jobs failed for infrastructure reasons, trying again with 
the 2.2 patch and a different cassci job:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11212-2.2-dtest/

> cqlsh python version checking is out of date
> 
>
> Key: CASSANDRA-11212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11212
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jeremiah Jordan
>Assignee: Jeremiah Jordan
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> cqlsh.py has python version checking code at the top, but it still says 
> python 2.5 is a valid version, which we then error out on a few lines down in 
> the file.  We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps

2016-02-23 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159970#comment-15159970
 ] 

Philip Thompson commented on CASSANDRA-11221:
-

I know what's wrong. I'll take care of it. I didn't realize these were failing 
again.

> replication_test.ReplicationTest.network_topology_test flaps
> 
>
> Key: CASSANDRA-11221
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11221
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Philip Thompson
>  Labels: dtest
>
> Test intermittently failing with set comparison errors that differ from one 
> failure to the next. Looks a bit more stable recently since #203 failed, but 
> probably worth keeping an eye on, and check if there's a problem with the 
> test code.
> most recent failure:
> http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps

2016-02-23 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159965#comment-15159965
 ] 

Russ Hatch commented on CASSANDRA-11221:


errors are typically some variant of a set comparison failure, like most recently 
"Items in the second set but not the first:
u'127.0.0.4'"

> replication_test.ReplicationTest.network_topology_test flaps
> 
>
> Key: CASSANDRA-11221
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11221
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: DS Test Eng
>  Labels: dtest
>
> Test intermittently failing with set comparison errors that differ from one 
> failure to the next. Looks a bit more stable recently since #203 failed, but 
> probably worth keeping an eye on, and check if there's a problem with the 
> test code.
> most recent failure:
> http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11221) replication_test.ReplicationTest.network_topology_test flaps

2016-02-23 Thread Russ Hatch (JIRA)
Russ Hatch created CASSANDRA-11221:
--

 Summary: replication_test.ReplicationTest.network_topology_test 
flaps
 Key: CASSANDRA-11221
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11221
 Project: Cassandra
  Issue Type: Test
Reporter: Russ Hatch
Assignee: DS Test Eng


Test intermittently failing with set comparison errors that differ from one 
failure to the next. Looks a bit more stable recently since #203 failed, but 
probably worth keeping an eye on, and check if there's a problem with the test 
code.

most recent failure:
http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/203/testReport/replication_test/ReplicationTest/network_topology_test/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

2016-02-23 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159795#comment-15159795
 ] 

Jason Brown commented on CASSANDRA-10371:
-

Committed to 2.1, 2.2, 3.0, and trunk, as sha 
7877d6f85f1a84d9f9de4d81339730d9df3667a1

> Decommissioned nodes can remain in gossip
> -
>
> Key: CASSANDRA-10371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Brandon Williams
>Assignee: Joel Knighton
>Priority: Minor
> Fix For: 2.1.14, 2.2.6, 3.0.4, 3.4
>
>
> This may apply to other dead states as well.  Dead states should be expired 
> after 3 days.  In the case of decom we attach a timestamp to let the other 
> nodes know when it should be expired.  It has been observed that sometimes a 
> subset of nodes in the cluster never expire the state, and through heap 
> analysis of these nodes it is revealed that the epstate.isAlive check returns 
> true when it should return false, which would allow the state to be evicted.  
> This may have been affected by CASSANDRA-8336.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[09/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2016-02-23 Thread jasobrown
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c4bd6d25
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c4bd6d25
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c4bd6d25

Branch: refs/heads/trunk
Commit: c4bd6d2549bd794b81cf0c9a9ac73b97c5a0b686
Parents: e9abaab 77ff794
Author: Jason Brown 
Authored: Tue Feb 23 14:36:15 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:37:56 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/CHANGES.txt
--
diff --cc CHANGES.txt
index cd2a930,e989e7f..9ca2f80
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -34,7 -17,9 +34,8 @@@ Merged from 2.2
   * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 (CASSANDRA-10010)
  Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
 - * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)
   * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714)
   * Fix incorrect warning in 'nodetool status' (CASSANDRA-10176)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/src/java/org/apache/cassandra/gms/Gossiper.java
--



[06/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2016-02-23 Thread jasobrown
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947

Branch: refs/heads/cassandra-2.2
Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7
Parents: 5009594 7877d6f
Author: Jason Brown 
Authored: Tue Feb 23 14:31:38 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:35:03 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt
--
diff --cc CHANGES.txt
index 01e7b3d,82ee99e..e989e7f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -2.1.14
 +2.2.6
 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
 + * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
 + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
 + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)
 + * Protect from keyspace dropped during repair (CASSANDRA-11065)
 + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146)
 + * Better error message for cleanup (CASSANDRA-10991)
 + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123)
 + * Use cloned TokenMetadata in size estimates to avoid race against 
membership check
 +   (CASSANDRA-10736)
 + * Always persist upsampled index summaries (CASSANDRA-10512)
 + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733)
 + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048)
 + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order 
(CASSANDRA-7281)
 + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030)
 + * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 +   (CASSANDRA-10010)
 +Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
   * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java
--



[03/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint

2016-02-23 Thread jasobrown
Don't remove FailureDetector history on removeEndpoint

patch by jkni, reviewed by jasobrown for CASSANDRA-10371


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8

Branch: refs/heads/cassandra-3.0
Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1
Parents: 67637d1
Author: Joel Knighton 
Authored: Fri Feb 19 15:19:33 2016 -0600
Committer: Jason Brown 
Committed: Tue Feb 23 14:30:28 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 52bdcce..82ee99e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.14
+ * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
  * Only notify if repair status changed (CASSANDRA-11172)
  * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
  * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java
--
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java 
b/src/java/org/apache/cassandra/gms/Gossiper.java
index ae99829..889806c 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -386,6 +386,7 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 unreachableEndpoints.remove(endpoint);
 endpointStateMap.remove(endpoint);
 expireTimeEndpointMap.remove(endpoint);
+FailureDetector.instance.remove(endpoint);
 quarantineEndpoint(endpoint);
 if (logger.isDebugEnabled())
 logger.debug("evicting {} from gossip", endpoint);
@@ -409,8 +410,6 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 
 liveEndpoints.remove(endpoint);
 unreachableEndpoints.remove(endpoint);
-// do not remove endpointState until the quarantine expires
-FailureDetector.instance.remove(endpoint);
 MessagingService.instance().resetVersion(endpoint);
 quarantineEndpoint(endpoint);
 MessagingService.instance().destroyConnectionPool(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java 
b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
new file mode 100644
index 000..9325922
--- /dev/null
+++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.Util;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.RandomPartitioner;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.locator.TokenMetadata;
+import org.apache.cassandra.service.StorageService;
+
+import static org.junit.Assert.assertFalse;
+
+public class FailureDetectorTest
+{
+@BeforeClass
+public static 

[07/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2016-02-23 Thread jasobrown
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947

Branch: refs/heads/trunk
Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7
Parents: 5009594 7877d6f
Author: Jason Brown 
Authored: Tue Feb 23 14:31:38 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:35:03 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt
--
diff --cc CHANGES.txt
index 01e7b3d,82ee99e..e989e7f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -2.1.14
 +2.2.6
 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
 + * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
 + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
 + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)
 + * Protect from keyspace dropped during repair (CASSANDRA-11065)
 + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146)
 + * Better error message for cleanup (CASSANDRA-10991)
 + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123)
 + * Use cloned TokenMetadata in size estimates to avoid race against 
membership check
 +   (CASSANDRA-10736)
 + * Always persist upsampled index summaries (CASSANDRA-10512)
 + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733)
 + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048)
 + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order 
(CASSANDRA-7281)
 + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030)
 + * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 +   (CASSANDRA-10010)
 +Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
   * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java
--



[04/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint

2016-02-23 Thread jasobrown
Don't remove FailureDetector history on removeEndpoint

patch by jkni, reviewed by jasobrown for CASSANDRA-10371


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8

Branch: refs/heads/trunk
Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1
Parents: 67637d1
Author: Joel Knighton 
Authored: Fri Feb 19 15:19:33 2016 -0600
Committer: Jason Brown 
Committed: Tue Feb 23 14:30:28 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 52bdcce..82ee99e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.14
+ * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
  * Only notify if repair status changed (CASSANDRA-11172)
  * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
  * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java
--
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java 
b/src/java/org/apache/cassandra/gms/Gossiper.java
index ae99829..889806c 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -386,6 +386,7 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 unreachableEndpoints.remove(endpoint);
 endpointStateMap.remove(endpoint);
 expireTimeEndpointMap.remove(endpoint);
+FailureDetector.instance.remove(endpoint);
 quarantineEndpoint(endpoint);
 if (logger.isDebugEnabled())
 logger.debug("evicting {} from gossip", endpoint);
@@ -409,8 +410,6 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 
 liveEndpoints.remove(endpoint);
 unreachableEndpoints.remove(endpoint);
-// do not remove endpointState until the quarantine expires
-FailureDetector.instance.remove(endpoint);
 MessagingService.instance().resetVersion(endpoint);
 quarantineEndpoint(endpoint);
 MessagingService.instance().destroyConnectionPool(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java 
b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
new file mode 100644
index 000..9325922
--- /dev/null
+++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.Util;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.RandomPartitioner;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.locator.TokenMetadata;
+import org.apache.cassandra.service.StorageService;
+
+import static org.junit.Assert.assertFalse;
+
+public class FailureDetectorTest
+{
+@BeforeClass
+public static void 

[10/10] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2016-02-23 Thread jasobrown
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ac8c8b21
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ac8c8b21
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ac8c8b21

Branch: refs/heads/trunk
Commit: ac8c8b213c6a02de1b547cd537bf9058e851bfdc
Parents: babf30d c4bd6d2
Author: Jason Brown 
Authored: Tue Feb 23 14:38:18 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:39:53 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ac8c8b21/CHANGES.txt
--
diff --cc CHANGES.txt
index 6ad7e1f,9ca2f80..361eedc
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -61,8 -33,8 +61,9 @@@ Merged from 2.2
   * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030)
   * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 (CASSANDRA-10010)
 + * (cqlsh) Support timezone conversion using pytz (CASSANDRA-10397)
  Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)
   * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714)



[08/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2016-02-23 Thread jasobrown
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c4bd6d25
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c4bd6d25
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c4bd6d25

Branch: refs/heads/cassandra-3.0
Commit: c4bd6d2549bd794b81cf0c9a9ac73b97c5a0b686
Parents: e9abaab 77ff794
Author: Jason Brown 
Authored: Tue Feb 23 14:36:15 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:37:56 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/CHANGES.txt
--
diff --cc CHANGES.txt
index cd2a930,e989e7f..9ca2f80
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -34,7 -17,9 +34,8 @@@ Merged from 2.2
   * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 (CASSANDRA-10010)
  Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
 - * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)
   * Fix sstableloader to unthrottle streaming by default (CASSANDRA-9714)
   * Fix incorrect warning in 'nodetool status' (CASSANDRA-10176)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4bd6d25/src/java/org/apache/cassandra/gms/Gossiper.java
--



[01/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint

2016-02-23 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 67637d1bb -> 7877d6f85
  refs/heads/cassandra-2.2 50095947e -> 77ff79473
  refs/heads/cassandra-3.0 e9abaabfe -> c4bd6d254
  refs/heads/trunk babf30dd1 -> ac8c8b213


Don't remove FailureDetector history on removeEndpoint

patch by jkni, reviewed by jasobrown for CASSANDRA-10371


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8

Branch: refs/heads/cassandra-2.1
Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1
Parents: 67637d1
Author: Joel Knighton 
Authored: Fri Feb 19 15:19:33 2016 -0600
Committer: Jason Brown 
Committed: Tue Feb 23 14:30:28 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 52bdcce..82ee99e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.14
+ * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
  * Only notify if repair status changed (CASSANDRA-11172)
  * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
  * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java
--
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java 
b/src/java/org/apache/cassandra/gms/Gossiper.java
index ae99829..889806c 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -386,6 +386,7 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 unreachableEndpoints.remove(endpoint);
 endpointStateMap.remove(endpoint);
 expireTimeEndpointMap.remove(endpoint);
+FailureDetector.instance.remove(endpoint);
 quarantineEndpoint(endpoint);
 if (logger.isDebugEnabled())
 logger.debug("evicting {} from gossip", endpoint);
@@ -409,8 +410,6 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 
 liveEndpoints.remove(endpoint);
 unreachableEndpoints.remove(endpoint);
-// do not remove endpointState until the quarantine expires
-FailureDetector.instance.remove(endpoint);
 MessagingService.instance().resetVersion(endpoint);
 quarantineEndpoint(endpoint);
 MessagingService.instance().destroyConnectionPool(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java 
b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
new file mode 100644
index 000..9325922
--- /dev/null
+++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.Util;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.RandomPartitioner;
+import 

[02/10] cassandra git commit: Don't remove FailureDetector history on removeEndpoint

2016-02-23 Thread jasobrown
Don't remove FailureDetector history on removeEndpoint

patch by jkni, reviewed by jasobrown for CASSANDRA-10371


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7877d6f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7877d6f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7877d6f8

Branch: refs/heads/cassandra-2.2
Commit: 7877d6f85f1a84d9f9de4d81339730d9df3667a1
Parents: 67637d1
Author: Joel Knighton 
Authored: Fri Feb 19 15:19:33 2016 -0600
Committer: Jason Brown 
Committed: Tue Feb 23 14:30:28 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 52bdcce..82ee99e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.14
+ * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
  * Only notify if repair status changed (CASSANDRA-11172)
  * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
  * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/src/java/org/apache/cassandra/gms/Gossiper.java
--
diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java 
b/src/java/org/apache/cassandra/gms/Gossiper.java
index ae99829..889806c 100644
--- a/src/java/org/apache/cassandra/gms/Gossiper.java
+++ b/src/java/org/apache/cassandra/gms/Gossiper.java
@@ -386,6 +386,7 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 unreachableEndpoints.remove(endpoint);
 endpointStateMap.remove(endpoint);
 expireTimeEndpointMap.remove(endpoint);
+FailureDetector.instance.remove(endpoint);
 quarantineEndpoint(endpoint);
 if (logger.isDebugEnabled())
 logger.debug("evicting {} from gossip", endpoint);
@@ -409,8 +410,6 @@ public class Gossiper implements 
IFailureDetectionEventListener, GossiperMBean
 
 liveEndpoints.remove(endpoint);
 unreachableEndpoints.remove(endpoint);
-// do not remove endpointState until the quarantine expires
-FailureDetector.instance.remove(endpoint);
 MessagingService.instance().resetVersion(endpoint);
 quarantineEndpoint(endpoint);
 MessagingService.instance().destroyConnectionPool(endpoint);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7877d6f8/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java 
b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
new file mode 100644
index 000..9325922
--- /dev/null
+++ b/test/unit/org/apache/cassandra/gms/FailureDetectorTest.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.Util;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.RandomPartitioner;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.locator.TokenMetadata;
+import org.apache.cassandra.service.StorageService;
+
+import static org.junit.Assert.assertFalse;
+
+public class FailureDetectorTest
+{
+@BeforeClass
+public static 

[05/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2016-02-23 Thread jasobrown
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77ff7947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77ff7947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77ff7947

Branch: refs/heads/cassandra-3.0
Commit: 77ff794737f067b04f1e2fae6124cb22921eb4c7
Parents: 5009594 7877d6f
Author: Jason Brown 
Authored: Tue Feb 23 14:31:38 2016 -0800
Committer: Jason Brown 
Committed: Tue Feb 23 14:35:03 2016 -0800

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  3 +-
 .../cassandra/gms/FailureDetectorTest.java  | 85 
 3 files changed, 87 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/CHANGES.txt
--
diff --cc CHANGES.txt
index 01e7b3d,82ee99e..e989e7f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -2.1.14
 +2.2.6
 + * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
 + * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
 + * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
 + * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)
 + * Protect from keyspace dropped during repair (CASSANDRA-11065)
 + * Handle adding fields to a UDT in SELECT JSON and toJson() (CASSANDRA-11146)
 + * Better error message for cleanup (CASSANDRA-10991)
 + * cqlsh pg-style-strings broken if line ends with ';' (CASSANDRA-11123)
 + * Use cloned TokenMetadata in size estimates to avoid race against 
membership check
 +   (CASSANDRA-10736)
 + * Always persist upsampled index summaries (CASSANDRA-10512)
 + * (cqlsh) Fix inconsistent auto-complete (CASSANDRA-10733)
 + * Make SELECT JSON and toJson() threadsafe (CASSANDRA-11048)
 + * Fix SELECT on tuple relations for mixed ASC/DESC clustering order 
(CASSANDRA-7281)
 + * (cqlsh) Support utf-8/cp65001 encoding on Windows (CASSANDRA-11030)
 + * Fix paging on DISTINCT queries repeats result when first row in partition 
changes
 +   (CASSANDRA-10010)
 +Merged from 2.1:
+  * Don't remove FailureDetector history on removeEndpoint (CASSANDRA-10371)
   * Only notify if repair status changed (CASSANDRA-11172)
   * Add partition key to TombstoneOverwhelmingException error message 
(CASSANDRA-10888)
   * Use logback setting for 'cassandra -v' command (CASSANDRA-10767)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/77ff7947/src/java/org/apache/cassandra/gms/Gossiper.java
--



[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

2016-02-23 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159767#comment-15159767
 ] 

Jason Brown commented on CASSANDRA-10371:
-

+1. Nice detective work, Joel. Will commit this afternoon.

> Decommissioned nodes can remain in gossip
> -
>
> Key: CASSANDRA-10371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Brandon Williams
>Assignee: Joel Knighton
>Priority: Minor
>
> This may apply to other dead states as well.  Dead states should be expired 
> after 3 days.  In the case of decom we attach a timestamp to let the other 
> nodes know when it should be expired.  It has been observed that sometimes a 
> subset of nodes in the cluster never expire the state, and through heap 
> analysis of these nodes it is revealed that the epstate.isAlive check returns 
> true when it should return false, which would allow the state to be evicted.  
> This may have been affected by CASSANDRA-8336.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10637) Extract LoaderOptions and refactor BulkLoader to be able to be used from within existing Java code instead of just through main()

2016-02-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159683#comment-15159683
 ] 

Yuki Morishita commented on CASSANDRA-10637:


Since your patch, there have been a couple of changes in LoaderOptions, so I added 
them and rebased on top of the latest head.
In particular, AuthProvider and internodeStreamingThrottle were added, so I added 
both to your Builder.

||branch||testall||dtest||
|[10637|https://github.com/yukim/cassandra/tree/10637]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10637-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10637-dtest/lastCompletedBuild/testReport/]|
 

> Extract LoaderOptions and refactor BulkLoader to be able to be used from 
> within existing Java code instead of just through main()
> -
>
> Key: CASSANDRA-10637
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10637
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Eric Fenderbosch
>Priority: Minor
> Fix For: 3.x
>
>
> We are writing a service to migrate data from various RDBMS tables into 
> Cassandra. We write out a CSV from the source system, use CQLSSTableWriter to 
> write sstables to disk, then call sstableloader to stream to the Cassandra 
> cluster.
> Right now, we either have to:
> * return a CSV location from one Java process to a wrapper script which then 
> kicks off sstableloader
> * or call sstableloader via Runtime.getRuntime().exec
> * or call BulkLoader.main from within our Java code, using a custom 
> SecurityManager to trap the System.exit calls
> * or subclass BulkLoader putting the subclass in the 
> org.apache.cassandra.tools package in order to access the package scoped 
> inner classes
> None of these solutions are ideal. Ideally, we should be able to use the 
> functionality of BulkLoader.main directly. I've extracted LoaderOptions to a 
> top level class that uses the builder pattern so that it can be used as part 
> of a Java migration service directly.
> Creating the builder can now be performed with a fluent builder interface:
> LoaderOptions options = LoaderOptions.builder(). //
> connectionsPerHost(2). //
> directory(directory). //
> hosts(hosts). //
> build();
> Or used to parse command line arguments:
> LoaderOptions options = LoaderOptions.builder().parseArgs(args).build();
> A new load method takes a LoaderOptions parameter and throws 
> BulkLoadException instead of System.exit(1).
> Fork on github can be found here:
> https://github.com/efenderbosch/cassandra
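
Below is a minimal usage sketch of the proposed API from a migration service, based 
only on the description above. The builder method names (connectionsPerHost, 
directory, hosts, build) and the load/BulkLoadException behaviour come from the 
ticket; the parameter types, package placement, and static-vs-instance choice for 
load are assumptions for illustration, not the actual patch.

{noformat}
// Hypothetical usage sketch of the proposed LoaderOptions/BulkLoader API; not the
// actual patch. Parameter types and the load(...) signature are assumptions.
import java.io.File;
import java.net.InetAddress;
import java.util.Set;

public class SSTableMigrationStep
{
    public void stream(File sstableDirectory, Set<InetAddress> targetHosts)
    {
        LoaderOptions options = LoaderOptions.builder()
                                             .connectionsPerHost(2)
                                             .directory(sstableDirectory)   // assumed to take a File
                                             .hosts(targetHosts)            // assumed to take a set of addresses
                                             .build();
        try
        {
            // The new load method throws BulkLoadException instead of calling
            // System.exit(1), so a migration service can handle failures in-process.
            BulkLoader.load(options);
        }
        catch (BulkLoadException e)
        {
            throw new RuntimeException("sstableloader run failed", e);
        }
    }
}
{noformat}

As the description notes, the same options object could instead be built from 
command-line arguments via LoaderOptions.builder().parseArgs(args).build().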



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1

2016-02-23 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson reassigned CASSANDRA-11220:
---

Assignee: Philip Thompson  (was: DS Test Eng)

> repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test 
> failing on 2.1
> --
>
> Key: CASSANDRA-11220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11220
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: Philip Thompson
>  Labels: dtest
>
> recent occurence:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/
> last 2 runs failed:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1

2016-02-23 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159633#comment-15159633
 ] 

Russ Hatch commented on CASSANDRA-11220:


The problem appears to be an assertion that no longer holds:
{noformat}
1 not greater than or equal to 2
{noformat}

> repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test 
> failing on 2.1
> --
>
> Key: CASSANDRA-11220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11220
> Project: Cassandra
>  Issue Type: Test
>Reporter: Russ Hatch
>Assignee: DS Test Eng
>  Labels: dtest
>
> recent occurence:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/
> last 2 runs failed:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing on 2.1

2016-02-23 Thread Russ Hatch (JIRA)
Russ Hatch created CASSANDRA-11220:
--

 Summary: 
repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test 
failing on 2.1
 Key: CASSANDRA-11220
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11220
 Project: Cassandra
  Issue Type: Test
Reporter: Russ Hatch
Assignee: DS Test Eng


recent occurence:
http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/

last 2 runs failed:
http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11219) Some Paxos issues

2016-02-23 Thread Andy Chen (JIRA)
Andy Chen created CASSANDRA-11219:
-

 Summary: Some Paxos issues
 Key: CASSANDRA-11219
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11219
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Andy Chen


Two issues:
1. 'Raw' Paxos without a single leader may fail to make progress; waiting and 
retrying may mitigate part of the problem, but it cannot be proven to guarantee 
progress.
2. A learning issue: mostRecentCommit may not be sufficient for a learner to catch 
up. For example, with nodes A, B and C, suppose C is down while A and B complete two 
commits, c1 and c2. When C comes back up, it can only learn c2, since the update is 
based on row changes (according to my understanding), so the data may become 
inconsistent.
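
To make the second point concrete, a toy sketch (illustration only; it is not 
Cassandra code and makes no claim about Cassandra's actual Paxos implementation) of 
how a replica that learns only the most recent commit ends up missing an earlier one:

{noformat}
// Toy illustration of the "learning" concern described above; not Cassandra code.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MostRecentCommitDemo
{
    public static void main(String[] args)
    {
        List<String> commits = Arrays.asList("c1: set x=1", "c2: set y=2");

        List<String> replicaA = new ArrayList<>(commits);   // A was up for both commits
        List<String> replicaC = new ArrayList<>();          // C was down for both

        // C recovers and learns only the most recent commit.
        replicaC.add(commits.get(commits.size() - 1));

        System.out.println("A: " + replicaA);   // [c1: set x=1, c2: set y=2]
        System.out.println("C: " + replicaC);   // [c2: set y=2] -> c1 was never applied on C
    }
}
{noformat}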



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0

2016-02-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159424#comment-15159424
 ] 

Yuki Morishita commented on CASSANDRA-10990:


Did you try running sstableupgrade from 3.3 against 2.1 SSTables?

> Support streaming of older version sstables in 3.0
> --
>
> Key: CASSANDRA-10990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jeremy Hanna
>Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables 
> (CASSANDRA-5772).  In 3.0, because of the rewrite of the storage layer, this 
> is no longer supported.  So currently, while 3.0 can read sstables in the 
> 2.1/2.2 format, it cannot stream those older versioned sstables.  We should do 
> some work to make this possible again, consistent with what 
> CASSANDRA-5772 provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0

2016-02-23 Thread xiaodong wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159412#comment-15159412
 ] 

xiaodong wang commented on CASSANDRA-10990:
---

Hi,
   I understand that this issue is being actively worked on. Could you please share 
an update or the ETA of the fix release? 

   I am currently running into the same issue and am blocked when trying to 
migrate the data from C* 2.1.x to C* 3.3.

I tried a few approaches:
* Running "sstableupgrade" & "sstableloader" on the 2.1.x snapshots won't 
work because of this issue;
* Due to CASSANDRA-8110, running "rebuild" on a 3.3 DC within the same 2.1.x 
cluster won't work either;


> Support streaming of older version sstables in 3.0
> --
>
> Key: CASSANDRA-10990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jeremy Hanna
>Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables 
> (CASSANDRA-5772).  In 3.0, because of the rewrite of the storage layer, this 
> is no longer supported.  So currently, while 3.0 can read sstables in the 
> 2.1/2.2 format, it cannot stream those older versioned sstables.  We should do 
> some work to make this possible again, consistent with what 
> CASSANDRA-5772 provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild

2016-02-23 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159403#comment-15159403
 ] 

sankalp kohli commented on CASSANDRA-11218:
---

cc [~krummas]  I think we should also prioritize user defined compactions over 
others. What do you think? 

> Prioritize Secondary Index rebuild
> --
>
> Key: CASSANDRA-11218
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Priority: Minor
>
> We have seen secondary index rebuilds get stuck behind other compactions 
> during a bootstrap and other operations. This causes them to never finish. We 
> should prioritize index rebuilds via a separate thread pool or by using a 
> priority queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11218) Prioritize Secondary Index rebuild

2016-02-23 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-11218:
-

 Summary: Prioritize Secondary Index rebuild
 Key: CASSANDRA-11218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
 Project: Cassandra
  Issue Type: Improvement
Reporter: sankalp kohli
Priority: Minor


We have seen secondary index rebuilds get stuck behind other compactions 
during a bootstrap and other operations. This causes them to never finish. We 
should prioritize index rebuilds via a separate thread pool or by using a priority 
queue.
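
As a rough illustration of the priority-queue option (a sketch only, not Cassandra 
code; the task names and priority values are hypothetical), a thread pool backed by 
a PriorityBlockingQueue lets a queued index rebuild run ahead of queued compactions:

{noformat}
// Illustrative sketch of the "priority queue" option; not Cassandra code.
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PrioritizedCompactionExecutorDemo
{
    // Lower number = higher priority.
    static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask>
    {
        final int priority;
        final String name;
        final long workMillis;

        PrioritizedTask(int priority, String name, long workMillis)
        {
            this.priority = priority;
            this.name = name;
            this.workMillis = workMillis;
        }

        public void run()
        {
            System.out.println("running: " + name);
            try { Thread.sleep(workMillis); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }

        public int compareTo(PrioritizedTask other)
        {
            return Integer.compare(priority, other.priority);
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());

        // Occupy the single worker so the next two tasks queue up and get priority-ordered.
        executor.execute(new PrioritizedTask(10, "compaction already running", 500));
        executor.execute(new PrioritizedTask(10, "queued compaction", 0));
        executor.execute(new PrioritizedTask(1,  "secondary index rebuild", 0)); // jumps the queue

        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
{noformat}

The separate-thread-pool option would instead dedicate its own executor to index 
rebuilds so they never queue behind compactions at all.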



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159286#comment-15159286
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

bq. Sounds good! We could ask the user to pause, but I think doing that 
automatically via "system interrupts" is better. It just occurred to me that 
both "the pause" or "system interrupts" will prevent new repairs from starting, 
but what about already running repairs? We will probably want to interrupt 
already running repairs as well in some situations. For this reason 
CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of 
this ticket).
+1

bq. Then I think we should either have a timeout, or add an ability to 
cancel/interrupt a running scheduled repair in the initial version, to avoid 
hanging repairs rendering the automatic repair scheduling useless.
I think the timeout would be good enough in the initial version. I guess the 
interruption of repairs would be handled by CASSANDRA-3486? Perhaps it would be 
possible to extend that feature later to be able to cancel a scheduled repair? 
Here I'm thinking that the interruption is stopping the running repair and 
allowing the scheduled job to retry it immediately, while cancelling it would 
prevent the scheduled job from retrying it immediately.

bq. WDYT? Feel free to update or break-up into smaller or larger subtasks, and 
then create the actual subtasks to start work on them.
Sounds good, I'll have a closer look at the subtasks tomorrow! I guess we will 
have sort of a dependency tree for some of the tasks.
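
Going back to the timeout point above, a minimal sketch of what "give up on a hung 
repair after a deadline" could look like (illustration only, not Cassandra code; the 
repair job here is just any Runnable):

{noformat}
// Minimal sketch, not Cassandra code: run one scheduled repair with a deadline and
// cancel (interrupt) it on timeout so a hung repair cannot stall the scheduler.
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ScheduledRepairRunner
{
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Returns true if the repair finished within the timeout, false otherwise. */
    public boolean runWithTimeout(Runnable repairJob, long timeout, TimeUnit unit)
    {
        Future<?> future = executor.submit(repairJob);
        try
        {
            future.get(timeout, unit);     // block until the repair finishes or the deadline passes
            return true;
        }
        catch (TimeoutException e)
        {
            future.cancel(true);           // interrupt the hung repair; the scheduler can retry later
            return false;
        }
        catch (InterruptedException | ExecutionException e)
        {
            future.cancel(true);
            return false;
        }
    }
}
{noformat}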

> Automatic repair scheduling
> ---
>
> Key: CASSANDRA-10070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
> Fix For: 3.x
>
> Attachments: Distributed Repair Scheduling.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but this can both be hard for new users and it also requires a 
> bit of manual configuration. There are good tools out there that can be used 
> to simplify things, but wouldn't this be a good feature to have inside of 
> Cassandra? To automatically schedule and run repairs, so that when you start 
> up your cluster it basically maintains itself in terms of normal 
> anti-entropy, with the possibility for manual configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10099) Improve concurrency in CompactionStrategyManager

2016-02-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159240#comment-15159240
 ] 

Yuki Morishita commented on CASSANDRA-10099:


bq. I'll push an updated branch with both approaches unless you disagree?

Sure, go ahead.

> Improve concurrency in CompactionStrategyManager
> 
>
> Key: CASSANDRA-10099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10099
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yuki Morishita
>Assignee: Marcus Eriksson
> Fix For: 2.1.x, 2.2.x, 3.x
>
>
> Continue discussion from CASSANDRA-9882.
> CompactionStrategyManager (WrappingCompactionStrategy for <3.0) tracks SSTable 
> changes, mainly for separating repaired / unrepaired SSTables (plus LCS manages 
> levels).
> This is a blocking operation and can block flushes etc. when determining the 
> next background task takes longer.
> Explore ways to mitigate this concurrency issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2016-02-23 Thread tylerhobbs
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/babf30dd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/babf30dd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/babf30dd

Branch: refs/heads/trunk
Commit: babf30dd13dfdd49398d4067edff874f64051ad2
Parents: fc9c6fa e9abaab
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:29:13 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:29:13 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/babf30dd/CHANGES.txt
--



[2/3] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2016-02-23 Thread tylerhobbs
Merge branch 'cassandra-2.2' into cassandra-3.0

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9abaabf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9abaabf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9abaabf

Branch: refs/heads/trunk
Commit: e9abaabfe83f74b1ef7c0273bdd7738402fb0ebc
Parents: 037d24e 5009594
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:29:04 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:29:04 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/CHANGES.txt
--
diff --cc CHANGES.txt
index a675016,01e7b3d..cd2a930
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,21 -1,5 +1,22 @@@
 -2.2.6
 +3.0.4
 + * Introduce backpressure for hints (CASSANDRA-10972)
 + * Fix ClusteringPrefix not being able to read tombstone range boundaries 
(CASSANDRA-11158)
 + * Prevent logging in sandboxed state (CASSANDRA-11033)
 + * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)
 + * Add query time validation method on Index (CASSANDRA-11043)
 + * Avoid potential AssertionError in mixed version cluster (CASSANDRA-11128)
 + * Properly handle hinted handoff after topology changes (CASSANDRA-5902)
 + * AssertionError when listing sstable files on inconsistent disk state 
(CASSANDRA-11156)
 + * Fix wrong rack counting and invalid conditions check for TokenAllocation
 +   (CASSANDRA-11139)
 + * Avoid creating empty hint files (CASSANDRA-11090)
 + * Fix leak detection strong reference loop using weak reference 
(CASSANDRA-11120)
 + * Configurie BatchlogManager to stop delayed tasks on shutdown 
(CASSANDRA-11062)
 + * Hadoop integration is incompatible with Cassandra Driver 3.0.0 
(CASSANDRA-11001)
 + * Add dropped_columns to the list of schema table so it gets handled
 +   properly (CASSANDRA-11050)
 +Merged from 2.2:
+  * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
   * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
   * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
   * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
--



[1/3] cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg

2016-02-23 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/trunk fc9c6faa2 -> babf30dd1


Avoid NPE when serializing ErrorMessage with null msg

Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947

Branch: refs/heads/trunk
Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183
Parents: c8c8cf6
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:28:17 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:28:17 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 767eb8a..01e7b3d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
  * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
  * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
  * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
--
diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java 
b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
index 222e833..021db5a 100644
--- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
+++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
@@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
 dest.writeInt(err.code().value);
-CBUtil.writeString(err.getMessage(), dest);
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+CBUtil.writeString(errorString, dest);
 
 switch (err.code())
 {
@@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response
 public int encodedSize(ErrorMessage msg, int version)
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
-int size = 4 + CBUtil.sizeOfString(err.getMessage());
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+int size = 4 + CBUtil.sizeOfString(errorString);
 switch (err.code())
 {
 case UNAVAILABLE:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
--
diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java 
b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
index 11b0ebd..fc8c41c 100644
--- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
+++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
@@ -113,4 +113,22 @@ public class ProtocolErrorTest {
 Assert.assertTrue(e.getMessage().contains("Request is too big"));
 }
 }
+
+@Test
+public void testErrorMessageWithNullString() throws Exception
+{
+// test for CASSANDRA-11167
+ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) 
null));
+assert msg.toString().endsWith("null") : msg.toString();
+int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION);
+ByteBuf buf = Unpooled.buffer(size);
+ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION);
+
+ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{
+0x00, 0x00, 0x00, 0x00,  // int error code
+0x00, 0x00   // short message length
+});
+
+Assert.assertEquals(expected, buf);
+}
 }
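
As a side note on why the guard above is needed: a length-prefixed string cannot be 
written from a null reference, so the patch normalises the message to "" before 
encoding. A small standalone illustration follows; the tiny encoder below is a 
stand-in for demonstration only and is not CBUtil or the native protocol codec.

{noformat}
// Standalone illustration of the null-message guard; the encoder is a demo stand-in.
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class NullMessageGuardDemo
{
    static ByteBuffer encodeString(String s)
    {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);   // would throw NPE if s were null
        ByteBuffer buf = ByteBuffer.allocate(2 + bytes.length);
        buf.putShort((short) bytes.length);                  // [short length][bytes]
        buf.put(bytes);
        buf.flip();
        return buf;
    }

    public static void main(String[] args)
    {
        String message = null;                               // e.g. an error built with a null message
        String safe = message == null ? "" : message;        // the same guard as in the patch above
        ByteBuffer encoded = encodeString(safe);
        System.out.println("encoded bytes: " + encoded.remaining());  // 2 -> a zero-length string
    }
}
{noformat}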



[2/2] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2016-02-23 Thread tylerhobbs
Merge branch 'cassandra-2.2' into cassandra-3.0

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9abaabf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9abaabf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9abaabf

Branch: refs/heads/cassandra-3.0
Commit: e9abaabfe83f74b1ef7c0273bdd7738402fb0ebc
Parents: 037d24e 5009594
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:29:04 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:29:04 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/CHANGES.txt
--
diff --cc CHANGES.txt
index a675016,01e7b3d..cd2a930
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,21 -1,5 +1,22 @@@
 -2.2.6
 +3.0.4
 + * Introduce backpressure for hints (CASSANDRA-10972)
 + * Fix ClusteringPrefix not being able to read tombstone range boundaries 
(CASSANDRA-11158)
 + * Prevent logging in sandboxed state (CASSANDRA-11033)
 + * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)
 + * Add query time validation method on Index (CASSANDRA-11043)
 + * Avoid potential AssertionError in mixed version cluster (CASSANDRA-11128)
 + * Properly handle hinted handoff after topology changes (CASSANDRA-5902)
 + * AssertionError when listing sstable files on inconsistent disk state 
(CASSANDRA-11156)
 + * Fix wrong rack counting and invalid conditions check for TokenAllocation
 +   (CASSANDRA-11139)
 + * Avoid creating empty hint files (CASSANDRA-11090)
 + * Fix leak detection strong reference loop using weak reference 
(CASSANDRA-11120)
 + * Configurie BatchlogManager to stop delayed tasks on shutdown 
(CASSANDRA-11062)
 + * Hadoop integration is incompatible with Cassandra Driver 3.0.0 
(CASSANDRA-11001)
 + * Add dropped_columns to the list of schema table so it gets handled
 +   properly (CASSANDRA-11050)
 +Merged from 2.2:
+  * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
   * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
   * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
   * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9abaabf/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
--



cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg

2016-02-23 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 c8c8cf679 -> 50095947e


Avoid NPE when serializing ErrorMessage with null msg

Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947

Branch: refs/heads/cassandra-2.2
Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183
Parents: c8c8cf6
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:28:17 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:28:17 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 767eb8a..01e7b3d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
  * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
  * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
  * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
--
diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java 
b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
index 222e833..021db5a 100644
--- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
+++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
@@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
 dest.writeInt(err.code().value);
-CBUtil.writeString(err.getMessage(), dest);
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+CBUtil.writeString(errorString, dest);
 
 switch (err.code())
 {
@@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response
 public int encodedSize(ErrorMessage msg, int version)
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
-int size = 4 + CBUtil.sizeOfString(err.getMessage());
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+int size = 4 + CBUtil.sizeOfString(errorString);
 switch (err.code())
 {
 case UNAVAILABLE:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
--
diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java 
b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
index 11b0ebd..fc8c41c 100644
--- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
+++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
@@ -113,4 +113,22 @@ public class ProtocolErrorTest {
 Assert.assertTrue(e.getMessage().contains("Request is too big"));
 }
 }
+
+@Test
+public void testErrorMessageWithNullString() throws Exception
+{
+// test for CASSANDRA-11167
+ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) 
null));
+assert msg.toString().endsWith("null") : msg.toString();
+int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION);
+ByteBuf buf = Unpooled.buffer(size);
+ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION);
+
+ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{
+0x00, 0x00, 0x00, 0x00,  // int error code
+0x00, 0x00   // short message length
+});
+
+Assert.assertEquals(expected, buf);
+}
 }



[1/2] cassandra git commit: Avoid NPE when serializing ErrorMessage with null msg

2016-02-23 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 037d24efd -> e9abaabfe


Avoid NPE when serializing ErrorMessage with null msg

Patch by Tyler Hobbs; reviewed by Carl Yeksigian for CASSANDRA-11167


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/50095947
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/50095947
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/50095947

Branch: refs/heads/cassandra-3.0
Commit: 50095947e25f630ce48ee24d10ff3e1f3fd91183
Parents: c8c8cf6
Author: Tyler Hobbs 
Authored: Tue Feb 23 11:28:17 2016 -0600
Committer: Tyler Hobbs 
Committed: Tue Feb 23 11:28:17 2016 -0600

--
 CHANGES.txt   |  1 +
 .../transport/messages/ErrorMessage.java  |  6 --
 .../cassandra/transport/ProtocolErrorTest.java| 18 ++
 3 files changed, 23 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 767eb8a..01e7b3d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Avoid NPE when serializing ErrorMessage with null message (CASSANDRA-11167)
  * Replacing an aggregate with a new version doesn't reset INITCOND 
(CASSANDRA-10840)
  * (cqlsh) cqlsh cannot be called through symlink (CASSANDRA-11037)
  * fix ohc and java-driver pom dependencies in build.xml (CASSANDRA-10793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
--
diff --git a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java 
b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
index 222e833..021db5a 100644
--- a/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
+++ b/src/java/org/apache/cassandra/transport/messages/ErrorMessage.java
@@ -151,7 +151,8 @@ public class ErrorMessage extends Message.Response
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
 dest.writeInt(err.code().value);
-CBUtil.writeString(err.getMessage(), dest);
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+CBUtil.writeString(errorString, dest);
 
 switch (err.code())
 {
@@ -212,7 +213,8 @@ public class ErrorMessage extends Message.Response
 public int encodedSize(ErrorMessage msg, int version)
 {
 final TransportException err = 
getBackwardsCompatibleException(msg, version);
-int size = 4 + CBUtil.sizeOfString(err.getMessage());
+String errorString = err.getMessage() == null ? "" : 
err.getMessage();
+int size = 4 + CBUtil.sizeOfString(errorString);
 switch (err.code())
 {
 case UNAVAILABLE:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/50095947/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
--
diff --git a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java 
b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
index 11b0ebd..fc8c41c 100644
--- a/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
+++ b/test/unit/org/apache/cassandra/transport/ProtocolErrorTest.java
@@ -113,4 +113,22 @@ public class ProtocolErrorTest {
 Assert.assertTrue(e.getMessage().contains("Request is too big"));
 }
 }
+
+@Test
+public void testErrorMessageWithNullString() throws Exception
+{
+// test for CASSANDRA-11167
+ErrorMessage msg = ErrorMessage.fromException(new ServerError((String) 
null));
+assert msg.toString().endsWith("null") : msg.toString();
+int size = ErrorMessage.codec.encodedSize(msg, Server.CURRENT_VERSION);
+ByteBuf buf = Unpooled.buffer(size);
+ErrorMessage.codec.encode(msg, buf, Server.CURRENT_VERSION);
+
+ByteBuf expected = Unpooled.wrappedBuffer(new byte[]{
+0x00, 0x00, 0x00, 0x00,  // int error code
+0x00, 0x00   // short message length
+});
+
+Assert.assertEquals(expected, buf);
+}
 }



[jira] [Commented] (CASSANDRA-7464) Replace sstable2json and json2sstable

2016-02-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159235#comment-15159235
 ] 

Yuki Morishita commented on CASSANDRA-7464:
---

Fixed one more bug (handle case sensitive column name) and backported to 3.0 as 
well.

||branch||testall||dtest||
|[7464-3.0|https://github.com/yukim/cassandra/tree/7464-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-3.0-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-3.0-dtest/lastCompletedBuild/testReport/]|
|[7464|https://github.com/yukim/cassandra/tree/7464]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-7464-dtest/lastCompletedBuild/testReport/]|

Tests are running.

> Replace sstable2json and json2sstable
> -
>
> Key: CASSANDRA-7464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
> Attachments: sstable-only.patch, sstabledump.patch
>
>
> Both tools are pretty awful. They are primarily meant for debugging (there are 
> much more efficient and convenient ways to import/export data), but their 
> output manages to be hard to handle both for humans and for tools (especially 
> as soon as you have modern stuff like composites).
> There is value in having tools to export sstable contents into a format that 
> is easy to manipulate by humans and tools for debugging, small hacks and 
> general tinkering, but sstable2json and json2sstable are not that.  
> So I propose that we deprecate those tools and consider writing better 
> replacements. It shouldn't be too hard to come up with an output format that 
> is more aware of modern concepts like composites, UDTs, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7464) Replace sstable2json and json2sstable

2016-02-23 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-7464:
--
Fix Version/s: 3.0.x

> Replace sstable2json and json2sstable
> -
>
> Key: CASSANDRA-7464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
> Attachments: sstable-only.patch, sstabledump.patch
>
>
> Both tools are pretty awful. They are primarily meant for debugging (there are 
> much more efficient and convenient ways to import/export data), but their 
> output manages to be hard to handle both for humans and for tools (especially 
> as soon as you have modern stuff like composites).
> There is value in having tools to export sstable contents into a format that 
> is easy to manipulate by humans and tools for debugging, small hacks and 
> general tinkering, but sstable2json and json2sstable are not that.  
> So I propose that we deprecate those tools and consider writing better 
> replacements. It shouldn't be too hard to come up with an output format that 
> is more aware of modern concepts like composites, UDTs, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159202#comment-15159202
 ] 

Marcus Olsson commented on CASSANDRA-11215:
---

After looking around a bit in the dtests I think that self.ignore_log_patterns 
could handle that, although our expected error is causing more errors than 
"Cannot start multiple repairs". There are errors logged from nodetool and the 
repair sessions as well. But for this test I guess the important part is that 
there are no "LEAK DETECTED" error messages, right? Assuming that is the case, 
could we simply ignore the other repair errors?

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table, Cassandra starts to log 
> reference leaks such as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible

2016-02-23 Thread xiaodong wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159153#comment-15159153
 ] 

xiaodong wang commented on CASSANDRA-8110:
--

Thanks for your prompt response, Paulo.

Some background:

Currently I have a C* 2.1.x cluster and am trying to migrate the data to C* 3.3. 

I have already tried a few approaches:

* Set up another 3.3 DC and ran the "rebuild";
* Set up a new 3.3 cluster and ran the "sstableupgrade" & "sstableloader";

Neither worked.

> Make streaming backwards compatible
> ---
>
> Key: CASSANDRA-8110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8110
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Marcus Eriksson
>  Labels: gsoc2016, mentor
> Fix For: 3.x
>
>
> To be able to seamlessly upgrade clusters we need to make it possible to 
> stream files between nodes with different StreamMessage.CURRENT_VERSION



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly

2016-02-23 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-11164:
---
Attachment: 11164-2.2_2_call_filterCipherSuites_everywhere.patch

> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom Petracca
>Assignee: Stefan Podkowinski
>Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, 
> 11164-2.2_2_call_filterCipherSuites_everywhere.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, 
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired 
> ciphers in cassandra.yaml.
> Also the fix that occurred for 
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs 
> to be applied to all locations where we create an SSLSocket so that JCE is 
> not required out of the box or with additional configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly

2016-02-23 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-11164:
---
Attachment: (was: 11164-on-10508-2.2.patch)

> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom Petracca
>Assignee: Stefan Podkowinski
>Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, 
> 11164-2.2_2_call_filterCipherSuites_everywhere.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, 
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired 
> ciphers in cassandra.yaml.
> Also the fix that occurred for 
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs 
> to be applied to all locations where we create an SSLSocket so that JCE is 
> not required out of the box or with additional configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11164) Order and filter cipher suites correctly

2016-02-23 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-11164:
---
Attachment: 11164-2.2_1_preserve_cipher_order.patch

> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom Petracca
>Assignee: Stefan Podkowinski
>Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-2.2_1_preserve_cipher_order.patch, 
> 11164-2.2_2_call_filterCipherSuites_everywhere.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, 
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired 
> ciphers in cassandra.yaml.
> Also the fix that occurred for 
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs 
> to be applied to all locations where we create an SSLSocket so that JCE is 
> not required out of the box or with additional configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11164) Order and filter cipher suites correctly

2016-02-23 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159131#comment-15159131
 ] 

Stefan Podkowinski commented on CASSANDRA-11164:


Scope of this ticket as reported would be:
- respect ordering of enabled ciphers
- apply cipher filtering wherever SSL is used

I've now created two patches for that:
- {{11164-2.2_1_preserve_cipher_order.patch}} - cherry picked 
{{filterCipherSuites}} implementation and unit test from CASSANDRA-10508 with 
some of your suggested changes
- {{11164-2.2_2_call_filterCipherSuites_everywhere.patch}} - this is 
{{11164-2.2.txt}} from Tom minus the {{filterCipherSuites}} implementation

||2.2||
|[Branch|https://github.com/spodkowinski/cassandra/commits/CASSANDRA-11164]|
|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-11164-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-11164-dtest/]|
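
For reference, "preserve cipher order" here means intersecting in the order of the 
configured list rather than the runtime's. A minimal standalone sketch follows 
(illustration only, not the attached patch; the cipher names are placeholders, and 
"supported" stands for whatever SSLEngine/SSLSocket.getSupportedCipherSuites() 
reports):

{noformat}
// Minimal order-preserving cipher filter; illustration only, not the attached patch.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class CipherFilterDemo
{
    /** Returns the desired ciphers that are supported, in the desired (configured) order. */
    static String[] filterCipherSuites(String[] supported, String[] desired)
    {
        Set<String> supportedSet = new HashSet<>(Arrays.asList(supported));
        List<String> result = new ArrayList<>(desired.length);
        for (String cipher : desired)          // iterating the desired list first preserves its order
            if (supportedSet.contains(cipher))
                result.add(cipher);
        return result.toArray(new String[0]);
    }

    public static void main(String[] args)
    {
        String[] supported = { "TLS_B", "TLS_A", "TLS_C" };
        String[] desired   = { "TLS_C", "TLS_A", "TLS_X" };
        // Prints [TLS_C, TLS_A]: unsupported ciphers are dropped, configured order is kept.
        System.out.println(Arrays.toString(filterCipherSuites(supported, desired)));
    }
}
{noformat}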


> Order and filter cipher suites correctly
> 
>
> Key: CASSANDRA-11164
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11164
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom Petracca
>Assignee: Stefan Podkowinski
>Priority: Minor
> Fix For: 2.2.x
>
> Attachments: 11164-2.2.txt, 11164-on-10508-2.2.patch
>
>
> As pointed out in https://issues.apache.org/jira/browse/CASSANDRA-10508, 
> SSLFactory.filterCipherSuites() doesn't respect the ordering of desired 
> ciphers in cassandra.yaml.
> Also the fix that occurred for 
> https://issues.apache.org/jira/browse/CASSANDRA-3278 is incomplete and needs 
> to be applied to all locations where we create an SSLSocket so that JCE is 
> not required out of the box or with additional configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159110#comment-15159110
 ] 

Marcus Eriksson commented on CASSANDRA-11209:
-

ok, with incremental repair an sstable can only be involved in a single repair 
session (since we are going to anticompact it afterwards, the sstable would be gone 
after the second repair finished)

It should of course not mess up the live size; I will try to reproduce. (hmm.. 
maybe it is CASSANDRA-11215)

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159101#comment-15159101
 ] 

Jose Fernandez commented on CASSANDRA-11209:


This is the error on 10.1.29.31

ERROR 22:08:05 Cannot start multiple repair sessions over the same sstables
ERROR 22:08:05 Failed creating a merkle tree for [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]], /10.1.28.32 (see log for details)
ERROR 22:08:05 Exception in thread Thread[ValidationExecutor:8,1,main]
java.lang.RuntimeException: Cannot start multiple repair sessions over the same 
sstables
at 
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1043)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:89)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:692)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]


> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159099#comment-15159099
 ] 

Marcus Eriksson commented on CASSANDRA-11209:
-

and on 10.1.29.31 ?

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159096#comment-15159096
 ] 

Jose Fernandez edited comment on CASSANDRA-11209 at 2/23/16 4:06 PM:
-

Actually, I just spotted an error during repair:

```
ERROR 22:08:05 [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee] session completed 
with the following error
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
ERROR 22:08:05 Repair session a85c9760-d9b0-11e5-9b9c-c12de94ec9ee for range 
(7686143364045646505,-6148914691236517207] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
[na:1.8.0_66]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:3048)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_66]
... 1 common frames omitted
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.13.jar:2.1.13]
... 3 common frames omitted
ERROR 22:08:05 Exception in thread Thread[AntiEntropySessions:1,5,jolokia]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 

[jira] [Updated] (CASSANDRA-11176) SSTableRewriter.InvalidateKeys should have a weak reference to cache

2016-02-23 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11176:

Assignee: Marcus Eriksson

> SSTableRewriter.InvalidateKeys should have a weak reference to cache
> 
>
> Key: CASSANDRA-11176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11176
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Marcus Eriksson
> Fix For: 3.0.x
>
>
> From [~aweisberg]
> bq. The SSTableReader.DropPageCache runnable references 
> SSTableRewriter.InvalidateKeys which references the cache. The cache 
> reference should be a WeakReference.
> {noformat}
> ERROR [Strong-Reference-Leak-Detector:1] 2016-02-17 14:51:52,111  
> NoSpamLogger.java:97 - Strong self-ref loop detected 
> [/var/lib/cassandra/data/keyspace1/standard1-990bc741d56411e591d5590d7a7ad312/ma-20-big,
> private java.lang.Runnable 
> org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.runOnClose-org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache,
> final java.lang.Runnable 
> org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache.andThen-org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys,
> final org.apache.cassandra.cache.InstrumentingCache 
> org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys.cache-org.apache.cassandra.cache.AutoSavingCache,
> protected volatile java.util.concurrent.ScheduledFuture 
> org.apache.cassandra.cache.AutoSavingCache.saveTask-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask,
> final java.util.concurrent.ScheduledThreadPoolExecutor 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.this$0-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor,
> private final java.util.concurrent.BlockingQueue 
> java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue,
> private final java.util.concurrent.BlockingQueue 
> java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask,
> private java.util.concurrent.Callable 
> java.util.concurrent.FutureTask.callable-java.util.concurrent.Executors$RunnableAdapter,
> final java.lang.Runnable 
> java.util.concurrent.Executors$RunnableAdapter.task-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable,
> private final java.lang.Runnable 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.runnable-org.apache.cassandra.db.ColumnFamilyStore$3,
> final org.apache.cassandra.db.ColumnFamilyStore 
> org.apache.cassandra.db.ColumnFamilyStore$3.this$0-org.apache.cassandra.db.ColumnFamilyStore,
> public final org.apache.cassandra.db.Keyspace 
> org.apache.cassandra.db.ColumnFamilyStore.keyspace-org.apache.cassandra.db.Keyspace,
> private final java.util.concurrent.ConcurrentMap 
> org.apache.cassandra.db.Keyspace.columnFamilyStores-java.util.concurrent.ConcurrentHashMap,
> private final java.util.concurrent.ConcurrentMap 
> org.apache.cassandra.db.Keyspace.columnFamilyStores-org.apache.cassandra.db.ColumnFamilyStore,
> private final org.apache.cassandra.db.lifecycle.Tracker 
> org.apache.cassandra.db.ColumnFamilyStore.data-org.apache.cassandra.db.lifecycle.Tracker,
> final java.util.concurrent.atomic.AtomicReference 
> org.apache.cassandra.db.lifecycle.Tracker.view-java.util.concurrent.atomic.AtomicReference,
> private volatile java.lang.Object 
> java.util.concurrent.atomic.AtomicReference.value-org.apache.cassandra.db.lifecycle.View,
> public final java.util.List 
> org.apache.cassandra.db.lifecycle.View.liveMemtables-com.google.common.collect.SingletonImmutableList,
> final transient java.lang.Object 
> com.google.common.collect.SingletonImmutableList.element-org.apache.cassandra.db.Memtable,
> private final org.apache.cassandra.utils.memory.MemtableAllocator 
> org.apache.cassandra.db.Memtable.allocator-org.apache.cassandra.utils.memory.SlabAllocator,
> private final 
> org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator 
> org.apache.cassandra.utils.memory.MemtableAllocator.onHeap-org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator,
> private final org.apache.cassandra.utils.memory.MemtablePool$SubPool 
> org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.parent-org.apache.cassandra.utils.memory.MemtablePool$SubPool,
> final org.apache.cassandra.utils.memory.MemtablePool 
> org.apache.cassandra.utils.memory.MemtablePool$SubPool.this$0-org.apache.cassandra.utils.memory.SlabPool,
> final org.apache.cassandra.utils.memory.MemtableCleanerThread 
> 
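
For readers less familiar with the pattern being proposed here, below is a 
minimal, self-contained Java sketch of the idea: a long-lived cleanup runnable 
that holds its cache only through a {{WeakReference}}, so the runnable itself 
never keeps the cache (and everything the cache transitively references) 
strongly reachable. The class and field names are invented for illustration; 
this is not the actual Cassandra patch.

{code}
import java.lang.ref.WeakReference;
import java.util.Map;

// Illustration only: an invalidation task that does not pin the cache in memory.
final class InvalidateKeysTask implements Runnable
{
    private final WeakReference<Map<String, Object>> cacheRef;
    private final Iterable<String> keys;

    InvalidateKeysTask(Map<String, Object> cache, Iterable<String> keys)
    {
        // Weak reference: if nothing else holds the cache, it can still be garbage collected.
        this.cacheRef = new WeakReference<>(cache);
        this.keys = keys;
    }

    public void run()
    {
        Map<String, Object> cache = cacheRef.get();
        if (cache == null)
            return; // cache already collected, nothing left to invalidate
        for (String key : keys)
            cache.remove(key);
    }
}
{code}

With a plain strong field instead of the {{WeakReference}}, the task would be 
one more link in exactly the kind of strong self-reference loop reported above.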

[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159096#comment-15159096
 ] 

Jose Fernandez commented on CASSANDRA-11209:


Actually, I just spotted an error during repair:

ERROR 22:08:05 [repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee] session completed 
with the following error
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
ERROR 22:08:05 Repair session a85c9760-d9b0-11e5-9b9c-c12de94ec9ee for range 
(7686143364045646505,-6148914691236517207] failed with error 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
[na:1.8.0_66]
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:3048)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
[apache-cassandra-2.1.13.jar:2.1.13]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_66]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_66]
... 1 common frames omitted
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:415)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
 ~[apache-cassandra-2.1.13.jar:2.1.13]
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-2.1.13.jar:2.1.13]
... 3 common frames omitted
ERROR 22:08:05 Exception in thread Thread[AntiEntropySessions:1,5,jolokia]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: 
[repair #a85c9760-d9b0-11e5-9b9c-c12de94ec9ee on 
timeslice_store/minute_timeslice_blobs, 
(7686143364045646505,-6148914691236517207]] Validation failed in /10.1.29.31
at com.google.common.base.Throwables.propagate(Throwables.java:160) 
~[guava-16.0.jar:na]
at 

[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159089#comment-15159089
 ] 

Marcus Eriksson commented on CASSANDRA-11209:
-

Very strange.

I assume there are no error messages/exceptions in the logs on that node?

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159085#comment-15159085
 ] 

Jose Fernandez commented on CASSANDRA-11209:


Yes, out of the 4-node cluster, only one is showing this behavior (it's always 
the same one). Repairs run on all of them.

They all have the same version of Cassandra and the exact same settings (we've 
Dockerized it).

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159053#comment-15159053
 ] 

Marcus Olsson commented on CASSANDRA-11215:
---

I could try to do it. I guess it would more or less be to go through the 
reproduction steps and grep the logs for reference leaks, right? I'll put it 
in the repair_test.py dtest. :)

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159072#comment-15159072
 ] 

Marcus Eriksson commented on CASSANDRA-11215:
-

Yes, the only tricky part is handling the expected "Cannot start multiple 
repairs" error message.

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11108) Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test on 2.1 and 2.2

2016-02-23 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-11108.
--
Resolution: Fixed

Sure, if it passes it means we have updated the python driver somehow, so we're 
good.

> Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test 
> on 2.1 and 2.2
> ---
>
> Key: CASSANDRA-11108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11108
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>
> The aforementioned test fails on 2.1 and 2.2 (the only branches on which it is 
> actually run) due to https://datastax-oss.atlassian.net/browse/PYTHON-459. 
> That ticket has been fixed but I don't think the version incorporating it has 
> been released yet. This ticket is so we don't forget to act once said 
> version is released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159081#comment-15159081
 ] 

Marcus Eriksson commented on CASSANDRA-11209:
-

is there only one node showing this? not all nodes involved in the repairs?

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jan Urbański (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159074#comment-15159074
 ] 

Jan Urbański commented on CASSANDRA-11209:
--

BTW: the initial bump in Live and Total disk space was caused by a Cassandra 
restart, similar to the previous graph. The slowly diverging Live vs Total disk 
space afterwards is what's worrying.

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11108) Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test on 2.1 and 2.2

2016-02-23 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159075#comment-15159075
 ] 

Jim Witschey commented on CASSANDRA-11108:
--

This seems to be passing now:

http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/cql_tests/MiscellaneousCQLTester/large_collection_errors_test/history/

http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/junit/cql_tests/MiscellaneousCQLTester/large_collection_errors_test/history/

Shall we close this? From the description of the issue, I'm not sure whether or 
not more needs to be done.

> Fix failure of cql_tests.MiscellaneousCQLTester.large_collection_errors_test 
> on 2.1 and 2.2
> ---
>
> Key: CASSANDRA-11108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11108
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>
> The aforementioned test fails on 2.1 and 2.2 (the only branches on which it is 
> actually run) due to https://datastax-oss.atlassian.net/browse/PYTHON-459. 
> That ticket has been fixed but I don't think the version incorporating it has 
> been released yet. This ticket is so we don't forget to act once said 
> version is released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Fernandez updated CASSANDRA-11209:
---
Attachment: screenshot-2.png

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159061#comment-15159061
 ] 

Jose Fernandez commented on CASSANDRA-11209:


[~krummas] we currently have a cluster that's showing this issue in staging. 
You can see in the attached screenshot the growing divergence in Live vs Total. 
We haven't restarted the node yet, in the hope that you could point us at some 
things we could look at to debug this.

!screenshot-2.png!
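
One thing that can be checked without restarting is the raw metric values over 
JMX, polled over time for the affected table. The sketch below is only an 
illustration: the ObjectName pattern is the 2.1-style {{type=ColumnFamily}} 
form, the host, keyspace and table are placeholders taken from the logs in this 
ticket, and {{Count}} is assumed to be the exposed counter attribute, so adapt 
as needed.

{code}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Illustrative JMX probe comparing live vs total disk space metrics for one table.
public class DiskMetricsProbe
{
    public static void main(String[] args) throws Exception
    {
        // Placeholders: adjust host/port/keyspace/table for the node being investigated.
        String url = "service:jmx:rmi:///jndi/rmi://10.1.29.31:7199/jmxrmi";
        JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url));
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            String prefix = "org.apache.cassandra.metrics:type=ColumnFamily,"
                          + "keyspace=timeslice_store,scope=minute_timeslice_blobs,name=";
            long live = (Long) mbs.getAttribute(new ObjectName(prefix + "LiveDiskSpaceUsed"), "Count");
            long total = (Long) mbs.getAttribute(new ObjectName(prefix + "TotalDiskSpaceUsed"), "Count");
            System.out.printf("live=%d total=%d divergence=%d%n", live, total, total - live);
        }
        finally
        {
            connector.close();
        }
    }
}
{code}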

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Fernandez updated CASSANDRA-11209:
---
Attachment: (was: screenshot-2.png)

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Jose Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Fernandez updated CASSANDRA-11209:
---
Attachment: screenshot-2.png

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly this only happens when the reference count for the 
> SStable reaches 0. So this is leading us to believe that something during 
> repairs and/or compactions is causing a reference leak to the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1

2016-02-23 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11203:

Priority: Trivial  (was: Major)

> Improve nothing to repair message when RF=1
> ---
>
> Key: CASSANDRA-11203
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11203
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: debian jesse up to date content
>Reporter: Jason Kania
>Priority: Trivial
>  Labels: lhf
>
> When nodetool repair is run, it indicates that no repair is needed on some 
> keyspaces but on others it attempts repair. However, when run multiple times, 
> the output seems to indicate that the same triggering conditions still 
> persists that indicate a problem. Alternatively, the output could indicate 
> that the underlying condition has not been resolved.
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:12,144] Repair completed successfully
> [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:33,324] Repair completed successfully
> [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second
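
A purely hypothetical sketch of the kind of message improvement being asked for 
(not actual Cassandra code): when the replication factor leaves nothing to 
compare against, say so explicitly instead of only printing "Nothing to repair".

{code}
// Hypothetical illustration; names and structure are invented for this sketch.
final class RepairMessages
{
    static String nothingToRepair(String keyspace, int replicationFactor)
    {
        if (replicationFactor < 2)
            return String.format("Nothing to repair for keyspace '%s': replication factor is %d, "
                                 + "so there are no other replicas to compare against",
                                 keyspace, replicationFactor);
        return String.format("Nothing to repair for keyspace '%s'", keyspace);
    }
}
{code}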



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10972) File based hints don't implement backpressure and can OOM

2016-02-23 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159050#comment-15159050
 ] 

Aleksey Yeschenko commented on CASSANDRA-10972:
---

LGTM from me as well.

Committed as 
[037d24efdf83bd2736556f9880c5e1f6be48fa77|https://github.com/apache/cassandra/commit/037d24efdf83bd2736556f9880c5e1f6be48fa77]
 to 3.0 and merged with trunk, thanks.

> File based hints don't implement backpressure and can OOM
> -
>
> Key: CASSANDRA-10972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10972
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> This is something I reproduced in practice. I have what I think is a 
> reasonable implementation of backpressure, but still need to put together a 
> unit test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance

2016-02-23 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11053:

Reviewer: Adam Holmberg  (was: Paulo Motta)

> COPY FROM on large datasets: fix progress report and debug performance
> --
>
> Key: CASSANDRA-11053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> Running COPY FROM on a large dataset (20G divided into 20M records) revealed 
> two issues:
> * The progress report is incorrect: it is very slow until almost the end of 
> the test, at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx. 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, and is 
> therefore roughly 1.5 times faster. 
> See the attached file _copy_from_large_benchmark.txt_ for the benchmark details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1

2016-02-23 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11203:

Summary: Improve nothing to repair message when RF=1  (was: nodetool repair 
not performing repair or being incorrectly triggered in 3.0.3)

> Improve nothing to repair message when RF=1
> ---
>
> Key: CASSANDRA-11203
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11203
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: debian jesse up to date content
>Reporter: Jason Kania
>  Labels: lhf
>
> When nodetool repair is run, it indicates that no repair is needed on some 
> keyspaces but on others it attempts repair. However, when run multiple times, 
> the output seems to indicate that the same triggering conditions still 
> persists that indicate a problem. Alternatively, the output could indicate 
> that the underlying condition has not been resolved.
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:12,144] Repair completed successfully
> [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:33,324] Repair completed successfully
> [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11203) Improve nothing to repair message when RF=1

2016-02-23 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11203:

Labels: lhf  (was: )

> Improve nothing to repair message when RF=1
> ---
>
> Key: CASSANDRA-11203
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11203
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: debian jesse up to date content
>Reporter: Jason Kania
>Priority: Trivial
>  Labels: lhf
>
> When nodetool repair is run, it indicates that no repair is needed on some 
> keyspaces but on others it attempts repair. However, when run multiple times, 
> the output seems to indicate that the same triggering conditions still 
> persists that indicate a problem. Alternatively, the output could indicate 
> that the underlying condition has not been resolved.
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:12,144] Repair completed successfully
> [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:33,324] Repair completed successfully
> [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/3] cassandra git commit: Introduce backpressure for hints

2016-02-23 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 fe37e0644 -> 037d24efd
  refs/heads/trunk 4b27287cd -> fc9c6faa2


Introduce backpressure for hints

patch by Ariel Weisberg; reviewed by Benedict Elliott Smith for
CASSANDRA-10972


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/037d24ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/037d24ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/037d24ef

Branch: refs/heads/cassandra-3.0
Commit: 037d24efdf83bd2736556f9880c5e1f6be48fa77
Parents: fe37e06
Author: Ariel Weisberg 
Authored: Mon Dec 28 16:32:05 2015 -0500
Committer: Aleksey Yeschenko 
Committed: Tue Feb 23 15:28:41 2016 +

--
 CHANGES.txt |  1 +
 build.xml   | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java| 75 
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index da91594..a675016 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.4
+ * Introduce backpressure for hints (CASSANDRA-10972)
  * Fix ClusteringPrefix not being able to read tombstone range boundaries 
(CASSANDRA-11158)
  * Prevent logging in sandboxed state (CASSANDRA-11033)
  * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/build.xml
--
diff --git a/build.xml b/build.xml
index d27b77a..6ef99fd 100644
--- a/build.xml
+++ b/build.xml
@@ -111,6 +111,8 @@
 
 
 
+
+
 
 
 
@@ -382,6 +384,11 @@
   
   
 
+  
+  
+  
+
+
   
   
 
@@ -479,7 +486,10 @@
 artifactId="cassandra-parent"
 version="${version}"/>
 
-
+
+
+
+
   
 
   

 
+  
   
   
   
@@ -1701,6 +1712,7 @@
   
   
   
+  
 ]]>
   


http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/src/java/org/apache/cassandra/hints/HintsBufferPool.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsBufferPool.java 
b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
index 83b155a..25f9bc1 100644
--- a/src/java/org/apache/cassandra/hints/HintsBufferPool.java
+++ b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
@@ -17,10 +17,11 @@
  */
 package org.apache.cassandra.hints;
 
-import java.util.Queue;
 import java.util.UUID;
-import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
 
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.net.MessagingService;
 
 /**
@@ -34,15 +35,16 @@ final class HintsBufferPool
 void flush(HintsBuffer buffer, HintsBufferPool pool);
 }
 
+static final int MAX_ALLOCATED_BUFFERS = 
Integer.getInteger(Config.PROPERTY_PREFIX + "MAX_HINT_BUFFERS", 3);
 private volatile HintsBuffer currentBuffer;
-private final Queue reserveBuffers;
+private final BlockingQueue reserveBuffers;
 private final int bufferSize;
 private final FlushCallback flushCallback;
+private int allocatedBuffers = 0;
 
 HintsBufferPool(int bufferSize, FlushCallback flushCallback)
 {
-reserveBuffers = new ConcurrentLinkedQueue<>();
-
+reserveBuffers = new LinkedBlockingQueue<>();
 this.bufferSize = bufferSize;
 this.flushCallback = flushCallback;
 }
@@ -78,13 +80,10 @@ final class HintsBufferPool
 }
 }
 
-boolean offer(HintsBuffer buffer)
+void offer(HintsBuffer buffer)
 {
-if (!reserveBuffers.isEmpty())
-return false;
-
-reserveBuffers.offer(buffer);
-return true;
+if (!reserveBuffers.offer(buffer))
+throw new RuntimeException("Failed to store buffer");
 }
 
 // A wrapper to ensure a non-null currentBuffer value on the first call.
@@ -108,6 +107,18 @@ final class HintsBufferPool
 return false;
 
 HintsBuffer buffer = reserveBuffers.poll();
+if (buffer == null && allocatedBuffers >= MAX_ALLOCATED_BUFFERS)
+{
+try
+

[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2016-02-23 Thread aleksey
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fc9c6faa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fc9c6faa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fc9c6faa

Branch: refs/heads/trunk
Commit: fc9c6faa23f5662ce8d7caedb5d7f1ac3d1fcea6
Parents: 4b27287 037d24e
Author: Aleksey Yeschenko 
Authored: Tue Feb 23 15:30:35 2016 +
Committer: Aleksey Yeschenko 
Committed: Tue Feb 23 15:30:35 2016 +

--
 CHANGES.txt |  1 +
 build.xml   | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java| 75 
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc9c6faa/CHANGES.txt
--
diff --cc CHANGES.txt
index 01f0e84,a675016..dd67598
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,32 -1,5 +1,33 @@@
 -3.0.4
 +3.4
 + * fix OnDiskIndexTest to properly treat empty ranges (CASSANDRA-11205)
 + * fix TrackerTest to handle new notifications (CASSANDRA-11178)
 + * add SASI validation for partitioner and complex columns (CASSANDRA-11169)
 + * Add caching of encrypted credentials in PasswordAuthenticator 
(CASSANDRA-7715)
 + * fix SASI memtable switching on flush (CASSANDRA-11159)
 + * Remove duplicate offline compaction tracking (CASSANDRA-11148)
 + * fix EQ semantics of analyzed SASI indexes (CASSANDRA-11130)
 + * Support long name output for nodetool commands (CASSANDRA-7950)
 + * Encrypted hints (CASSANDRA-11040)
 + * SASI index options validation (CASSANDRA-11136)
 + * Optimize disk seek using min/max column name meta data when the LIMIT 
clause is used
 +   (CASSANDRA-8180)
 + * Add LIKE support to CQL3 (CASSANDRA-11067)
 + * Generic Java UDF types (CASSANDRA-10819)
 + * cqlsh: Include sub-second precision in timestamps by default 
(CASSANDRA-10428)
 + * Set javac encoding to utf-8 (CASSANDRA-11077)
 + * Integrate SASI index into Cassandra (CASSANDRA-10661)
 + * Add --skip-flush option to nodetool snapshot
 + * Skip values for non-queried columns (CASSANDRA-10657)
 + * Add support for secondary indexes on static columns (CASSANDRA-8103)
 + * CommitLogUpgradeTestMaker creates broken commit logs (CASSANDRA-11051)
 + * Add metric for number of dropped mutations (CASSANDRA-10866)
 + * Simplify row cache invalidation code (CASSANDRA-10396)
 + * Support user-defined compaction through nodetool (CASSANDRA-10660)
 + * Stripe view locks by key and table ID to reduce contention 
(CASSANDRA-10981)
 + * Add nodetool gettimeout and settimeout commands (CASSANDRA-10953)
 + * Add 3.0 metadata to sstablemetadata output (CASSANDRA-10838)
 +Merged from 3.0:
+  * Introduce backpressure for hints (CASSANDRA-10972)
   * Fix ClusteringPrefix not being able to read tombstone range boundaries 
(CASSANDRA-11158)
   * Prevent logging in sandboxed state (CASSANDRA-11033)
   * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc9c6faa/build.xml
--



[2/3] cassandra git commit: Introduce backpressure for hints

2016-02-23 Thread aleksey
Introduce backpressure for hints

patch by Ariel Weisberg; reviewed by Benedict Elliott Smith for
CASSANDRA-10972


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/037d24ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/037d24ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/037d24ef

Branch: refs/heads/trunk
Commit: 037d24efdf83bd2736556f9880c5e1f6be48fa77
Parents: fe37e06
Author: Ariel Weisberg 
Authored: Mon Dec 28 16:32:05 2015 -0500
Committer: Aleksey Yeschenko 
Committed: Tue Feb 23 15:28:41 2016 +

--
 CHANGES.txt |  1 +
 build.xml   | 14 +++-
 .../apache/cassandra/hints/HintsBufferPool.java | 34 ++---
 .../cassandra/hints/HintsWriteExecutor.java |  3 +-
 .../cassandra/hints/HintsBufferPoolTest.java| 75 
 .../apache/cassandra/hints/HintsBufferTest.java |  2 +-
 6 files changed, 114 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index da91594..a675016 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.4
+ * Introduce backpressure for hints (CASSANDRA-10972)
  * Fix ClusteringPrefix not being able to read tombstone range boundaries 
(CASSANDRA-11158)
  * Prevent logging in sandboxed state (CASSANDRA-11033)
  * Disallow drop/alter operations of UDTs used by UDAs (CASSANDRA-10721)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/build.xml
--
diff --git a/build.xml b/build.xml
index d27b77a..6ef99fd 100644
--- a/build.xml
+++ b/build.xml
@@ -111,6 +111,8 @@
 
 
 
+
+
 
 
 
@@ -382,6 +384,11 @@
   
   
 
+  
+  
+  
+
+
   
   
 
@@ -479,7 +486,10 @@
 artifactId="cassandra-parent"
 version="${version}"/>
 
-
+
+
+
+
   
 
   

 
+  
   
   
   
@@ -1701,6 +1712,7 @@
   
   
   
+  
 ]]>
   


http://git-wip-us.apache.org/repos/asf/cassandra/blob/037d24ef/src/java/org/apache/cassandra/hints/HintsBufferPool.java
--
diff --git a/src/java/org/apache/cassandra/hints/HintsBufferPool.java 
b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
index 83b155a..25f9bc1 100644
--- a/src/java/org/apache/cassandra/hints/HintsBufferPool.java
+++ b/src/java/org/apache/cassandra/hints/HintsBufferPool.java
@@ -17,10 +17,11 @@
  */
 package org.apache.cassandra.hints;
 
-import java.util.Queue;
 import java.util.UUID;
-import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
 
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.net.MessagingService;
 
 /**
@@ -34,15 +35,16 @@ final class HintsBufferPool
 void flush(HintsBuffer buffer, HintsBufferPool pool);
 }
 
+static final int MAX_ALLOCATED_BUFFERS = 
Integer.getInteger(Config.PROPERTY_PREFIX + "MAX_HINT_BUFFERS", 3);
 private volatile HintsBuffer currentBuffer;
-private final Queue reserveBuffers;
+private final BlockingQueue reserveBuffers;
 private final int bufferSize;
 private final FlushCallback flushCallback;
+private int allocatedBuffers = 0;
 
 HintsBufferPool(int bufferSize, FlushCallback flushCallback)
 {
-reserveBuffers = new ConcurrentLinkedQueue<>();
-
+reserveBuffers = new LinkedBlockingQueue<>();
 this.bufferSize = bufferSize;
 this.flushCallback = flushCallback;
 }
@@ -78,13 +80,10 @@ final class HintsBufferPool
 }
 }
 
-boolean offer(HintsBuffer buffer)
+void offer(HintsBuffer buffer)
 {
-if (!reserveBuffers.isEmpty())
-return false;
-
-reserveBuffers.offer(buffer);
-return true;
+if (!reserveBuffers.offer(buffer))
+throw new RuntimeException("Failed to store buffer");
 }
 
 // A wrapper to ensure a non-null currentBuffer value on the first call.
@@ -108,6 +107,18 @@ final class HintsBufferPool
 return false;
 
 HintsBuffer buffer = reserveBuffers.poll();
+if (buffer == null && allocatedBuffers >= MAX_ALLOCATED_BUFFERS)
+{
+try
+{
+//This BlockingQueue.take is a target for byteman in 
HintsBufferPoolTest
+buffer = reserveBuffers.take();
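
To make the intent of the (truncated) hunk above easier to follow, here is a 
small self-contained sketch of the same backpressure pattern: cap how many 
buffers may ever be allocated and, once the cap is reached, block callers on 
the reserve queue until a buffer is returned instead of allocating without 
bound. This is an illustration of the idea only, not the committed 
HintsBufferPool code.

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative bounded buffer pool: at most maxBuffers are ever created;
// once the cap is reached, borrowers block until a buffer is given back.
final class BoundedBufferPool
{
    private final BlockingQueue<byte[]> reserve = new LinkedBlockingQueue<>();
    private final int maxBuffers;
    private final int bufferSize;
    private int allocated = 0; // guarded by synchronized (this)

    BoundedBufferPool(int maxBuffers, int bufferSize)
    {
        this.maxBuffers = maxBuffers;
        this.bufferSize = bufferSize;
    }

    byte[] borrow() throws InterruptedException
    {
        byte[] buffer = reserve.poll();
        if (buffer != null)
            return buffer;

        synchronized (this)
        {
            if (allocated < maxBuffers)
            {
                allocated++;
                return new byte[bufferSize]; // still under the cap: allocate a fresh buffer
            }
        }
        // Cap reached: apply backpressure by blocking until a buffer is returned.
        return reserve.take();
    }

    void giveBack(byte[] buffer)
    {
        if (!reserve.offer(buffer))
            throw new RuntimeException("Failed to store buffer"); // unbounded queue, should not happen
    }
}
{code}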

[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible

2016-02-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159044#comment-15159044
 ] 

Paulo Motta commented on CASSANDRA-8110:


What operation are you trying to execute during upgrade? Repair, bootstrap, 
rebuild?

Please note that these operations are not supported during upgrades, so you 
must first complete the upgrade before running any of them.

> Make streaming backwards compatible
> ---
>
> Key: CASSANDRA-8110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8110
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Marcus Eriksson
>  Labels: gsoc2016, mentor
> Fix For: 3.x
>
>
> To be able to seamlessly upgrade clusters we need to make it possible to 
> stream files between nodes with different StreamMessage.CURRENT_VERSION



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[Cassandra Wiki] Update of "Committers" by AlekseyYeschenko

2016-02-23 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "Committers" page has been changed by AlekseyYeschenko:
https://wiki.apache.org/cassandra/Committers?action=diff=56=57

  ||Marcus Eriksson ||Apr 2013 ||Datastax || ||
  ||Mikhail Stepura ||Jan 2014 ||Apple || ||
  ||Tyler Hobbs ||Mar 2014 ||Datastax || ||
- ||Benedict Elliott Smith ||May 2014 ||Datastax || ||
+ ||Benedict Elliott Smith ||May 2014 ||Vast || ||
  ||Josh Mckenzie ||Jul 2014 ||Datastax || ||
  ||Robert Stupp ||Jan 2015 ||Datastax || ||
  ||Sam Tunnicliffe ||May 2015 ||Datastax || ||


[jira] [Commented] (CASSANDRA-11213) Improve ClusteringPrefix hierarchy

2016-02-23 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159034#comment-15159034
 ] 

Branimir Lambov commented on CASSANDRA-11213:
-

I did it slightly differently: range tombstones do make an explicit distinction 
between bound and boundary, so it isn't that valuable for them to have a shared 
bound class to use; it made more sense for me to isolate the bound concept from 
its uses and avoid conversion between the slice ends and the corresponding 
range markers:
- Moved the bound concept outside of {{Slice}} and {{RangeTombstone}}. What 
used to be a {{Slice.Bound}} is now a {{ClusteringBound}}.
- Made a {{ClusteringBoundary}} type and changed the markers to use 
bound/boundary directly.
- Added a shared {{AbstractClusteringBound}} ancestor for the few bits of code 
that need to be able to work with both.
- Had to name the types {{ClusteringX}} to avoid a naming conflict between 
{{ClusteringBound}} and the cql3 statements' {{Bound}}.

|[code|https://github.com/blambov/cassandra/tree/11213]|[utest|http://cassci.datastax.com/job/blambov-11213-testall/]|[dtest|http://cassci.datastax.com/job/blambov-11213-dtest/]|
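
As a rough structural sketch of the hierarchy described above (names taken from 
the comment, everything else stripped away; the real classes obviously carry 
the actual clustering machinery, so treat this purely as an illustration):

{code}
// Simplified illustration of the described hierarchy; not the real Cassandra classes.
abstract class AbstractClusteringBound
{
    final Object[] values;   // clustering prefix values
    final boolean inclusive; // stand-in for the real bound kinds

    AbstractClusteringBound(Object[] values, boolean inclusive)
    {
        this.values = values;
        this.inclusive = inclusive;
    }
}

// Used by slices when selecting data.
final class ClusteringBound extends AbstractClusteringBound
{
    ClusteringBound(Object[] values, boolean inclusive)
    {
        super(values, inclusive);
    }
}

// Used by range tombstone markers that close one range and open the next.
final class ClusteringBoundary extends AbstractClusteringBound
{
    ClusteringBoundary(Object[] values, boolean inclusive)
    {
        super(values, inclusive);
    }
}
{code}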

> Improve ClusteringPrefix hierarchy
> --
>
> Key: CASSANDRA-11213
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11213
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Branimir Lambov
> Fix For: 3.x
>
>
> As noted by [~blambov] on CASSANDRA-11158, having {{RangeTombstone.Bound}} be 
> a subclass of {{Slice.Bound}} is somewhat inconsistent. I'd argue in fact 
> that conceptually neither should really be a subclass of the other as none is 
> a special case of the other and they are use in strictly non-overlapping 
> places ({{Slice.Bound}} is for slices which are used for selecting data while 
> {{RangeTombstone.Bound}} is for range tombstone which actually represent some 
> type of data).
> We should figure out a cleaner hierarchy of this, which probably mean 
> slightly changing the {{ClusteringPrefix}} hierarchy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11208) Paging is broken for IN queries

2016-02-23 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11208:
---
Attachment: 11083-2.2.txt

{{AbstractQueryPager}} was not taking into account the fact that for tables 
with no clustering columns there is only one row per partition. When the next 
page was fetched, the pager believed that it still had to return some rows 
from the partition.

||utests||dtests||
|[3.0|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-3.0-testall/1/]|[3.0|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-3.0-dtest/1/]|
|[trunk|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-trunk-testall/1/]|[trunk|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-11208-trunk-dtest/1/]|

The DTest PR is [here|https://github.com/riptano/cassandra-dtest/pull/820]

> Paging is broken for IN queries
> ---
>
> Key: CASSANDRA-11208
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11208
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
> Attachments: 11083-2.2.txt
>
>
> If the number of selected rows is greater than the page size, C* will return 
> some duplicates.
> The problem can be reproduced with the java driver using the following code:
> {code}
>session = cluster.connect();
>session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = 
> {'class' : 'SimpleStrategy', 'replication_factor' : '1'}");
>session.execute("USE test");
>session.execute("DROP TABLE IF EXISTS test");
>session.execute("CREATE TABLE test (rc int, pk int, PRIMARY KEY 
> (pk))");
>for (int i = 0; i < 5; i++)
>session.execute("INSERT INTO test (pk, rc) VALUES (?, ?);", i, i);
>ResultSet rs = session.execute(session.newSimpleStatement("SELECT * 
> FROM test WHERE  pk IN (1, 2, 3)").setFetchSize(2));
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11124) Change default cqlsh encoding to utf-8

2016-02-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159011#comment-15159011
 ] 

Paulo Motta commented on CASSANDRA-11124:
-

Thanks!

> Change default cqlsh encoding to utf-8
> --
>
> Key: CASSANDRA-11124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11124
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Trivial
>  Labels: cqlsh
>
> Strange things can happen when utf-8 is not the default cqlsh encoding (see 
> CASSANDRA-11030). This ticket proposes changing the default cqlsh encoding to 
> utf-8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159003#comment-15159003
 ] 

Marcus Eriksson commented on CASSANDRA-11215:
-

I'll try to write up a dtest for this, unless you want to do that [~molsson]?

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-11215:

Reviewer: Marcus Eriksson

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11217) Only log yaml config once, at startup

2016-02-23 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158993#comment-15158993
 ] 

Jason Brown commented on CASSANDRA-11217:
-

||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_2.2]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:config_logging_3.x]
|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_2.2-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.x-testall/]
|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-config_logging_3.x-dtest/]

> Only log yaml config once, at startup
> -
>
> Key: CASSANDRA-11217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration, Core
>Reporter: Jason Brown
>Assignee: Jason Brown
>Priority: Minor
>
> CASSANDRA-6456 introduced a feature where the yaml is dumped in the log. At 
> startup this is a nice feature, but I see that it’s actually triggered every 
> time the node handshakes with another node, fails to connect, and that node 
> happens to be a seed ([see 
> here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L435]).
> Calling {{DD.getseeds()}} calls the {{SeedProvider}}, and if you happen to 
> use {{SimpleSeedProvider}} it will reload the yaml config and once again 
> dump it out to the log.
> It's debatable whether {{DD.getseeds()}} should trigger a reload (which I added 
> in CASSANDRA-5459) or whether reloading the seeds should be a separate method 
> (it probably should), but we shouldn't keep logging the yaml config on every 
> connection failure to a seed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11217) Only log yaml config once, at startup

2016-02-23 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-11217:
---

 Summary: Only log yaml config once, at startup
 Key: CASSANDRA-11217
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11217
 Project: Cassandra
  Issue Type: Bug
  Components: Configuration, Core
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor


CASSANDRA-6456 introduced a feature where the yaml is dumped in the log. At 
startup this is a nice feature, but I see that it’s actually triggered every 
time the node handshakes with another node, fails to connect, and that node 
happens to be a seed ([see 
here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L435]).
Calling {{DD.getseeds()}} calls the {{SeedProvider}}, and if you happen to use 
{{SimpleSeedProvider}} it will reload the yaml config and once again dump it 
out to the log.

It's debatable whether {{DD.getseeds()}} should trigger a reload (which I added in 
CASSANDRA-5459) or whether reloading the seeds should be a separate method (it 
probably should), but we shouldn't keep logging the yaml config on every 
connection failure to a seed.
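
A minimal sketch of the "log once" idea under discussion (hypothetical helper, 
not the actual patch):

{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: guard the yaml dump so repeated config reloads
// (e.g. SimpleSeedProvider re-reading the file on a failed seed handshake)
// only log the configuration the first time.
final class ConfigLogGuard
{
    private static final AtomicBoolean LOGGED = new AtomicBoolean(false);

    static void maybeLogConfig(String yamlDump)
    {
        // compareAndSet ensures only the first caller logs, even under concurrency.
        if (LOGGED.compareAndSet(false, true))
            System.out.println("Node configuration: " + yamlDump);
    }
}
{code}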




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158981#comment-15158981
 ] 

Marcus Olsson edited comment on CASSANDRA-11215 at 2/23/16 2:50 PM:


Patch for 2.2 is available 
[here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab].

I think there will be some merge conflicts in 3.0/3.x. Should I provide separate 
patch sets for them directly, or wait for the review of the 2.2 version first?

Edit: To make it clear what the fix is: the sstableCandidates are put in a 
try-with-resources block to make sure that they are released. I felt that this 
clarification might be needed since the patch also moves the SSTable 
referencing code to a separate method to reduce the complexity of the 
doValidationCompaction method.
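
For illustration, a minimal sketch of the pattern described above (hypothetical 
types standing in for the acquired sstable references; not the actual patch):

{code}
// Hypothetical sketch of the try-with-resources pattern: the acquired
// references are released even if setting up the validation throws,
// e.g. when a parallel repair already holds the same sstables.
final class ValidationSketch
{
    // Stand-in for the acquired sstable references.
    interface AcquiredRefs extends AutoCloseable
    {
        @Override
        void close(); // releases all held references
    }

    static void validate(AcquiredRefs acquired)
    {
        try (AcquiredRefs sstableCandidates = acquired)
        {
            // ... build and run the validation from sstableCandidates ...
            // If anything in here throws, close() still runs and the
            // references are released instead of being leaked.
        }
    }
}
{code}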


was (Author: molsson):
Patch for 2.2 is available 
[here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab].

I think there will be some merge conflicts in 3.0/3.x. Should I provide separate 
patch sets for them directly, or wait for the review of the 2.2 version first?

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Add extension points in storage and streaming classes

2016-02-23 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 030c775ee -> 4b27287cd


Add extension points in storage and streaming classes

Patch by Blake Eggleston; reviewed by marcuse for CASSANDRA-11173


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4b27287c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4b27287c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4b27287c

Branch: refs/heads/trunk
Commit: 4b27287cd93088148d85d1a6ec9df34601f0c741
Parents: 030c775
Author: Blake Eggleston 
Authored: Tue Feb 16 15:06:00 2016 -0800
Committer: Marcus Eriksson 
Committed: Tue Feb 23 15:45:07 2016 +0100

--
 .../apache/cassandra/db/ColumnFamilyStore.java  |  1 +
 .../db/SinglePartitionReadCommand.java  | 28 ---
 .../org/apache/cassandra/db/StorageHook.java| 86 
 .../apache/cassandra/streaming/StreamHook.java  | 57 +
 .../cassandra/streaming/StreamReader.java   |  4 +-
 .../cassandra/streaming/StreamSession.java  |  3 +-
 .../cassandra/streaming/StreamTransferTask.java |  3 +-
 7 files changed, 165 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b27287c/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 9b113c4..fa95063 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1217,6 +1217,7 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
 DecoratedKey key = update.partitionKey();
 invalidateCachedPartition(key);
 metric.samplers.get(Sampler.WRITES).addSample(key.getKey(), 
key.hashCode(), 1);
+StorageHook.instance.reportWrite(metadata.cfId, update);
 metric.writeLatency.addNano(System.nanoTime() - start);
 if(timeDelta < Long.MAX_VALUE)
 metric.colUpdateTimeDeltaHistogram.update(timeDelta);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b27287c/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
--
diff --git a/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java 
b/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
index 1a0b400..9712497 100644
--- a/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
+++ b/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
@@ -547,7 +547,7 @@ public class SinglePartitionReadCommand extends ReadCommand
 
 @SuppressWarnings("resource") // 'iter' is added to iterators 
which is closed on exception,
   // or through the closing of the 
final merged iterator
-UnfilteredRowIteratorWithLowerBound iter = 
makeIterator(sstable, true);
+UnfilteredRowIteratorWithLowerBound iter = makeIterator(cfs, 
sstable, true);
 if (!sstable.isRepaired())
 oldestUnrepairedTombstone = 
Math.min(oldestUnrepairedTombstone, sstable.getMinLocalDeletionTime());
 
@@ -567,7 +567,7 @@ public class SinglePartitionReadCommand extends ReadCommand
 
 @SuppressWarnings("resource") // 'iter' is added to 
iterators which is close on exception,
   // or through the closing of 
the final merged iterator
-UnfilteredRowIteratorWithLowerBound iter = 
makeIterator(sstable, false);
+UnfilteredRowIteratorWithLowerBound iter = 
makeIterator(cfs, sstable, false);
 if (!sstable.isRepaired())
 oldestUnrepairedTombstone = 
Math.min(oldestUnrepairedTombstone, sstable.getMinLocalDeletionTime());
 
@@ -582,6 +582,7 @@ public class SinglePartitionReadCommand extends ReadCommand
 if (iterators.isEmpty())
 return EmptyIterators.unfilteredRow(cfs.metadata, 
partitionKey(), filter.isReversed());
 
+StorageHook.instance.reportRead(cfs.metadata.cfId, partitionKey());
 return withStateTracking(withSSTablesIterated(iterators, 
cfs.metric));
 }
 catch (RuntimeException | Error e)
@@ -609,15 +610,17 @@ public class SinglePartitionReadCommand extends 
ReadCommand
 return clusteringIndexFilter().shouldInclude(sstable);
 }
 
-private UnfilteredRowIteratorWithLowerBound makeIterator(final 
SSTableReader sstable, boolean applyThriftTransformation)
+private UnfilteredRowIteratorWithLowerBound 

[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158981#comment-15158981
 ] 

Marcus Olsson commented on CASSANDRA-11215:
---

Patch for 2.2 is available 
[here|https://github.com/emolsson/cassandra/commit/8b1b4317c43db648d54ce2e339a525e3fb324cab].

I think there will be some merge conflicts in 3.0/3.x. Should I provide separate 
patch sets for them directly, or wait for the review of the 2.2 version first?

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11216) Range.compareTo() violates the contract of Comparable

2016-02-23 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-11216:
---

 Summary: Range.compareTo() violates the contract of Comparable
 Key: CASSANDRA-11216
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11216
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor


When running some quick-check style tests, I discovered that if both of the 
ranges being compared wrap around, then the result of the comparison depends on 
which range is evaluated first. For example, two ranges:

A = { -1, 2 }
B = { -2, 1 }

and then compare them together:
A.compareTo(B) == -1
B.compareTo(A) == -1

This is because the logic of the existing {{Range.compareTo()}} simply checks 
whether the {{this}} range wraps around and, if so, returns -1. This bug does not 
appear to affect c* until 3.0, and then only in one place 
({{MerkleTrees.TokenRangeComparator#compare}}) that I could identify.
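
A minimal sketch of a comparison that stays antisymmetric for wrapping ranges 
(hypothetical {{WrappingRange}} type; not the actual Cassandra {{Range}} code):

{code}
// Hypothetical illustration: instead of returning -1 whenever 'this' wraps,
// fall through to comparing the bounds, so that
// sign(a.compareTo(b)) == -sign(b.compareTo(a)) as Comparable requires.
final class WrappingRange implements Comparable<WrappingRange>
{
    final long left;
    final long right;

    WrappingRange(long left, long right)
    {
        this.left = left;
        this.right = right;
    }

    boolean wraps()
    {
        return left >= right;
    }

    @Override
    public int compareTo(WrappingRange other)
    {
        // Exactly one side wraps: the wrapping range sorts first.
        if (this.wraps() != other.wraps())
            return this.wraps() ? -1 : 1;

        // Neither or both wrap: compare bounds so the result is symmetric.
        int cmp = Long.compare(this.right, other.right);
        return cmp != 0 ? cmp : Long.compare(this.left, other.left);
    }

    public static void main(String[] args)
    {
        WrappingRange a = new WrappingRange(5, 2); // wraps
        WrappingRange b = new WrappingRange(4, 1); // wraps
        System.out.println(a.compareTo(b)); //  1
        System.out.println(b.compareTo(a)); // -1, no longer -1 for both
    }
}
{code}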



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11216) Range.compareTo() violates the contract of Comparable

2016-02-23 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158970#comment-15158970
 ] 

Jason Brown commented on CASSANDRA-11216:
-

|| 2.2 || 3.0 || trunk ||
|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_2.2]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...jasobrown:range_compareTo_3.x]
|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_2.2-testall]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.0-testall]|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.x-testall]
|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_2.2-dtest]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.0-dtest]|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-range_compareTo_3.x-dtest]

> Range.compareTo() violates the contract of Comparable
> -
>
> Key: CASSANDRA-11216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11216
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jason Brown
>Assignee: Jason Brown
>Priority: Minor
>
> When running some quick-check style tests, I discovered that if both of the 
> ranges being compared wrap around, then the result of the comparison depends 
> on which range is evaluated first. For example, two ranges:
> A = { -1, 2 }
> B = { -2, 1 }
> and then compare them together:
> A.compareTo(B) == -1
> B.compareTo(A) == -1
> This is because the logic of the existing {{Range.compareTo()}} simply checks 
> whether the {{this}} range wraps around and, if so, returns -1. This bug does 
> not appear to affect c* until 3.0, and then only in one place 
> ({{MerkleTrees.TokenRangeComparator#compare}}) that I could identify.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158968#comment-15158968
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

bq. But in that case the pause/stop feature should be implemented as early as 
possible to avoid having an upgrade scenario that requires the user to upgrade 
to the version that introduces the pause feature before upgrading to the 
latest. Another way would be to have the "system interrupts" feature in place 
early, so that the repairs would be paused during an upgrade.

Sounds good! We could ask the user to pause, but I think doing that 
automatically via "system interrupts" is better. It just occurred to me that 
both "the pause" and "system interrupts" will prevent new repairs from starting, 
but what about already running repairs? We will probably want to interrupt 
already running repairs as well in some situations. For this reason 
CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of 
this ticket).

bq. I think the timeout might be good to have to prevent a hang from stopping 
the entire repair process. But I think it would only work if the repair would 
only hang occasionally, otherwise the same repair would be retried until it is 
marked as a "fail". 

+1. Then I think we should either have a timeout, or add the ability to 
cancel/interrupt a running scheduled repair in the initial version, to avoid 
hanging repairs rendering the automatic repair scheduling useless.

bq. Another option is to have a "slow repair"-detector that would log a warning 
if a repair session is taking too long time, to avoid aborting it if it's 
actually repairing and leaving it up to the user to handle it. Either way I'd 
say it's out of the scope of the initial version.

bq. We might also want to be able to detect if it would be impossible to repair 
the whole cluster within gc grace and report it to the user. This could happen 
for multiple reasons like too many tables, too many nodes, too few parallel 
repairs or simply overload. I guess it would be hard to make accurate 
predictions with all of these variables so it might be good enough to check 
through the history of the repairs, do an estimation of the time and compare it 
to gc grace? I think this is something out of scope for the first version, but 
I thought I'd just mention it here to remember it.

Nice! These could probably live in a separate repair metrics and alert module 
in the future, allowing users to track statistics and issue alerts/warnings 
based on history, and allowing the scheduler to perform more advanced adaptive 
scheduling. Some metrics to track:
* Repair time per session
** Break up of time per phase (validation, sync, anticompaction, etc)
* Repair time per node
* Validation mismatch %
* Fail count

bq. Should we maybe compile a list of "features that should be in the initial 
version" and also a "improvements" list for future work to make the scope clear?

Sounds good! Below is a suggested list of subtasks:

* Basic functionality
** Resource locking API and implementation
** Maintenance scheduling API and metadata
** Basic scheduling support
** Polling and monitoring module
** Pausing and aborting support 
** Rejection policies (includes system interrupts and maintenance windows)
** Failure handling and retry
** Configuration support
** Frontend support (table options, management commands)

* Optional/deferred functionality
** Parallel repair session support
** Subrange repair support
** Maintenance history
** Timeout
** Metrics
** Alerts

WDYT? Feel free to update or break it up into smaller or larger subtasks, and 
then create the actual subtasks to start work on them.

> Automatic repair scheduling
> ---
>
> Key: CASSANDRA-10070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
> Fix For: 3.x
>
> Attachments: Distributed Repair Scheduling.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but this can both be hard for new users and it also requires a 
> bit of manual configuration. There are good tools out there that can be used 
> to simplify things, but wouldn't this be a good feature to have inside of 
> Cassandra? To automatically schedule and run repairs, so that when you start 
> up your cluster it basically maintains itself in terms of normal 
> anti-entropy, with the possibility for manual configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158958#comment-15158958
 ] 

Marcus Olsson commented on CASSANDRA-11215:
---

It seems that this issue is caused by the fact that we no longer accept parallel 
repairs on the same sstables: a RuntimeException is thrown when that happens, 
and the previously acquired references are never released.

I'm currently working on a patch for this.

> Reference leak with parallel repairs on the same table
> --
>
> Key: CASSANDRA-11215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table Cassandra starts to log 
> about reference leak as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader
> $InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
>  was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=100 -schema 
> replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't 
> very wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11215) Reference leak with parallel repairs on the same table

2016-02-23 Thread Marcus Olsson (JIRA)
Marcus Olsson created CASSANDRA-11215:
-

 Summary: Reference leak with parallel repairs on the same table
 Key: CASSANDRA-11215
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Olsson
Assignee: Marcus Olsson


When starting multiple repairs on the same table Cassandra starts to log about 
reference leak as:
{noformat}
ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class 
org.apache.cassandra.io.sstable.format.SSTableReader
$InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big
 was not released before the reference was garbage collected
{noformat}

Reproducible with:
{noformat}
ccm create repairtest -v 2.2.5 -n 3
ccm start
ccm stress write n=100 -schema 
replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
# And then perform two repairs concurrently with:
ccm node1 nodetool repair testrepair
{noformat}

I know that starting multiple repairs in parallel on the same table isn't very 
wise, but this shouldn't result in reference leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11209) SSTable ancestor leaked reference

2016-02-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158945#comment-15158945
 ] 

Marcus Eriksson commented on CASSANDRA-11209:
-

I've been trying to reproduce this today with no luck (using STCS)

[~jrfernandez] can you reproduce in a testing environment? Could you try with 
one of the built-in compaction strategies if so?

> SSTable ancestor leaked reference
> -
>
> Key: CASSANDRA-11209
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11209
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jose Fernandez
> Attachments: screenshot-1.png
>
>
> We're running a fork of 2.1.13 that adds the TimeWindowCompactionStrategy 
> from [~jjirsa]. We've been running 4 clusters without any issues for many 
> months until a few weeks ago we started scheduling incremental repairs every 
> 24 hours (previously we didn't run any repairs at all).
> Since then we started noticing big discrepancies in the LiveDiskSpaceUsed, 
> TotalDiskSpaceUsed, and actual size of files on disk. The numbers are brought 
> back in sync by restarting the node. We also noticed that when this bug 
> happens there are several ancestors that don't get cleaned up. A restart will 
> queue up a lot of compactions that slowly eat away the ancestors.
> I looked at the code and noticed that we only decrease the LiveTotalDiskUsed 
> metric in the SSTableDeletingTask. Since we have no errors being logged, I'm 
> assuming that for some reason this task is not getting queued up. If I 
> understand correctly, this only happens when the reference count for the 
> SSTable reaches 0. So this leads us to believe that something during 
> repairs and/or compactions is causing a reference leak on the ancestor table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11203) nodetool repair not performing repair or being incorrectly triggered in 3.0.3

2016-02-23 Thread Jason Kania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157138#comment-15157138
 ] 

Jason Kania edited comment on CASSANDRA-11203 at 2/23/16 2:07 PM:
--

Is it possible to change the output text to reflect this insider knowledge? 
Also, if no repair is required, the message could be changed to indicate this. 
The messaging right now is confusing.

e.g. "Replication factor is 1; nothing to repair for keyspace 'sensordb'"


was (Author: longtimer):
Is it possible to change the output text to reflect this insider knowledge?

ie. "Replication factor is 1; nothing to repair for keyspace 'sensordb'"

> nodetool repair not performing repair or being incorrectly triggered in 3.0.3
> -
>
> Key: CASSANDRA-11203
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11203
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: debian jesse up to date content
>Reporter: Jason Kania
>
> When nodetool repair is run, it indicates that no repair is needed on some 
> keyspaces but on others it attempts repair. However, when run multiple times, 
> the output seems to indicate that the same triggering conditions still 
> persists that indicate a problem. Alternatively, the output could indicate 
> that the underlying condition has not been resolved.
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:10,356] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:10,364] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:10,402] Starting repair command #1, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:12,144] Repair completed successfully
> [2016-02-21 23:33:12,157] Repair command #1 finished in 1 second
> root@marble:/var/lib/cassandra/data/sensordb/periodicReading# nodetool repair
> [2016-02-21 23:33:31,683] Nothing to repair for keyspace 'sensordb'
> [2016-02-21 23:33:31,689] Nothing to repair for keyspace 'system_auth'
> [2016-02-21 23:33:31,713] Starting repair command #2, repairing keyspace 
> system_traces with repair options (parallelism: parallel, primary range: 
> false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: 
> [], hosts: [], # of ranges: 256)
> [2016-02-21 23:33:33,324] Repair completed successfully
> [2016-02-21 23:33:33,334] Repair command #2 finished in 1 second



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-02-23 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158897#comment-15158897
 ] 

Joshua McKenzie commented on CASSANDRA-8844:


bq. Does this mean if RF was say, three, that three CDC commit logs would be 
written to across the cluster (compared to say, one write at the coordinator)?
That really was rather poorly phrased initially. I was originally trying to 
convey that DDL logic would be similar to RF on a KS but even that's not set in 
stone. I've pulled that from the design doc as the way it currently reads is 
redundant (anywhere data's written, as per replication strategy, will by 
definition have CDC).

As for the de-duplication, that will need to be done client-side. Whether or not 
we have a reference implementation for that now (as we will for the 
CDCConsumerDaemon) is currently up in the air.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]) propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred to a subsequent release, to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters, 
> would make Cassandra a much more versatile feeder into other systems and, 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it.
> - Cleaning up consumed logfiles would be the client daemon's responsibility.
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from 

[jira] [Commented] (CASSANDRA-10445) Cassandra-stress throws max frame size error when SSL certification is enabled

2016-02-23 Thread Vara (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158895#comment-15158895
 ] 

Vara commented on CASSANDRA-10445:
--

It worked fine this time, Cornel. Thanks for resolving the issue. 

> Cassandra-stress throws max frame size error when SSL certification is enabled
> --
>
> Key: CASSANDRA-10445
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10445
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sam Goldberg
>  Labels: stress
> Fix For: 2.1.x
>
>
> Running cassandra-stress when SSL is enabled gives the following error and 
> does not finish executing:
> {quote}
> cassandra-stress write n=100
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.thrift.transport.TTransportException: Frame size (352518912) 
> larger than max length (15728640)!
> at 
> org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:144)
> at 
> org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:110)
> at 
> org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesThrift(SettingsSchema.java:111)
> at 
> org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:59)
> at 
> org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:205)
> at org.apache.cassandra.stress.StressAction.run(StressAction.java:55)
> at org.apache.cassandra.stress.Stress.main(Stress.java:109)
> {quote}
> I was able to reproduce this issue consistently via the following steps:
> 1) Spin up 3 node cassandra cluster running 2.1.8
> 2) Perform cassandra-stress write n=100
> 3) Everything works!
> 4) Generate keystore and truststore for each node in the cluster and 
> distribute appropriately 
> 5) Modify cassandra.yaml on each node to enable SSL:
> client_encryption_options:
> enabled: true
> keystore: /
> # require_client_auth: false
> # Set trustore and truststore_password if require_client_auth is true
> truststore:  /
> truststore_password: 
> # More advanced defaults below:
> protocol: ssl
> 6) Restart each node.
> 7) Perform cassandra-stress write n=100
> 8) Get Frame Size error, cassandra-stress fails
> This may be related to CASSANDRA-9325.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

