[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-03-31 Thread Simon Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Zhou updated CASSANDRA-13397:
---
 Priority: Minor  (was: Major)
Fix Version/s: 3.0.13

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
> Fix For: 3.0.13
>
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into the repair code, I realized that we should check the return 
> value of CountDownLatch.await(). In most of the places where we don't check the 
> return value, nothing bad would happen due to other protections. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
>     CountDownLatch latch = new CountDownLatch(2);
>     latch.countDown();
>     new Thread(() -> {
>         try {
>             Thread.sleep(1200);
>         } catch (InterruptedException e) {
>             System.err.println("interrupted");
>         }
>         latch.countDown();
>         System.out.println("counted down");
>     }).start();
>     latch.await(1, TimeUnit.SECONDS);
>     if (latch.getCount() > 0) {
>         System.err.println("failed");
>     } else {
>         System.out.println("success");
>     }
> }
> {code}
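
For illustration, the kind of check the description asks for can be sketched as follows. This is a minimal sketch, not the attached patch: the class name LatchAwaitCheck and the helper awaitAll are hypothetical. It relies only on the documented behavior that CountDownLatch.await(timeout, unit) returns false when the timeout elapses before the count reaches zero.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class for illustration; not part of Cassandra.
class LatchAwaitCheck {
    /**
     * Waits for the latch and reports whether the count actually reached zero.
     * Returns false on timeout or if the waiting thread is interrupted.
     */
    static boolean awaitAll(CountDownLatch latch, long timeout, TimeUnit unit) {
        try {
            // await(timeout, unit) returns false if the timeout elapsed first
            return latch.await(timeout, unit);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(2);
        latch.countDown(); // only one of the two expected responses arrives

        if (!awaitAll(latch, 100, TimeUnit.MILLISECONDS)) {
            // Act on the return value instead of silently continuing
            System.err.println("failed: " + latch.getCount() + " response(s) still pending");
        } else {
            System.out.println("success");
        }
    }
}
```

Using the boolean result directly avoids a second read of getCount(), which could change between the timeout and the check.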



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-03-31 Thread Simon Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Zhou updated CASSANDRA-13397:
---
Attachment: CASSANDRA-13397-v1.patch

The attached patch includes the fix and a minor improvement (bail out early if 
there is any unavailable neighbor). [~krummas] could you help review this patch?

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into the repair code, I realized that we should check the return 
> value of CountDownLatch.await(). In most of the places where we don't check the 
> return value, nothing bad would happen due to other protections. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
>     CountDownLatch latch = new CountDownLatch(2);
>     latch.countDown();
>     new Thread(() -> {
>         try {
>             Thread.sleep(1200);
>         } catch (InterruptedException e) {
>             System.err.println("interrupted");
>         }
>         latch.countDown();
>         System.out.println("counted down");
>     }).start();
>     latch.await(1, TimeUnit.SECONDS);
>     if (latch.getCount() > 0) {
>         System.err.println("failed");
>     } else {
>         System.out.println("success");
>     }
> }
> {code}





[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-03-31 Thread Simon Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Zhou updated CASSANDRA-13397:
---
Status: Patch Available  (was: Open)

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into the repair code, I realized that we should check the return 
> value of CountDownLatch.await(). In most of the places where we don't check the 
> return value, nothing bad would happen due to other protections. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
>     CountDownLatch latch = new CountDownLatch(2);
>     latch.countDown();
>     new Thread(() -> {
>         try {
>             Thread.sleep(1200);
>         } catch (InterruptedException e) {
>             System.err.println("interrupted");
>         }
>         latch.countDown();
>         System.out.println("counted down");
>     }).start();
>     latch.await(1, TimeUnit.SECONDS);
>     if (latch.getCount() > 0) {
>         System.err.println("failed");
>     } else {
>         System.out.println("success");
>     }
> }
> {code}





[jira] [Updated] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13396:
---
Issue Type: Bug  (was: Wish)

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15952024#comment-15952024
 ] 

Jeff Jirsa commented on CASSANDRA-13396:


Changed back to bug, because even if the belief is that other loggers shouldn't 
be encouraged, we surely can do better than throwing a ClassCastException.

Given that log4j2 is likely faster than logback and has been suggested as far 
back as 2013 (CASSANDRA-5883), artificially forcing logback seems like a 
position that would need to be more rigorously defended. I'm +1 on this change 
conceptually (but this is not a review).


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15952014#comment-15952014
 ] 

Jeff Jirsa commented on CASSANDRA-13396:


Seems pretty reasonable to me

Certainly logback isn't the only performant slf4j logger available.


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Updated] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period

2017-03-31 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-12929:

   Resolution: Fixed
Fix Version/s: 4.0
   Status: Resolved  (was: Ready to Commit)

> dtest failure in 
> bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
> --
>
> Key: CASSANDRA-12929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12929
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Assignee: Paulo Motta
>  Labels: dtest, test-failure
> Fix For: 4.0
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/494/testReport/bootstrap_test/TestBootstrap/simple_bootstrap_test_small_keepalive_period
> {noformat}
> Error Message
> Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-YmnyEI
> dtest: DEBUG: Done setting configuration options:
> {   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
> cassandra.cluster: INFO: New Cassandra host  
> discovered
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools/decorators.py", line 46, in 
> wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 163, in 
> simple_bootstrap_test_small_keepalive_period
> assert_bootstrap_state(self, node2, 'COMPLETED')
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 297, in 
> assert_bootstrap_state
> assert_one(session, "SELECT bootstrapped FROM system.local WHERE 
> key='local'", [expected_bootstrap_state])
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 130, in 
> assert_one
> assert list_res == [expected], "Expected {} from {}, but got 
> {}".format([expected], query, list_res)
> "Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]\n >> begin 
> captured logging << \ndtest: DEBUG: cluster ccm 
> directory: /tmp/dtest-YmnyEI\ndtest: DEBUG: Done setting configuration 
> options:\n{   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 
> 'true'}\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Updated] (CASSANDRA-12929) Fix version check to enable streaming keep-alive

2017-03-31 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-12929:

Summary: Fix version check to enable streaming keep-alive  (was: dtest 
failure in 
bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period)

> Fix version check to enable streaming keep-alive
> 
>
> Key: CASSANDRA-12929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12929
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Assignee: Paulo Motta
>  Labels: dtest, test-failure
> Fix For: 4.0
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/494/testReport/bootstrap_test/TestBootstrap/simple_bootstrap_test_small_keepalive_period
> {noformat}
> Error Message
> Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-YmnyEI
> dtest: DEBUG: Done setting configuration options:
> {   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
> cassandra.cluster: INFO: New Cassandra host  
> discovered
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools/decorators.py", line 46, in 
> wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 163, in 
> simple_bootstrap_test_small_keepalive_period
> assert_bootstrap_state(self, node2, 'COMPLETED')
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 297, in 
> assert_bootstrap_state
> assert_one(session, "SELECT bootstrapped FROM system.local WHERE 
> key='local'", [expected_bootstrap_state])
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 130, in 
> assert_one
> assert list_res == [expected], "Expected {} from {}, but got 
> {}".format([expected], query, list_res)
> "Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]\n >> begin 
> captured logging << \ndtest: DEBUG: cluster ccm 
> directory: /tmp/dtest-YmnyEI\ndtest: DEBUG: Done setting configuration 
> options:\n{   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 
> 'true'}\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period

2017-03-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951796#comment-15951796
 ] 

Paulo Motta commented on CASSANDRA-12929:
-

Committed to trunk as {{add855ae177d28d02f1172fb0070ef487237ead5}}. Thanks!

> dtest failure in 
> bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
> --
>
> Key: CASSANDRA-12929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12929
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Assignee: Paulo Motta
>  Labels: dtest, test-failure
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/494/testReport/bootstrap_test/TestBootstrap/simple_bootstrap_test_small_keepalive_period
> {noformat}
> Error Message
> Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-YmnyEI
> dtest: DEBUG: Done setting configuration options:
> {   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
> cassandra.cluster: INFO: New Cassandra host  
> discovered
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools/decorators.py", line 46, in 
> wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 163, in 
> simple_bootstrap_test_small_keepalive_period
> assert_bootstrap_state(self, node2, 'COMPLETED')
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 297, in 
> assert_bootstrap_state
> assert_one(session, "SELECT bootstrapped FROM system.local WHERE 
> key='local'", [expected_bootstrap_state])
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 130, in 
> assert_one
> assert list_res == [expected], "Expected {} from {}, but got 
> {}".format([expected], query, list_res)
> "Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]\n >> begin 
> captured logging << \ndtest: DEBUG: cluster ccm 
> directory: /tmp/dtest-YmnyEI\ndtest: DEBUG: Done setting configuration 
> options:\n{   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 
> 'true'}\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\n- >> end captured logging << 
> -"
> {noformat}





cassandra git commit: Fix version check to enable streaming keep-alive

2017-03-31 Thread paulo
Repository: cassandra
Updated Branches:
  refs/heads/trunk 6f647aaa0 -> add855ae1


Fix version check to enable streaming keep-alive

Patch by Paulo Motta; Reviewed by Yuki Morishita for CASSANDRA-12929


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/add855ae
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/add855ae
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/add855ae

Branch: refs/heads/trunk
Commit: add855ae177d28d02f1172fb0070ef487237ead5
Parents: 6f647aa
Author: Paulo Motta 
Authored: Thu Mar 30 19:41:02 2017 -0300
Committer: Paulo Motta 
Committed: Fri Mar 31 20:18:58 2017 -0300

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/streaming/StreamSession.java | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/add855ae/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index d4b53d0..f64e6a0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix version check to enable streaming keep-alive (CASSANDRA-12929)
  * Make it possible to monitor an ideal consistency level separate from actual 
consistency level (CASSANDRA-13289)
  * Outbound TCP connections ignore internode authenticator (CASSANDRA-13324)
  * Upgrade junit from 4.6 to 4.12 (CASSANDRA-13360)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/add855ae/src/java/org/apache/cassandra/streaming/StreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index 7ee99db..4f9d273 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -125,7 +125,7 @@ public class StreamSession implements 
IEndpointStateChangeSubscriber
 /**
  * Version where keep-alive support was added
  */
-private static final CassandraVersion STREAM_KEEP_ALIVE = new 
CassandraVersion("3.10");
+private static final CassandraVersion STREAM_KEEP_ALIVE_VERSION = new 
CassandraVersion("3.10");
 private static final Logger logger = 
LoggerFactory.getLogger(StreamSession.class);
 private static final DebuggableScheduledThreadPoolExecutor 
keepAliveExecutor = new 
DebuggableScheduledThreadPoolExecutor("StreamKeepAliveExecutor");
 static {
@@ -241,7 +241,7 @@ public class StreamSession implements 
IEndpointStateChangeSubscriber
 private boolean isKeepAliveSupported()
 {
 CassandraVersion peerVersion = 
Gossiper.instance.getReleaseVersion(peer);
-return STREAM_KEEP_ALIVE.isSupportedBy(peerVersion);
+return peerVersion.compareTo(STREAM_KEEP_ALIVE_VERSION) >= 0;
 }
 
 /**
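
To illustrate why the form of the comparison in the diff above can matter, here is a minimal sketch. SimpleVersion is a hypothetical stand-in, not Cassandra's CassandraVersion, and its isSupportedBy is written under the assumption that such a check also requires the same major version; the commit message does not confirm that this is what the original method does.

```java
// Hypothetical simplified version type for illustration only.
class SimpleVersion implements Comparable<SimpleVersion> {
    final int major;
    final int minor;

    SimpleVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(SimpleVersion other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        return Integer.compare(minor, other.minor);
    }

    // ASSUMPTION: a "supported by" check that demands the same major version.
    boolean isSupportedBy(SimpleVersion peer) {
        return peer != null && major == peer.major && compareTo(peer) <= 0;
    }

    // Plain ordering check, mirroring the shape of the patched line.
    static boolean atLeast(SimpleVersion peer, SimpleVersion minimum) {
        return peer.compareTo(minimum) >= 0;
    }

    public static void main(String[] args) {
        SimpleVersion keepAliveVersion = new SimpleVersion(3, 10);
        SimpleVersion peer = new SimpleVersion(4, 0);
        // Under the assumed semantics, a newer major version is rejected
        System.out.println("isSupportedBy: " + keepAliveVersion.isSupportedBy(peer));
        // A plain ordering comparison accepts any peer at or above 3.10
        System.out.println("compareTo >= 0: " + atLeast(peer, keepAliveVersion));
    }
}
```

Under these assumptions, an ordering comparison treats every peer at or above the minimum version as capable, regardless of major version.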



[jira] [Commented] (CASSANDRA-13398) Skip mutating repairedAt of already repaired sstables

2017-03-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951781#comment-15951781
 ] 

Paulo Motta commented on CASSANDRA-13398:
-

Split {{no_anticompaction_of_already_repaired_test}} into two tests: one 
repairing a partial range and one repairing the full range, which reproduces the 
issue. [cassandra-dtest 
PR|https://github.com/riptano/cassandra-dtest/pull/1459].

Patch and tests below:
||2.2||3.0||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-13398]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-13398]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-13398]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:13398]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-13398-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-13398-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13398-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-13398-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-13398-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13398-dtest/lastCompletedBuild/testReport/]|

[~spo...@gmail.com] or [~krummas], mind reviewing? Thanks!

> Skip mutating repairedAt of already repaired sstables
> -
>
> Key: CASSANDRA-13398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13398
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> CASSANDRA-13153 skips anticompaction on sstables already repaired, but we are 
> not skipping mutating repairedAt when the sstable is fully contained in the 
> repaired range.





[jira] [Created] (CASSANDRA-13398) Skip mutating repairedAt of already repaired sstables

2017-03-31 Thread Paulo Motta (JIRA)
Paulo Motta created CASSANDRA-13398:
---

 Summary: Skip mutating repairedAt of already repaired sstables
 Key: CASSANDRA-13398
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13398
 Project: Cassandra
  Issue Type: Bug
Reporter: Paulo Motta
Assignee: Paulo Motta


CASSANDRA-13153 skips anticompaction on sstables already repaired, but we are 
not skipping mutating repairedAt when the sstable is fully contained in the 
repaired range.





[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked

2017-03-31 Thread Simon Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Zhou updated CASSANDRA-13397:
---
Description: 
While looking into the repair code, I realized that we should check the return 
value of CountDownLatch.await(). In most of the places where we don't check the 
return value, nothing bad would happen due to other protections. However, 
ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
{code}
public static void testLatch() throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(2);
    latch.countDown();

    new Thread(() -> {
        try {
            Thread.sleep(1200);
        } catch (InterruptedException e) {
            System.err.println("interrupted");
        }
        latch.countDown();
        System.out.println("counted down");
    }).start();

    latch.await(1, TimeUnit.SECONDS);
    if (latch.getCount() > 0) {
        System.err.println("failed");
    } else {
        System.out.println("success");
    }
}
{code}

  was:
While looking into the repair code, I realized that we should check the return 
value of CountDownLatch.await(). However, there are some places where we don't 
check it, and some of them may cause bad subsequent behavior, as in 
ActiveRepairService#prepareForRepair and StorageProxy#describeSchemaVersions. I 
haven't checked which version introduced this bug, but at least 
StorageProxy#describeSchemaVersions has had it since 2010. Code to reproduce:
{code}
public static void testLatch() throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(2);
    latch.countDown();

    new Thread(() -> {
        try {
            Thread.sleep(1200);
        } catch (InterruptedException e) {
            System.err.println("interrupted");
        }
        latch.countDown();
        System.out.println("counted down");
    }).start();

    latch.await(1, TimeUnit.SECONDS);
    if (latch.getCount() > 0) {
        System.err.println("failed");
    } else {
        System.out.println("success");
    }
}
{code}


> Return value of CountDownLatch.await() not being checked
> 
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>
> While looking into the repair code, I realized that we should check the return 
> value of CountDownLatch.await(). In most of the places where we don't check the 
> return value, nothing bad would happen due to other protections. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
>     CountDownLatch latch = new CountDownLatch(2);
>     latch.countDown();
>     new Thread(() -> {
>         try {
>             Thread.sleep(1200);
>         } catch (InterruptedException e) {
>             System.err.println("interrupted");
>         }
>         latch.countDown();
>         System.out.println("counted down");
>     }).start();
>     latch.await(1, TimeUnit.SECONDS);
>     if (latch.getCount() > 0) {
>         System.err.println("failed");
>     } else {
>         System.out.println("success");
>     }
> }
> {code}





[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-03-31 Thread Simon Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Zhou updated CASSANDRA-13397:
---
Summary: Return value of CountDownLatch.await() not being checked in Repair 
 (was: Return value of CountDownLatch.await() not being checked)

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>
> While looking into the repair code, I realized that we should check the return 
> value of CountDownLatch.await(). In most of the places where we don't check the 
> return value, nothing bad would happen due to other protections. However, 
> ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
>     CountDownLatch latch = new CountDownLatch(2);
>     latch.countDown();
>     new Thread(() -> {
>         try {
>             Thread.sleep(1200);
>         } catch (InterruptedException e) {
>             System.err.println("interrupted");
>         }
>         latch.countDown();
>         System.out.println("counted down");
>     }).start();
>     latch.await(1, TimeUnit.SECONDS);
>     if (latch.getCount() > 0) {
>         System.err.println("failed");
>     } else {
>         System.out.println("success");
>     }
> }
> {code}





[jira] [Created] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked

2017-03-31 Thread Simon Zhou (JIRA)
Simon Zhou created CASSANDRA-13397:
--

 Summary: Return value of CountDownLatch.await() not being checked
 Key: CASSANDRA-13397
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
 Project: Cassandra
  Issue Type: Bug
Reporter: Simon Zhou
Assignee: Simon Zhou


While looking into the repair code, I realized that we should check the return 
value of CountDownLatch.await(). However, there are some places where we don't 
check it, and some of them may cause bad subsequent behavior, as in 
ActiveRepairService#prepareForRepair and StorageProxy#describeSchemaVersions. I 
haven't checked which version introduced this bug, but at least 
StorageProxy#describeSchemaVersions has had it since 2010. Code to reproduce:
{code}
public static void testLatch() throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(2);
    latch.countDown();

    new Thread(() -> {
        try {
            Thread.sleep(1200);
        } catch (InterruptedException e) {
            System.err.println("interrupted");
        }
        latch.countDown();
        System.out.println("counted down");
    }).start();

    latch.await(1, TimeUnit.SECONDS);
    if (latch.getCount() > 0) {
        System.err.println("failed");
    } else {
        System.out.println("success");
    }
}
{code}





[jira] [Commented] (CASSANDRA-12088) Upgrade corrupts SSTables

2017-03-31 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951654#comment-15951654
 ] 

Jeff Jirsa commented on CASSANDRA-12088:


[~cskr] - I'm trying to reproduce this, and I'm not having much luck.  Did this 
only happen with LWT? Were you able to reproduce it reliably from scratch, or 
only with the clusters you already had created?


> Upgrade corrupts SSTables
> -
>
> Key: CASSANDRA-12088
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12088
> Project: Cassandra
>  Issue Type: Bug
> Environment: OS: CentOS release 6.7 (Final)
> Cassandra version: 2.1, 3.7
>Reporter: Chandra Sekar S
>Priority: Critical
>
> When upgrading from 2.0 to 3.7, a table was corrupted and an exception occurs 
> when performing an LWT from the Java Driver. The server was upgraded from 2.0 
> to 2.1 and then to 3.7. "nodetool upgradesstables" was run after each step of 
> the upgrade.
> Schema of affected table:
> {code}
> CREATE TABLE payment.tbl (
> c1 text,
> c2 timestamp,
> c3 text,
> s1 timestamp static,
> s2 int static,
> c4 text,
> PRIMARY KEY (c1, c2)
> ) WITH CLUSTERING ORDER BY (c2 ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> Insertion that fails:
> {code:java}
> insert into tbl (c1, s2) values ('value', 0) if not exists;
> {code}
> The stack trace in system.log of the Cassandra server:
> {code}
> INFO  [HANDSHAKE-maven-repo.corp.zeta.in/10.1.5.13] 2016-06-24 22:23:14,887 
> OutboundTcpConnection.java:514 - Handshaking version with 
> maven-repo.corp.zeta.in/10.1.5.13
> ERROR [MessagingService-Incoming-/10.1.5.13] 2016-06-24 22:23:14,889 
> CassandraDaemon.java:217 - Exception in thread 
> Thread[MessagingService-Incoming-/10.1.5.13,5,main]
> java.io.IOError: java.io.IOException: Corrupt flags value for unfiltered 
> partition (isStatic flag set): 160
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:224)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:212)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:681)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:642)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.service.paxos.Commit$CommitSerializer.deserialize(Commit.java:131)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.service.paxos.PrepareResponse$PrepareResponseSerializer.deserialize(PrepareResponse.java:97)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.service.paxos.PrepareResponse$PrepareResponseSerializer.deserialize(PrepareResponse.java:66)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:114) 
> ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:190)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
> Caused by: java.io.IOException: Corrupt flags value for unfiltered partition 
> (isStatic flag set): 160
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:380)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:219)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
>   ... 11 common frames 

[jira] [Commented] (CASSANDRA-13236) corrupt flag error after upgrade from 2.2 to 3.0.10

2017-03-31 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951645#comment-15951645
 ] 

Jeff Jirsa commented on CASSANDRA-13236:


[~ingard] - do you recall whether your errors occurred on a cluster that started 
out as 2.0 and was upgraded 2.0 -> 2.2 -> 3.0?



> corrupt flag error after upgrade from 2.2 to 3.0.10
> ---
>
> Key: CASSANDRA-13236
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13236
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra 3.0.10
>Reporter: ingard mevåg
>
> After upgrade from 2.2.5 to 3.0.9/10 we're getting a bunch of errors like 
> this:
> {code}
> ERROR [SharedPool-Worker-1] 2017-02-17 12:58:43,859 Message.java:617 - 
> Unexpected exception during request; channel = [id: 0xa8b98684, 
> /10.0.70.104:56814 => /10.0.80.24:9042]
> java.io.IOError: java.io.IOException: Corrupt flags value for unfiltered 
> partition (isStatic flag set): 160
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:210)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processPartition(SelectStatement.java:749)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:711)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:400)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:265)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:487)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:464)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_72]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.10.jar:3.0.10]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: java.io.IOException: Corrupt flags value for unfiltered partition 
> (isStatic flag set): 160
> at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:374)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:217)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
>

[jira] [Commented] (CASSANDRA-13327) Pending endpoints size check for CAS doesn't play nicely with writes-on-replacement

2017-03-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951550#comment-15951550
 ] 

Paulo Motta commented on CASSANDRA-13327:
-

Thanks for clarifying [~slebresne].

[~aweisberg] I guess we can close this then? Maybe open another ticket with 
Sylvain's suggestion above to lift the limit on the maximum number of pending 
endpoints for CAS, if you're willing to take a shot at it?

> Pending endpoints size check for CAS doesn't play nicely with 
> writes-on-replacement
> ---
>
> Key: CASSANDRA-13327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13327
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>
> Consider this ring:
> 127.0.0.1  MR  UP    JOINING  -7301836195843364181
> 127.0.0.2  MR  UP    NORMAL   -7263405479023135948
> 127.0.0.3  MR  UP    NORMAL   -7205759403792793599
> 127.0.0.4  MR  DOWN  NORMAL   -7148113328562451251
> where 127.0.0.1 was bootstrapping for cluster expansion. Note that, due to 
> the failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and 
> making no progress.
> Then the down node was replaced so we had:
> 127.0.0.1  MR  UP    JOINING  -7301836195843364181
> 127.0.0.2  MR  UP    NORMAL   -7263405479023135948
> 127.0.0.3  MR  UP    NORMAL   -7205759403792793599
> 127.0.0.5  MR  UP    JOINING  -7148113328562451251
> It’s confusing in the ring - the first JOINING is a genuine bootstrap, the 
> second is a replacement. We now had CAS unavailables (but no non-CAS 
> unavailables). I think it’s because the pending endpoints check thinks that 
> 127.0.0.5 is gaining a range when it’s just replacing.
> The workaround is to kill the stuck JOINING node, but Cassandra shouldn’t 
> unnecessarily fail these requests.
> It also appears that the number of required participants is bumped by 1 
> during a host replacement, so if the replacing host fails you will get 
> unavailables and timeouts.
> This is related to the check added in CASSANDRA-8346.
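
To make the failure mode in the description concrete, here is a minimal sketch of the kind of check it refers to. It assumes that the CAS path rejects a request when more than one endpoint is pending for the token, and that the Paxos quorum is sized from natural plus pending endpoints. The class and method names are hypothetical and are not taken from Cassandra's code.

```java
import java.util.List;

// Hypothetical model of the CAS participant check; names are illustrative only.
final class CasParticipantCheck {
    // Assumed rule: CAS refuses to run with more than one pending endpoint.
    static final int MAX_PENDING_FOR_CAS = 1;

    /** Quorum size when pending endpoints are counted as participants. */
    static int requiredParticipants(int naturalEndpoints, int pendingEndpoints) {
        return (naturalEndpoints + pendingEndpoints) / 2 + 1;
    }

    /** Returns true if a CAS request would be rejected as unavailable. */
    static boolean rejectsCas(List<String> pendingEndpoints) {
        return pendingEndpoints.size() > MAX_PENDING_FOR_CAS;
    }
}
```

Under these assumptions, a stuck bootstrap (127.0.0.1) plus a replacement (127.0.0.5) gives two pending endpoints, so CAS is rejected even though the replacement gains no new range. With three natural replicas, a single pending endpoint also raises the required participants from 2 to 3.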





[jira] [Commented] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period

2017-03-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951523#comment-15951523
 ] 

Paulo Motta commented on CASSANDRA-12929:
-

Submitted [multiplexer 
run|https://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/391/]
 since this is a novnode test which did not get triggered in the previous run.

> dtest failure in 
> bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
> --
>
> Key: CASSANDRA-12929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12929
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Assignee: Paulo Motta
>  Labels: dtest, test-failure
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/494/testReport/bootstrap_test/TestBootstrap/simple_bootstrap_test_small_keepalive_period
> {noformat}
> Error Message
> Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-YmnyEI
> dtest: DEBUG: Done setting configuration options:
> {   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
> cassandra.cluster: INFO: New Cassandra host  
> discovered
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools/decorators.py", line 46, in 
> wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 163, in 
> simple_bootstrap_test_small_keepalive_period
> assert_bootstrap_state(self, node2, 'COMPLETED')
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 297, in 
> assert_bootstrap_state
> assert_one(session, "SELECT bootstrapped FROM system.local WHERE 
> key='local'", [expected_bootstrap_state])
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 130, in 
> assert_one
> assert list_res == [expected], "Expected {} from {}, but got 
> {}".format([expected], query, list_res)
> "Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE 
> key='local', but got [[u'IN_PROGRESS']]\n >> begin 
> captured logging << \ndtest: DEBUG: cluster ccm 
> directory: /tmp/dtest-YmnyEI\ndtest: DEBUG: Done setting configuration 
> options:\n{   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 
> 'true'}\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file

2017-03-31 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951431#comment-15951431
 ] 

Jason Brown commented on CASSANDRA-13321:
-

Overall, I like the ideas here, and the {{VersionedComponent}} should work 
fine. I mostly just have some minor general questions and a few (perhaps 
premature) trivial nits.

questions:
- In {{SSTableReader#mutateRepaired}} and {{#mutateLevel}}, you call 
{{sstableMetadataVersion.get()}} and pass the value to 
{{MetadataSerializer#mutateLevel/mutateRepaired}}, where there's a bit of magic 
that adds 1 to the {{fileVersion}}. Then, the next line in 
{{SSTableReader#mutateRepaired}} calls 
{{sstableMetadataVersion.incrementAndGet()}}. Thus, I'm not sure if the 
{{fileVersion}} argument to {{MetadataSerializer#mutateLevel/mutateRepaired}} 
should be the current file version, or the next file version.
- When the stats file is updated for a specific sstable generation, will all the 
versions of the stats/crc files be cleaned up when the owning sstable is deleted?
- Is the addition of an extra "number" in the file name going to confuse any 
tools we have? Admittedly I was lazy and didn't look yet.
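
The first question above can be illustrated with a small sketch of the two possible contracts for the {{fileVersion}} argument. This is a hypothetical stand-in, not the patch's code: the class, field, and method names are made up, and the file write is replaced by a placeholder.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the reader-side version bookkeeping; not the patch's code.
final class StatsVersionSketch {
    private final AtomicInteger metadataVersion = new AtomicInteger(0);
    private int lastWrittenFileVersion = -1;

    // Variant A: pass the current version; the writer adds 1 internally.
    void mutateWithCurrentVersion() {
        int current = metadataVersion.get();
        writeStatsFile(current + 1);        // the "+1" lives next to the write
        metadataVersion.incrementAndGet();  // the caller then catches up
    }

    // Variant B: bump first and pass the version the new file should carry.
    void mutateWithNextVersion() {
        int next = metadataVersion.incrementAndGet();
        writeStatsFile(next);
    }

    private void writeStatsFile(int fileVersion) {
        lastWrittenFileVersion = fileVersion; // placeholder for real I/O
    }

    int currentVersion() { return metadataVersion.get(); }
    int lastWrittenFileVersion() { return lastWrittenFileVersion; }
}
```

Both variants end in the same state here. The difference is only which one the parameter name and its documentation promise, which is what the question is asking.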

trivial nits:
- it would be nice to add a comment to {{VersionedComponent}} to explain how 
{{VersionedComponent#version}} is different from {{Descriptor#generation}}.
- {{VersionedComponent}} constructors: the variable name for the {{Type}} 
parameter is {{stats}} in several places. I think you meant to call it {{type}} 
(or something similar), and {{stats}} was just a mistake/quick thing to do while 
you were knocking it out.

> Add a checksum component for the sstable metadata (-Statistics.db) file
> ---
>
> Key: CASSANDRA-13321
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13321
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> Since we keep important information in the sstable metadata file now, we 
> should add a checksum component for it. One danger being if a bit gets 
> flipped in repairedAt we could consider the sstable repaired when it is not.
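
As a rough illustration of what a checksum component buys, the sketch below computes a CRC32 over some metadata bytes and shows that a single flipped bit is detected. It is only a sketch: the ticket does not say which checksum or file layout the patch uses, and the class and method names are hypothetical.

```java
import java.util.zip.CRC32;

// Illustrative only: the real component layout and checksum choice may differ.
final class MetadataChecksumSketch {
    /** CRC32 over the serialized metadata bytes. */
    static long checksum(byte[] metadata) {
        CRC32 crc = new CRC32();
        crc.update(metadata, 0, metadata.length);
        return crc.getValue();
    }

    /** True if the bytes still match the checksum recorded at write time. */
    static boolean verify(byte[] metadata, long expected) {
        return checksum(metadata) == expected;
    }
}
```

A reader would call verify() with the stored checksum before trusting fields such as repairedAt, and treat a mismatch as corruption instead of silently using the value.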





[jira] [Updated] (CASSANDRA-13339) java.nio.BufferOverflowException: null

2017-03-31 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13339:
---
Component/s: Core

> java.nio.BufferOverflowException: null
> --
>
> Key: CASSANDRA-13339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13339
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Richards
>
> I'm seeing the following exception running Cassandra 3.9 (with Netty updated 
> to 4.1.8.Final) on a 2-node cluster.  It would have been processing 
> around 50 queries/second at the time (a mixture of 
> inserts/updates/selects/deletes): there's a collection of tables (some with 
> counters, some without) and a single materialized view.
> ERROR [MutationStage-4] 2017-03-15 22:50:33,052 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>   at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.EncodingStats$Serializer.serialize(EncodingStats.java:233)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:380)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:122)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:790)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:279) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> and then again shortly afterwards
> ERROR [MutationStage-3] 2017-03-15 23:27:36,198 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>   at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.EncodingStats$Serializer.serialize(EncodingStats.java:233)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:380)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> 

[jira] [Commented] (CASSANDRA-13373) Provide additional speculative retry statistics

2017-03-31 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951356#comment-15951356
 ] 

T Jake Luciani commented on CASSANDRA-13373:


Would you mind updating the docs to add these new metrics?

> Provide additional speculative retry statistics
> ---
>
> Key: CASSANDRA-13373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13373
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.x
>
>
> Right now there is a single metric for speculative retry on reads that is the 
> number of speculative retries attempted. You can't tell how many of those 
> actually succeeded in salvaging the read.
> The metric is also per table and there is no keyspace level rollup as there 
> is for several other metrics.
> Add a metric that counts reads that attempt to speculate but fail to complete 
> before the timeout (ignoring read errors).
> Add a rollup metric for the current count of speculation attempts as well as 
> the count of failed speculations.
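
The metrics described above could be modeled roughly as follows. This is a sketch under assumptions: the counter names, the parent-pointer rollup, and the derived "salvaged" figure are placeholders, not the metric names or structure used by the patch.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical metric holders; the names are placeholders, not the patch's metric names.
final class SpeculativeRetryCounters {
    final LongAdder attempted = new LongAdder();
    final LongAdder failed = new LongAdder();
    private final SpeculativeRetryCounters parent; // keyspace rollup, or null

    SpeculativeRetryCounters(SpeculativeRetryCounters parent) {
        this.parent = parent;
    }

    /** Record one read that speculated; timedOut means it still missed the timeout. */
    void onSpeculation(boolean timedOut) {
        attempted.increment();
        if (timedOut) failed.increment();
        if (parent != null) parent.onSpeculation(timedOut);
    }

    /** Speculations that completed before the timeout. */
    long salvaged() {
        return attempted.sum() - failed.sum();
    }
}
```

Each table would hold one instance whose parent is the keyspace-level instance, so the keyspace counters are simply the sum over its tables.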





[jira] [Updated] (CASSANDRA-13373) Provide additional speculative retry statistics

2017-03-31 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13373:

Status: Ready to Commit  (was: Patch Available)

> Provide additional speculative retry statistics
> ---
>
> Key: CASSANDRA-13373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13373
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.x
>
>
> Right now there is a single metric for speculative retry on reads that is the 
> number of speculative retries attempted. You can't tell how many of those 
> actually succeeded in salvaging the read.
> The metric is also per table and there is no keyspace level rollup as there 
> is for several other metrics.
> Add a metric that counts reads that attempt to speculate but fail to complete 
> before the timeout (ignoring read errors).
> Add a rollup metric for the current count of speculation attempts as well as 
> the count of failed speculations.





[jira] [Commented] (CASSANDRA-13373) Provide additional speculative retry statistics

2017-03-31 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951349#comment-15951349
 ] 

Blake Eggleston commented on CASSANDRA-13373:
-

nice, +1

> Provide additional speculative retry statistics
> ---
>
> Key: CASSANDRA-13373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13373
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.x
>
>
> Right now there is a single metric for speculative retry on reads that is the 
> number of speculative retries attempted. You can't tell how many of those 
> actually succeeded in salvaging the read.
> The metric is also per table and there is no keyspace level rollup as there 
> is for several other metrics.
> Add a metric that counts reads that attempt to speculate but fail to complete 
> before the timeout (ignoring read errors).
> Add a rollup metric for the current count of speculation attempts as well as 
> the count of failed speculations.





[jira] [Commented] (CASSANDRA-13339) java.nio.BufferOverflowException: null

2017-03-31 Thread Chris Richards (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951304#comment-15951304
 ] 

Chris Richards commented on CASSANDRA-13339:


CommitLog's add() function sees the mutation as not "isEmpty()" but the 
PartitionUpdate in its modifications as "isEmpty()" immediately prior to 
the call to Mutation.serializer.serializedSize(...).

Calling mutation.toString() in this case appears either to delay things enough 
or to cause some change, such that the serializedSize() comes out correct.
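
One way such an observation could lead to a java.nio.BufferOverflowException is a "measure, then write into a fixed-size buffer" sequence in which the data changes between the two steps. The sketch below models only that general pattern with a plain list and a ByteBuffer. It is an assumption about the mechanism, not a diagnosis, and none of the names come from Cassandra.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models "measure, then write into a fixed buffer" with plain lists.
final class SizeThenWriteSketch {
    /** Bytes needed: a 4-byte count followed by 8 bytes per value. */
    static int serializedSize(List<Long> values) {
        return Integer.BYTES + values.size() * Long.BYTES;
    }

    /** Writes into a buffer of exactly sizeHint bytes; overflows if that is too small. */
    static ByteBuffer serialize(List<Long> values, int sizeHint) {
        ByteBuffer out = ByteBuffer.allocate(sizeHint);
        out.putInt(values.size());
        for (long v : values) out.putLong(v);
        out.flip();
        return out;
    }

    /** Returns true if growing the list between the two steps overflows the buffer. */
    static boolean overflowsWhenMutatedBetweenSteps() {
        List<Long> values = new ArrayList<>();
        int size = serializedSize(values); // measured while the list is empty
        values.add(42L);                   // stands in for a concurrent change
        try {
            serialize(values, size);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }
}
```

If something similar happens here, the update would look empty when its size is computed and non-empty by the time it is written, which would match the stack trace ending in DataOutputBufferFixed.doFlush.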

> java.nio.BufferOverflowException: null
> --
>
> Key: CASSANDRA-13339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13339
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Richards
>
> I'm seeing the following exception running Cassandra 3.9 (with Netty updated 
> to 4.1.8.Final) on a 2-node cluster.  It would have been processing 
> around 50 queries/second at the time (a mixture of 
> inserts/updates/selects/deletes): there's a collection of tables (some with 
> counters, some without) and a single materialized view.
> ERROR [MutationStage-4] 2017-03-15 22:50:33,052 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>   at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.EncodingStats$Serializer.serialize(EncodingStats.java:233)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:380)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:122)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:790)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:279) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
> ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1347)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2539)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  [apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> and then again shortly afterwards
> ERROR [MutationStage-3] 2017-03-15 23:27:36,198 StorageProxy.java:1353 - 
> Failed to apply mutation locally : {}
> java.nio.BufferOverflowException: null
>   at 
> org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:262)
>  ~[apache-cassandra-3.9.jar:3.9]
>   at 
> 

[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951215#comment-15951215
 ] 

Edward Capriolo edited comment on CASSANDRA-13396 at 3/31/17 4:27 PM:
--

{quote}
people would complain that C* is slow but don't realize it's in this case 
because of that change. 
{quote}

First, it's an obvious bug. The entire point of pluggable logging 
implementations is so that you can replace them. 
 
Second, the only person actually being affected would be Anton, because 
effectively no one else is changing logging implementations so no one else is 
hitting that block.

For Anton (and anyone else) they would have to manually change the files in the 
lib folder and the configuration. So nothing is 'hidden' to him. He/They make a 
change and they can report if there actually is a performance issue.  

Because they can "scratch their itch" of running Cassandra in a container they 
might find new problems or they might make new opportunities. For example, they 
may find that some other implementation is actually better or faster. 

If anyone were actually trying to convince me that this bug is intentional 
(which is almost laughable), the proper practice would be:

{code}
if (!(logger instanceof XYZ)) {
  throw new IllegalArgumentException("we only support XYZ for reasons ABC");
}
{code}

But instead we are attempting to pretend the opposite, that the bug is 
intentional and the correct thing to do is throw a ClassCastException. Which is 
a joke.
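
For reference, the guard suggested in the comment can be written in compilable form as below. Note the parentheses around the instanceof test: in Java, `!` binds more tightly than `instanceof`, so `!logger instanceof XYZ` does not compile. The types here are hypothetical stand-ins for logging backends, not Cassandra or SLF4J classes, and the second method is just one alternative (skip the backend-specific work) among several possible ones.

```java
// Hypothetical types standing in for logging backends; not Cassandra or SLF4J classes.
interface AnyLogger {}
final class SupportedLogger implements AnyLogger {}
final class OtherLogger implements AnyLogger {}

final class LoggerGuardSketch {
    /** Casts only after checking the runtime type, failing with an explanatory message otherwise. */
    static SupportedLogger requireSupported(AnyLogger logger) {
        if (!(logger instanceof SupportedLogger)) {
            String actual = (logger == null) ? "null" : logger.getClass().getName();
            throw new IllegalArgumentException("only SupportedLogger is supported, got " + actual);
        }
        return (SupportedLogger) logger;
    }

    /** Alternative: skip backend-specific setup when another backend is plugged in. */
    static boolean configureIfSupported(AnyLogger logger) {
        if (logger instanceof SupportedLogger) {
            // backend-specific configuration would go here
            return true;
        }
        return false;
    }
}
```

Either form replaces an unexplained ClassCastException with behavior that states its intent.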



was (Author: appodictic):
{quote}
people would complain that C* is slow but don't realize it's in this case 
because of that change. 
{quote}

First, its an obvious bug. The entire point of plug-gable logging 
implementations is so that you can replace them. 
 
Second, the only person being actually affected would be Anton, because 
effective no one else is changing logging implementations so no one else is 
hitting that block.

For Anton (and anyone else) they would have to manually change the files in the 
lib folder and the configuration. So nothing is 'hidden' to him. He/They make a 
chance and they can report if there actually is a performance issue.  

Because they can "scratch their itch" of running Cassandra in a container they 
might find new problems or they might make new opportunities. For example, they 
may find that some other implementation is actually better or faster. 

If anyone was actually trying to convince me that this bug is intentional, 
(which is almost laughable). The proper practice would be:

{code}
if (!logger instanceof XYZ){
  throw new IllegalArgumentException("we only support XYZ for reasons ABC");
}
{code}

But instead we are attempting to pretend the opposite, that the bug is 
intentional and the correct thing to do is throw a ClassCastException. Which is 
a joke.


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951215#comment-15951215
 ] 

Edward Capriolo edited comment on CASSANDRA-13396 at 3/31/17 4:26 PM:
--

{quote}
people would complain that C* is slow but don't realize it's in this case 
because of that change. 
{quote}

First, it's an obvious bug. The entire point of pluggable logging 
implementations is so that you can replace them. 
 
Second, the only person actually being affected would be Anton, because 
effectively no one else is changing logging implementations so no one else is 
hitting that block.

For Anton (and anyone else) they would have to manually change the files in the 
lib folder and the configuration. So nothing is 'hidden' to him. He/They make a 
chance and they can report if there actually is a performance issue.  

Because they can "scratch their itch" of running Cassandra in a container they 
might find new problems or they might make new opportunities. For example, they 
may find that some other implementation is actually better or faster. 

If anyone were actually trying to convince me that this bug is intentional 
(which is almost laughable), the proper practice would be:

{code}
if (!(logger instanceof XYZ)) {
  throw new IllegalArgumentException("we only support XYZ for reasons ABC");
}
{code}

But instead we are attempting to pretend the opposite, that the bug is 
intentional and the correct thing to do is throw a ClassCastException. Which is 
a joke.



was (Author: appodictic):
{quote}
people would complain that C* is slow but don't realize it's in this case 
because of that change. 
{quote}

First, its an obvious bug. The entire point of plug-gable logging 
implementations is so that you can replace them. 
 
Second, the only person being actually affected would be Anton, because 
effective no one else is changing logging implementations so no one else is 
hitting that block.

For Anton (and anyone else) they would have to manually change the files in the 
lib folder and the configuration. So nothing is 'hidden' to him. He/They make a 
chance and they can report if there actually is a performance issue.  

Because they can "scratch their itch" of running Cassandra in a container they 
might find new problems or they might make new opportunities. For example, they 
may find that some other implementation is actually better or faster. 

If anyone was actually trying to convince be that this bug is intentional, 
(which is almost laughable). The proper practice would 

{code}
if (!logger instanceof XYZ){
  throw new IllegalArgumentException("we only support XYZ for reasons ABC");
}
{code}

But instead we are attempting to pretend the opposite that the bug is 
intentional and the correct thing to do is throw a ClassCastException. 


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951215#comment-15951215
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

{quote}
people would complain that C* is slow but don't realize it's in this case 
because of that change. 
{quote}

First, it's an obvious bug. The entire point of pluggable logging 
implementations is so that you can replace them. 

Second, the only person actually being affected would be Anton, because 
effectively no one else is changing logging implementations, so no one else is 
hitting that block.

For Anton (and anyone else), they would have to manually change the files in the 
lib folder and the configuration. So nothing is 'hidden' from him. He/they make a 
change and they can report if there actually is a performance issue.  

Because they can "scratch their itch" of running Cassandra in a container they 
might find new problems or they might make new opportunities. For example, they 
may find that some other implementation is actually better or faster. 

If anyone were actually trying to convince me that this bug is intentional 
(which is almost laughable), the proper practice would be:

{code}
if (!(logger instanceof XYZ)) {
  throw new IllegalArgumentException("we only support XYZ for reasons ABC");
}
{code}

But instead we are attempting to pretend the opposite that the bug is 
intentional and the correct thing to do is throw a ClassCastException. 


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951184#comment-15951184
 ] 

Robert Stupp commented on CASSANDRA-13396:
--

bq. just an application (a plain Java main()) that instantiates a 
CassandraDaemon

If that's just for testing, why not just use logback?

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951181#comment-15951181
 ] 

Robert Stupp commented on CASSANDRA-13396:
--

[~apassiou], C* is meant to run as a standalone application using the 
dependencies that are in the {{lib/}} folder. Any change to those dependencies 
and the way C* is started, is basically up to the person who changes the 
dependencies. We can of course talk about using a different logger 
implementation instead of logback and discuss the pros and cons. But that is 
IMO way beyond an {{instanceof}} check.

I'm generally concerned about stability and hidden performance issues when 
changing a (logger implementation) library that is nearly everywhere in the 
hot code path. I mean, we have used logback for a really long time now, but we 
have neither test nor production experience running something else. One example: 
one thing that may happen is some hidden contention in that logger library 
causing weird outliers - people would complain that C* is slow but don't realize 
it's in this case because of that change. That's one reason why we are so careful 
with library updates, especially in minor versions. All I'm saying is that 
understanding _all_ the consequences of such a change is a lot of work.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13229) dtest failure in topology_test.TestTopology.size_estimates_multidc_test

2017-03-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951175#comment-15951175
 ] 

Paulo Motta commented on CASSANDRA-13229:
-

Thanks for the feedback Marcus!

bq. But wouldn't this lead to the problems with disk blacklisting and 
resurrecting data like Paulo said in CASSANDRA-6696?

There's no actual problem from what Marcus said, we just wanted to keep vnodes 
tied to a single disk to be able to take vnodes offline if a specific disk 
fails but this hasn't been implemented yet. Furthermore CASSANDRA-10540 will 
probably benefit if vnodes are kept in a single disk.

With this said, we should probably keep your approach of falling back to split 
evenly if not all disks are used and also raise the minimum number of tokens to 
split by vnode from 2 to 8 to ensure a better balance (Marcus suggested 16 but 
8 should be good already since there are 24 local ranges with RF=3 and 
CASSANDRA-7032 should also improve load balancing). We can probably reconsider 
this if we ever implement the optimization of blacklisting specific vnodes when 
disks fail.

In your initial approach, I'd just move the fallback out of 
{{splitOwnedRanges}} because it's a bit wrong to pass {{dontSplitRanges=true}} 
and have your ranges split. :-)
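
The fallback discussed above can be sketched as a small decision helper. This is only an illustration under assumptions: `DiskSplitSketch`, `Strategy`, `choose`, `approxLocalRanges` and `MIN_TOKENS_FOR_VNODE_SPLIT` are hypothetical names, not Cassandra's actual `Splitter` code, and "not all disks are used" is approximated here by "fewer local ranges than disks".

```java
final class DiskSplitSketch {
    // Threshold proposed in the comment above (raised from 2 to 8).
    static final int MIN_TOKENS_FOR_VNODE_SPLIT = 8;

    enum Strategy { WHOLE_VNODES, SPLIT_EVENLY }

    // Rough count of local ranges, assuming one range per token per replica.
    // This mirrors the "24 local ranges with RF=3" arithmetic (8 * 3 = 24).
    static int approxLocalRanges(int numTokens, int replicationFactor) {
        return numTokens * replicationFactor;
    }

    // Keep vnodes whole only when there are enough tokens and enough local
    // ranges to give every disk at least one; otherwise split evenly.
    static Strategy choose(int numTokens, int replicationFactor, int diskCount) {
        if (numTokens < MIN_TOKENS_FOR_VNODE_SPLIT) {
            return Strategy.SPLIT_EVENLY;
        }
        if (approxLocalRanges(numTokens, replicationFactor) < diskCount) {
            return Strategy.SPLIT_EVENLY;
        }
        return Strategy.WHOLE_VNODES;
    }

    public static void main(String[] args) {
        System.out.println(choose(8, 3, 3)); // enough tokens and ranges
        System.out.println(choose(1, 3, 3)); // too few tokens, falls back
    }
}
```

Keeping the decision in its own helper, outside the splitting routine, matches the suggestion that a caller passing `dontSplitRanges=true` should not silently get split ranges back.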

> dtest failure in topology_test.TestTopology.size_estimates_multidc_test
> ---
>
> Key: CASSANDRA-13229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Sean McCarthy
>Assignee: Alex Petrov
>  Labels: dtest, test-failure
> Fix For: 4.0
>
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test
> {code}
> Standard Output
> Unexpected error in node1 log, error: 
> ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 
> - Exception in thread Thread[MemtablePostFlush:1,5,main]
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>   at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92)
>  ~[main/:na]
>   at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) 
> ~[main/:na]
>   at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>  ~[main/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81)
>  [main/:na]
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown
>  Source) [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Unexpected error in node1 log, error: 
> ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - 
> Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at 
> org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) 
> ~[main/:na]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284)
>  ~[main/:na]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown
>  Source) ~[na:na]
>   at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) 
> ~[main/:na]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265)
>  ~[main/:na]
>   at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na]
>   at 
> org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) 
> ~[main/:na]
>   at 
> 

[jira] [Commented] (CASSANDRA-12627) Provide new seed providers

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951153#comment-15951153
 ] 

Edward Capriolo commented on CASSANDRA-12627:
-

Giving up on this one. Maybe I don't understand why a "pluggable" seed provider 
exists if there can only be one implementation of it.

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>
> SeedProvider is pluggable; however, only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * Correct error messages
> * Do not catch Exception; use multi-catch and catch typed exceptions
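
As a rough sketch of the first item in the list above, a seed source that reads a comma-separated host list from a system property and falls back to an environment variable might look like this. Everything here is an assumption for illustration: `SeedSource` is a local stand-in for Cassandra's seed provider contract, and `PropertyOrEnvSeedSource`, the property name `sketch.seeds` and the variable name `SKETCH_SEEDS` are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PropertySeedSketch {
    // Hypothetical stand-in for a seed provider contract.
    interface SeedSource {
        List<InetAddress> getSeeds();
    }

    static final class PropertyOrEnvSeedSource implements SeedSource {
        private final String propertyName;
        private final String envName;

        PropertyOrEnvSeedSource(String propertyName, String envName) {
            this.propertyName = propertyName;
            this.envName = envName;
        }

        @Override
        public List<InetAddress> getSeeds() {
            // The system property wins; the environment variable is the fallback.
            String raw = System.getProperty(propertyName);
            if (raw == null) {
                raw = System.getenv(envName);
            }
            return parse(raw);
        }
    }

    // Parses "host1, host2" into addresses, skipping blanks and unresolvable hosts.
    static List<InetAddress> parse(String raw) {
        List<InetAddress> seeds = new ArrayList<>();
        if (raw != null) {
            for (String host : raw.split(",")) {
                String trimmed = host.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                try {
                    seeds.add(InetAddress.getByName(trimmed));
                } catch (UnknownHostException e) {
                    // Typed catch rather than a blanket catch of Exception.
                    System.err.println("skipping unresolvable seed: " + trimmed);
                }
            }
        }
        return Collections.unmodifiableList(seeds);
    }

    public static void main(String[] args) {
        System.setProperty("sketch.seeds", "127.0.0.1, 127.0.0.2");
        SeedSource source = new PropertyOrEnvSeedSource("sketch.seeds", "SKETCH_SEEDS");
        System.out.println(source.getSeeds());
    }
}
```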





[jira] [Updated] (CASSANDRA-12627) Provide new seed providers

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-12627:

Status: Open  (was: Patch Available)

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>
> SeedProvider is pluggable; however, only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * Correct error messages
> * Do not catch Exception; use multi-catch and catch typed exceptions





[jira] [Assigned] (CASSANDRA-12627) Provide new seed providers

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned CASSANDRA-12627:
---

Assignee: (was: Edward Capriolo)

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>
> SeedProvider is pluggable; however, only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * Correct error messages
> * Do not catch Exception; use multi-catch and catch typed exceptions





[jira] [Assigned] (CASSANDRA-10825) OverloadedException is untested

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned CASSANDRA-10825:
---

Assignee: (was: Edward Capriolo)

> OverloadedException is untested
> ---
>
> Key: CASSANDRA-10825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10825
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
> Attachments: jmx-hint.png
>
>
> If you grep test/src and cassandra-dtest you will find that the string 
> OverloadedException doesn't appear anywhere.
> In CASSANDRA-10477 it was found that there were cases where Paxos should 
> back-pressure and throw OverloadedException but didn't.
> If OverloadedException is used for functional purposes then we should test 
> that it is thrown under expected conditions. If there are behaviors driven by 
> catching or tracking OverloadedException we should test those as well.





[jira] [Assigned] (CASSANDRA-11537) Give clear error when certain nodetool commands are issued before server is ready

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned CASSANDRA-11537:
---

Assignee: (was: Edward Capriolo)

> Give clear error when certain nodetool commands are issued before server is 
> ready
> -
>
> Key: CASSANDRA-11537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11537
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Edward Capriolo
>Priority: Minor
>  Labels: lhf
>
> As an ops person upgrading and servicing Cassandra servers, I require a 
> clearer message when I issue a nodetool command that the server is not ready 
> for, so that I am not confused.
> Technical description:
> If you deploy a new binary, restart, and issue nodetool 
> scrub/compact/updatess etc., you get an unfriendly assertion. An exception would 
> be easier to understand. Also, if a user has turned assertions off, it is 
> unclear what might happen. 
> {noformat}
> EC1: Throw exception to make it clear server is still in start up process. 
> :~# nodetool upgradesstables
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at org.apache.cassandra.db.Keyspace.open(Keyspace.java:97)
> at 
> org.apache.cassandra.service.StorageService.getValidKeyspace(StorageService.java:2573)
> at 
> org.apache.cassandra.service.StorageService.getValidColumnFamilies(StorageService.java:2661)
> at 
> org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2421)
> {noformat}
> EC1: 
> Patch against 2.1 (branch)
> https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:exception-on-startup?expand=1
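
The difference between an assertion and an explicit exception, as requested above, can be sketched as follows. This is a hypothetical illustration, not the linked patch: `StartupGuardSketch`, `markReady` and `openKeyspace` are made-up names and do not reflect the real `Keyspace.open` code.

```java
final class StartupGuardSketch {
    // volatile so the startup thread's update is visible to request threads.
    private volatile boolean ready = false;

    // Called once startup has finished (hypothetical hook).
    void markReady() {
        ready = true;
    }

    // An explicit check still fires when the JVM runs with assertions disabled,
    // and it carries a message, unlike a bare "assert" that surfaces as "error: null".
    String openKeyspace(String name) {
        if (!ready) {
            throw new IllegalStateException(
                "Cannot open keyspace '" + name + "': server is still starting up");
        }
        return name;
    }

    public static void main(String[] args) {
        StartupGuardSketch server = new StartupGuardSketch();
        try {
            server.openKeyspace("ks1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        server.markReady();
        System.out.println(server.openKeyspace("ks1"));
    }
}
```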





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Anton Passiouk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951137#comment-15951137
 ] 

Anton Passiouk commented on CASSANDRA-13396:


Edward, I appreciate that you wanted to help but please stop hijacking my 
question, or at least try to be constructive...

I don't know what you or Robert have understood when I said "container" but in 
my case it's just an application (a plain Java main()) that instantiates a 
CassandraDaemon and sometimes other stuff. But one could also do it in a "unit" 
test (which is not really a unit test but more of an automated integration test).

@Robert, I can understand your concerns about the untested behavior of other 
bindings, but then shouldn't it be stated in the docs that other bindings are not 
supported, and a more explicit error thrown?
But I don't think the performance impact is a good argument, because logback and 
slf4j are themselves configurable through configuration files, and that 
configuration (log patterns, where you log to) can have a very strong impact on 
performance even if one uses logback.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951131#comment-15951131
 ] 

Edward Capriolo commented on CASSANDRA-12661:
-

Good luck. Someone carry on my cause without me. Make logging great again.

> Make gc_log and gc_warn settable at runtime
> ---
>
> Key: CASSANDRA-12661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12661
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Priority: Minor
>
> Changes:
> * Move gc_log_threshold_in_ms and gc_warn_threshold_in_ms close together in 
> the config
> * rename variables to match properties
> * add unit tests to ensure hydration
> * add unit tests to ensure variables are set properly
> * minor perf (do not construct string from buffer if not logging)
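
A minimal sketch of two items from the list above: thresholds that can be changed at runtime, and a message string that is only built when it will actually be logged. The class and method names (`GcThresholdSketch`, `describePause`, the setters) are hypothetical and are not taken from the patch.

```java
final class GcThresholdSketch {
    // volatile so a value set at runtime is visible to the inspecting thread.
    private volatile long logThresholdMs;
    private volatile long warnThresholdMs;

    GcThresholdSketch(long logThresholdMs, long warnThresholdMs) {
        setLogThresholdMs(logThresholdMs);
        setWarnThresholdMs(warnThresholdMs);
    }

    void setLogThresholdMs(long value) {
        if (value < 0) throw new IllegalArgumentException("threshold must be >= 0");
        logThresholdMs = value;
    }

    void setWarnThresholdMs(long value) {
        if (value < 0) throw new IllegalArgumentException("threshold must be >= 0");
        warnThresholdMs = value;
    }

    // Returns the log line, or null when the pause is below both thresholds.
    // The string is only assembled after we know it will be emitted.
    String describePause(long pauseMs) {
        String level;
        if (pauseMs >= warnThresholdMs) {
            level = "WARN";
        } else if (pauseMs >= logThresholdMs) {
            level = "INFO";
        } else {
            return null; // nothing to log, so no string is built
        }
        return level + " GC pause of " + pauseMs + "ms";
    }

    public static void main(String[] args) {
        GcThresholdSketch gc = new GcThresholdSketch(200, 1000);
        System.out.println(gc.describePause(100));
        System.out.println(gc.describePause(300));
        gc.setLogThresholdMs(50); // changed at runtime, no restart needed
        System.out.println(gc.describePause(100));
    }
}
```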





[jira] [Assigned] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned CASSANDRA-12661:
---

Assignee: (was: Edward Capriolo)

> Make gc_log and gc_warn settable at runtime
> ---
>
> Key: CASSANDRA-12661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12661
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Priority: Minor
>
> Changes:
> * Move gc_log_threshold_in_ms and gc_warn_threshold_in_ms close together in 
> the config
> * rename variables to match properties
> * add unit tests to ensure hydration
> * add unit tests to ensure variables are set properly
> * minor perf (do not construct string from buffer if not logging)





[jira] [Assigned] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned CASSANDRA-13396:
---

Assignee: (was: Edward Capriolo)

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951110#comment-15951110
 ] 

Edward Capriolo edited comment on CASSANDRA-13396 at 3/31/17 3:23 PM:
--

So strange:

No such statement about supporting containers seems to exist.
[edward@jackintosh cassandra]$ find . -type f | xargs grep containers
./src/java/org/apache/cassandra/db/ColumnFamilyStore.java: * thread safety. 
 All we do is wipe the sstable containers clean, while leaving the actual
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
private final List containers = new ArrayList<>();
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.add(keys);
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
if (containers.size() > 0)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
for (TokenTreeBuilder tokens : containers)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.clear();
./conf/jvm.options:# This helps prevent soft faults in containers and makes
Binary file 
./build/classes/main/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder$MutableDataBlock.class
 matches
[edward@jackintosh cassandra]$ find . -type f | xargs grep Containers

It's almost as if people just make things up, and then when you corner them on 
their position being false, they just pivot and make up a new reason not to like 
the idea.


was (Author: appodictic):
So strange:

No such statement about supporting containers seems to exist.
[edward@jackintosh cassandra]$ find . -type f | xargs grep containers
./src/java/org/apache/cassandra/db/ColumnFamilyStore.java: * thread safety. 
 All we do is wipe the sstable containers clean, while leaving the actual
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
private final List containers = new ArrayList<>();
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.add(keys);
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
if (containers.size() > 0)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
for (TokenTreeBuilder tokens : containers)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.clear();
./conf/jvm.options:# This helps prevent soft faults in containers and makes
Binary file 
./build/classes/main/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder$MutableDataBlock.class
 matches
[edward@jackintosh cassandra]$ find . -type f | xargs grep Containers


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951110#comment-15951110
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

So strange:

No such statement about supporting containers seems to exist.
[edward@jackintosh cassandra]$ find . -type f | xargs grep containers
./src/java/org/apache/cassandra/db/ColumnFamilyStore.java: * thread safety. 
 All we do is wipe the sstable containers clean, while leaving the actual
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
private final List containers = new ArrayList<>();
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.add(keys);
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
if (containers.size() > 0)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
for (TokenTreeBuilder tokens : containers)
./src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java:
containers.clear();
./conf/jvm.options:# This helps prevent soft faults in containers and makes
Binary file 
./build/classes/main/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder$MutableDataBlock.class
 matches
[edward@jackintosh cassandra]$ find . -type f | xargs grep Containers


> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951102#comment-15951102
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

{quote}
Reason is stuff like CASSANDRA-12535 and CASSANDRA-13173, which are hard to 
figure out and even harder to ensure its functionality in unit tests. 
{quote}

So because someone made bugs in the past which are "hard to figure out", you 
cannot "foresee the consequences"? Is this Back to the Future Part 4?

Please verify your claim of "not supporting containers" before finding other 
reasons to not like the idea of fixing an obvious problem.




> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951093#comment-15951093
 ] 

Robert Stupp commented on CASSANDRA-13396:
--

[~apassiou], right, it's true for any other slf4j binding. 
Reason is stuff like CASSANDRA-12535 and CASSANDRA-13173, which are hard to 
figure out and even harder to ensure its functionality in unit tests. That's 
why I'm against such a change. We cannot foresee the consequences, because we 
have not tested other bindings. Even further, the performance implications of 
using another logger implementation are not determined. Believe me, it's not 
blindly shooting something down - I had a hard time fixing this issue and would 
not like to see it happen again. BTW: It's late in the afternoon over here, so 
it's not an overly quick reaction early in the morning.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951086#comment-15951086
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

"Open discussions" in Cassandra always start with the concept of "it's not my 
idea so -1", which is the exact opposite of "scratch an itch". 

"We do not support embedding C* in a container"

Really? Who says? Where is it said? Who is "we"?

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Anton Passiouk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951053#comment-15951053
 ] 

Anton Passiouk edited comment on CASSANDRA-13396 at 3/31/17 2:56 PM:
-

@[~snazy]: OK but what if the cassandra daemon is not embedded anywhere but is 
simply running with a classpath containing several slf4j bindings?
It will still crash, right?

@[~appodictic]: please don't over-react (and don't hijack my question), it's an 
open discussion ;-)


was (Author: apassiou):
@[~snazy]: OK but what if the cassandra daemon is not embedded anywhere but is 
simply running with a classpath containing several slf4j bindings?
It will still crash, right?

@[~appodictic]: please don't over-react, it's an open discussion ;-)

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951062#comment-15951062
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

So funny that I literally wake up, go out of my way to fix an issue for 
someone, and even though everyone in Cassandra is too busy to reply to emails 
and help people, they are Johnny-on-the-spot to jump on Jira and -1 code.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Anton Passiouk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951053#comment-15951053
 ] 

Anton Passiouk edited comment on CASSANDRA-13396 at 3/31/17 2:55 PM:
-

@[~snazy]: OK but what if the cassandra daemon is not embedded anywhere but is 
simply running with a classpath containing several slf4j bindings?
It will still crash, right?

@[~appodictic]: please don't over-react, it's an open discussion ;-)


was (Author: apassiou):
OK but what if the cassandra daemon is not embedded anywhere but is simply 
running with a classpath containing several slf4j bindings?
It will still crash, right?

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951057#comment-15951057
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

LOL I just posted this tweet yesterday.

https://twitter.com/edwardcapriolo/status/847484593041100800

What a comedy Cassandra is. No one even bothers to say "how can we work 
together?" or "how can we write the code to make all users happy?" They just 
instantly drop a -1 on things. lol

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Comment Edited] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951052#comment-15951052
 ] 

Edward Capriolo edited comment on CASSANDRA-13396 at 3/31/17 2:51 PM:
--

How come everyone in Cassandra's first reaction is to -1 everything? 

The entire model of Apache is "I have an itch to scratch". This person WANTS to 
run Cassandra in a container; it is an "itch". The immediate opposing position 
should not be "BUT DON'T SCRATCH THAT ITCH", because I say so.




was (Author: appodictic):
How come everyone in Cassandra's first reaction is to -1 everything? 

The entire model of Apache is "I have an itch to scratch". This person WANTS to 
run Cassandra in a container; it is an "itch". The position should not just 
instantly be "BUT DON'T SCRATCH THAT ITCH". 



> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951052#comment-15951052
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

How come everyone in Cassandra's first reaction is to -1 everything? 

The entire model of Apache is "I have an itch to scratch". This person WANTS to 
run Cassandra in a container; it is an "itch". The position should not just 
instantly be "BUT DON'T SCRATCH THAT ITCH". 



> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Anton Passiouk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951053#comment-15951053
 ] 

Anton Passiouk commented on CASSANDRA-13396:


OK but what if the cassandra daemon is not embedded anywhere but is simply 
running with a classpath containing several slf4j bindings?
It will still crash, right?
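
The failure mode in question can be sketched with stand-in types. This assumes 
the ClassCastException comes from an unconditional downcast of whatever logger 
backend was bound to a Logback-specific class; that is an assumption, since the 
ticket only links to the mailing-list report. LoggerBackend, LogbackLike and 
OtherBinding are hypothetical names, not slf4j or Logback classes, and the 
guard shown is one possible approach, not necessarily what the linked patch 
does.

```java
final class LoggerCastSketch {
    /** Hypothetical stand-in for the logger interface returned by the logging facade. */
    interface LoggerBackend {}

    /** Hypothetical stand-in for the one backend the calling code was written against. */
    static final class LogbackLike implements LoggerBackend {
        void applyBackendSpecificTweak() {
            // backend-specific setup would go here
        }
    }

    /** Hypothetical stand-in for any other binding found on the classpath. */
    static final class OtherBinding implements LoggerBackend {}

    /** Unguarded downcast: throws ClassCastException for any backend but LogbackLike. */
    static void installUnguarded(LoggerBackend backend) {
        ((LogbackLike) backend).applyBackendSpecificTweak();
    }

    /** Guarded variant: applies the tweak only when the expected backend is bound. */
    static boolean installGuarded(LoggerBackend backend) {
        if (backend instanceof LogbackLike) {
            ((LogbackLike) backend).applyBackendSpecificTweak();
            return true;
        }
        return false; // a different binding was picked up; skip the tweak
    }
}
```

In this sketch the unguarded version fails regardless of whether the process is 
embedded or standalone; what matters is only which backend ends up bound.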

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Updated] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-13396:
-
  Priority: Minor  (was: Major)
Issue Type: Wish  (was: Bug)

I'm strongly -1 on this change.

This change will cause weird and hard-to-catch follow-up issues (see the 
discussions and issues around that piece of code), which _cannot_ be caught by 
either unit tests or dtests because it's an unsupported setup. We do not support 
embedding C* in a container (i.e. a JVM not controlled "by us"). IMO, 
supporting C* in such an environment will cause other issues. Technically, it's 
not a major bug - changed it to wish.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950990#comment-15950990
 ] 

Edward Capriolo commented on CASSANDRA-13396:
-

https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:CASSANDRA-13396?expand=1

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Updated] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-13396:

Status: Patch Available  (was: Open)

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Updated] (CASSANDRA-13395) Expired rows without regular column data can crash upgradesstables

2017-03-31 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-13395:
---
 Reviewer: Sylvain Lebresne
Reproduced In: 3.0.12
   Status: Patch Available  (was: Open)

> Expired rows without regular column data can crash upgradesstables
> --
>
> Key: CASSANDRA-13395
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13395
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> In {{2.x}}, if an expired row is compacted, its row marker will be converted 
> into a {{DeletedCell}}. In {{3.0}}, when the row is read by {{LegacyLayout}}, 
> it will be converted into a row without {{PrimaryKeyLivenessInfo}}. If the row 
> does not contain any data for the regular columns, or if the table simply 
> has no regular columns, it will then be considered {{empty}}, which will 
> crash {{upgradesstables}} with the following error:
> {code}
> java.lang.AssertionError
>     at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:207)
>     at org.apache.cassandra.db.transform.BaseRows.applyOne(BaseRows.java:116)
>     at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:107)
>     at org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41)
>     at org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156)
>     at org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:147)
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
>     at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>     at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
>     at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:308)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$0(NamedThreadFactory.java:79)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> This problem is cause





[jira] [Commented] (CASSANDRA-13395) Expired rows without regular column data can crash upgradesstables

2017-03-31 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950951#comment-15950951
 ] 

Benjamin Lerer commented on CASSANDRA-13395:


Waiting for the upgrade tests.


> Expired rows without regular column data can crash upgradesstables
> --
>
> Key: CASSANDRA-13395
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13395
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> In {{2.x}}, if an expired row is compacted, its row marker will be converted 
> into a {{DeletedCell}}. In {{3.0}}, when the row is read by {{LegacyLayout}}, 
> it will be converted into a row without {{PrimaryKeyLivenessInfo}}. If the row 
> does not contain any data for the regular columns, or if the table simply 
> has no regular columns, it will then be considered {{empty}}, which will 
> crash {{upgradesstables}} with the following error:
> {code}
> java.lang.AssertionError
>     at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:207)
>     at org.apache.cassandra.db.transform.BaseRows.applyOne(BaseRows.java:116)
>     at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:107)
>     at org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41)
>     at org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156)
>     at org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:147)
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
>     at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>     at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
>     at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:308)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$0(NamedThreadFactory.java:79)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> This problem is cause





[jira] [Created] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2017-03-31 Thread Edward Capriolo (JIRA)
Edward Capriolo created CASSANDRA-13396:
---

 Summary: Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
 Key: CASSANDRA-13396
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
 Project: Cassandra
  Issue Type: Bug
Reporter: Edward Capriolo
Assignee: Edward Capriolo


https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html





[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-03-31 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950929#comment-15950929
 ] 

Jason Brown commented on CASSANDRA-8457:


Rebased my branch on trunk as of 2017-Mar-28, and added in CASSANDRA-13018 and 
CASSANDRA-13324.

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each, incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.





[jira] [Commented] (CASSANDRA-13395) Expired rows without regular column data can crash upgradesstables

2017-03-31 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950928#comment-15950928
 ] 

Benjamin Lerer commented on CASSANDRA-13395:


I discussed this offline with [~slebresne]: the internal iterators do not accept 
empty rows for performance reasons.
Since we know that, except for indexes, the deleted cells are caused by the 
compaction of expired row markers, we can avoid the empty-row problem by treating 
those rows as expired ones. The only missing information is the original 
TTL, which we can simply replace with a fake one.

I pushed an initial version of the patch 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:13395-3.0].  

> Expired rows without regular column data can crash upgradesstables
> --
>
> Key: CASSANDRA-13395
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13395
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> In {{2.x}}, if an expired row is compacted, its row marker will be converted 
> into a {{DeletedCell}}. In {{3.0}}, when the row is read by {{LegacyLayout}}, 
> it will be converted into a row without {{PrimaryKeyLivenessInfo}}. If the row 
> does not contain any data for the regular columns, or if the table simply 
> has no regular columns, it will then be considered {{empty}}, which will 
> crash {{upgradesstables}} with the following error:
> {code}
> java.lang.AssertionError
>     at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:207)
>     at org.apache.cassandra.db.transform.BaseRows.applyOne(BaseRows.java:116)
>     at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:107)
>     at org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41)
>     at org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156)
>     at org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:147)
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
>     at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>     at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
>     at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:308)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$0(NamedThreadFactory.java:79)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> This problem is cause





[jira] [Created] (CASSANDRA-13395) Expired rows without regular column data can crash upgradesstables

2017-03-31 Thread Benjamin Lerer (JIRA)
Benjamin Lerer created CASSANDRA-13395:
--

 Summary: Expired rows without regular column data can crash upgradesstables
 Key: CASSANDRA-13395
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13395
 Project: Cassandra
  Issue Type: Bug
Reporter: Benjamin Lerer
Assignee: Benjamin Lerer


In {{2.x}}, if an expired row is compacted, its row marker will be converted into 
a {{DeletedCell}}. In {{3.0}}, when the row is read by {{LegacyLayout}}, it will 
be converted into a row without {{PrimaryKeyLivenessInfo}}. If the row does not 
contain any data for the regular columns, or if the table simply has no 
regular columns, it will then be considered {{empty}}, which will crash 
{{upgradesstables}} with the following error:
{code}
java.lang.AssertionError
    at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:207)
    at org.apache.cassandra.db.transform.BaseRows.applyOne(BaseRows.java:116)
    at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:107)
    at org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41)
    at org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156)
    at org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:147)
    at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
    at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
    at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:416)
    at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:308)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$0(NamedThreadFactory.java:79)
    at java.lang.Thread.run(Thread.java:745)
{code}
This problem is cause





[jira] [Commented] (CASSANDRA-13393) Invalid row cache size (in MB) is reported by JMX and NodeTool

2017-03-31 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950850#comment-15950850
 ] 

Robert Stupp commented on CASSANDRA-13393:
--

[~Fuud], would you mind providing a patch?

> Invalid row cache size (in MB) is reported by JMX and NodeTool
> --
>
> Key: CASSANDRA-13393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13393
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Fuud
>Priority: Minor
>
> Row Cache size is reported in entries but should be reported in bytes (as 
> KeyCache does).
> This happens because of an incorrect OHCProvider.OHCacheAdapter.weightedSize 
> method. Currently it returns the cache size but should return ohCache.memUsed().
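
The distinction the report draws can be sketched as follows. This is a minimal 
illustration with hypothetical stand-in types (FakeOffHeapCache and 
RowCacheAdapter), not the real OHC or Cassandra classes; only the method names 
weightedSize and memUsed are taken from the report. It assumes that size() 
counts entries while memUsed() counts bytes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an off-heap cache; not the real OHC API. */
final class FakeOffHeapCache {
    private final Map<String, byte[]> entries = new HashMap<>();
    private long bytesUsed;

    void put(String key, byte[] value) {
        byte[] previous = entries.put(key, value);
        if (previous != null) {
            bytesUsed -= previous.length; // replaced value no longer counts
        }
        bytesUsed += value.length;
    }

    /** Number of entries currently cached. */
    long size() {
        return entries.size();
    }

    /** Bytes held by the cached values. */
    long memUsed() {
        return bytesUsed;
    }
}

/** Hypothetical adapter mirroring the shape of the method named in the report. */
final class RowCacheAdapter {
    private final FakeOffHeapCache cache;

    RowCacheAdapter(FakeOffHeapCache cache) {
        this.cache = cache;
    }

    long entryCount() {
        return cache.size();
    }

    /** Reported behaviour was to return cache.size(); the suggestion is to return bytes. */
    long weightedSize() {
        return cache.memUsed();
    }
}
```

With two entries of 100 and 28 bytes, entryCount() yields 2 and weightedSize() 
yields 128, which is the kind of gap that makes an entry count look wrong when 
it is displayed as a size in MB.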





[jira] [Commented] (CASSANDRA-13229) dtest failure in topology_test.TestTopology.size_estimates_multidc_test

2017-03-31 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950774#comment-15950774
 ] 

Alex Petrov commented on CASSANDRA-13229:
-

bq. If there are more disks than vnode ranges, I think we should fall back to 
splitting vnodes over the available disks, otherwise we would leave some disks 
unused

But wouldn't this lead to the problems with disk blacklisting and resurrecting 
data that Paulo described in [CASSANDRA-6696]? If not, we could go with "try 
dividing with {{splitOwnedRangesNoPartialRanges}} first; if some disks are 
still unutilized, fall back to splitting vnode ranges".

> dtest failure in topology_test.TestTopology.size_estimates_multidc_test
> ---
>
> Key: CASSANDRA-13229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Sean McCarthy
>Assignee: Alex Petrov
>  Labels: dtest, test-failure
> Fix For: 4.0
>
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test
> {code}
> Standard Output
> Unexpected error in node1 log, error: 
> ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 - Exception in thread Thread[MemtablePostFlush:1,5,main]
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>   at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>   at org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) ~[main/:na]
>   at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) ~[main/:na]
>   at org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) ~[main/:na]
>   at org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) ~[main/:na]
>   at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) ~[main/:na]
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) ~[main/:na]
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) ~[main/:na]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>   at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) [main/:na]
>   at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown Source) [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Unexpected error in node1 log, error: 
> ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown Source) ~[na:na]
>   at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45]
>   at org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) ~[main/:na]
>   at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na]
>   at org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) ~[main/:na]
>   at org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) ~[main/:na]
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_45]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>   at 
> 

[jira] [Commented] (CASSANDRA-13229) dtest failure in topology_test.TestTopology.size_estimates_multidc_test

2017-03-31 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950698#comment-15950698
 ] 

Marcus Eriksson commented on CASSANDRA-13229:
-

bq. it's expected that a single vnode range should not span more than 1 disk 

If there are more disks than vnode ranges, I think we should fall back to 
splitting vnodes over the available disks, otherwise we would leave some disks 
unused. This should be very rare in real production scenarios. I would assume 
that this is less surprising to a user than the fact that a vnode was split 
over several disks? Maybe even do something like avoid splitting vnode ranges 
onto multiple ranges if there are fewer than 16 tokens on the node (the reason 
being that it is very hard to get a good balance). So, instead of checking if we 
run vnodes or single-token, we check if there are fewer than 16 tokens?

The optimisation we wanted to do in the future was to be able to take a bunch 
of vnode ranges offline if the disk backing the ranges fails, but not sure that 
is planned anymore.
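
The fallback being proposed can be sketched roughly as below. This is a 
simplified illustration, not the Splitter code from the stack trace: TokenRange 
and DiskSplitSketch are hypothetical names, tokens are plain longs, ranges are 
assumed sorted, non-empty and non-wrapping, and the whole-range case groups 
ranges by count rather than by width to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

final class DiskSplitSketch {
    /** Half-open token range [start, end); a stand-in, not Cassandra's range type. */
    static final class TokenRange {
        final long start;
        final long end;

        TokenRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long width() {
            return end - start;
        }
    }

    /** Returns one list of ranges per disk; assumes disks >= 1. */
    static List<List<TokenRange>> assign(List<TokenRange> ranges, int disks) {
        List<List<TokenRange>> perDisk = new ArrayList<>();
        for (int i = 0; i < disks; i++) {
            perDisk.add(new ArrayList<>());
        }
        int n = ranges.size();
        if (n >= disks) {
            // Enough ranges: keep each one whole and hand out contiguous groups.
            for (int i = 0; i < n; i++) {
                perDisk.get((int) ((long) i * disks / n)).add(ranges.get(i));
            }
            return perDisk;
        }
        // Fewer ranges than disks: cut ranges so that disks are not left unused.
        long total = 0;
        for (TokenRange r : ranges) {
            total += r.width();
        }
        long share = Math.max(1, total / disks); // tokens per disk, at least one
        int disk = 0;
        long room = share; // tokens the current disk can still take
        for (TokenRange r : ranges) {
            long pos = r.start;
            while (pos < r.end) {
                boolean last = disk == disks - 1;
                // The last disk absorbs whatever remains, including rounding leftovers.
                long take = last ? r.end - pos : Math.min(room, r.end - pos);
                perDisk.get(disk).add(new TokenRange(pos, pos + take));
                pos += take;
                room -= take;
                if (!last && room == 0) {
                    disk++;
                    room = share;
                }
            }
        }
        return perDisk;
    }
}
```

A threshold such as the 16 tokens mentioned above would only change the 
condition that selects between the two branches.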

> dtest failure in topology_test.TestTopology.size_estimates_multidc_test
> ---
>
> Key: CASSANDRA-13229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Sean McCarthy
>Assignee: Alex Petrov
>  Labels: dtest, test-failure
> Fix For: 4.0
>
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test
> {code}
> Standard Output
> Unexpected error in node1 log, error: 
> ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 - Exception in thread Thread[MemtablePostFlush:1,5,main]
> java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45]
>   at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45]
>   at org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) ~[main/:na]
>   at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) ~[main/:na]
>   at org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) ~[main/:na]
>   at org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) ~[main/:na]
>   at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) ~[main/:na]
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) ~[main/:na]
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) ~[main/:na]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>   at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) [main/:na]
>   at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown Source) [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Unexpected error in node1 log, error: 
> ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>   at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown Source) ~[na:na]
>   at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45]
>   at org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) ~[main/:na]
>   at org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) ~[main/:na]
>   at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na]
>   at org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) ~[main/:na]
>   at org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) ~[main/:na]
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>   at 
> 

[jira] [Commented] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file

2017-03-31 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950594#comment-15950594
 ] 

Marcus Eriksson commented on CASSANDRA-13321:
-

Pushed a commit to the repo above that handles having several versions of the 
statistics file. On startup we find the latest version and use that. When we 
mutate the level/repaired info, we create a new file and then point the 
sstable reader to that new file.

I have not written many tests for this yet; I wanted to get some early feedback 
on the approach first. What do you think, [~jasobrown]?

> Add a checksum component for the sstable metadata (-Statistics.db) file
> ---
>
> Key: CASSANDRA-13321
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13321
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> Since we now keep important information in the sstable metadata file, we 
> should add a checksum component for it. One danger is that if a bit gets 
> flipped in repairedAt, we could consider the sstable repaired when it is not.
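
The risk described above can be illustrated with a minimal sketch. It assumes a plain CRC32 over the serialized metadata bytes; the class and method names are hypothetical and are not taken from the patch, and a real component would read the bytes from the -Statistics.db file rather than from an in-memory array.

```java
import java.util.zip.CRC32;

// Hypothetical sketch: not the checksum component from CASSANDRA-13321.
final class StatsChecksumSketch {
    // Computes a CRC32 over the full serialized metadata.
    static long crc32Of(byte[] metadata) {
        CRC32 crc = new CRC32();
        crc.update(metadata);
        return crc.getValue();
    }

    // Returns true when the metadata still matches the checksum stored alongside it.
    static boolean isIntact(byte[] metadata, long storedChecksum) {
        return crc32Of(metadata) == storedChecksum;
    }

    public static void main(String[] args) {
        // Stand-in for an 8-byte repairedAt field; 0 is used here to mean "unrepaired".
        byte[] metadata = new byte[8];
        long stored = crc32Of(metadata);

        // Simulate a single flipped bit, which turns the value into a non-zero timestamp.
        metadata[7] ^= 0x01;

        System.out.println("metadata intact: " + isIntact(metadata, stored));
    }
}
```

With a check like this on load, a flipped bit would surface as a checksum mismatch instead of silently changing the repaired status.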





[jira] [Assigned] (CASSANDRA-13151) Unicode unittest fail

2017-03-31 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-13151:
---

Assignee: Alex Petrov

> Unicode unittest fail
> -
>
> Key: CASSANDRA-13151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13151
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jay Zhuang
>Assignee: Alex Petrov
>
> The following unit tests fail on the 3.11 and trunk (4.0) branches:
> {noformat}
> SASIIndexTest.testUnicodeSupport
> StandardAnalyzerTest.testTokenizationJaJp1
> StandardAnalyzerTest.testTokenizationJaJp2
> StandardAnalyzerTest.testTokenizationRuRu1
> StandardAnalyzerTest.testTokenizationZnTw1
> {noformat}
> They pass on my local Mac but fail on a Linux server. I guess it's related to 
> the Unicode settings. Does anyone have any idea about that? (Could it be 
> related to CASSANDRA-11077 or CASSANDRA-11431?)
> Here are the failure details:
> {noformat}
> $ ant testsome -Dtest.name=org.apache.cassandra.index.sasi.SASIIndexTest 
> -Dtest.methods=testUnicodeSupport
> ...
> [junit] Testcase: 
> testUnicodeSupport(org.apache.cassandra.index.sasi.SASIIndexTest):
> FAILED
> [junit] []
> [junit] junit.framework.AssertionFailedError: []
> [junit] at 
> org.apache.cassandra.index.sasi.SASIIndexTest.testUnicodeSupport(SASIIndexTest.java:1159)
> [junit] at 
> org.apache.cassandra.index.sasi.SASIIndexTest.testUnicodeSupport(SASIIndexTest.java:1122)
> {noformat}
> {noformat}
> $ ant test -Dtest.name=StandardAnalyzerTest
> ...
> [junit] Testcase: 
> testTokenizationJaJp1(org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest):
>  FAILED
> [junit] expected:<210> but was:<0>
> [junit] junit.framework.AssertionFailedError: expected:<210> but was:<0>
> [junit] at 
> org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest.testTokenizationJaJp1(StandardAnalyzerTest.java:85)
> [junit]
> [junit]
> [junit] Testcase: 
> testTokenizationJaJp2(org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest):
>  FAILED
> [junit] expected:<57> but was:<9>
> [junit] junit.framework.AssertionFailedError: expected:<57> but was:<9>
> [junit] at 
> org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest.testTokenizationJaJp2(StandardAnalyzerTest.java:104)
> [junit]
> [junit]
> [junit] Testcase: 
> testTokenizationRuRu1(org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest):
>  FAILED
> [junit] expected:<456> but was:<0>
> [junit] junit.framework.AssertionFailedError: expected:<456> but was:<0>
> [junit] at 
> org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest.testTokenizationRuRu1(StandardAnalyzerTest.java:120)
> [junit]
> [junit]
> [junit] Testcase: 
> testTokenizationZnTw1(org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest):
>  FAILED
> [junit] expected:<963> but was:<0>
> [junit] junit.framework.AssertionFailedError: expected:<963> but was:<0>
> [junit] at 
> org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest.testTokenizationZnTw1(StandardAnalyzerTest.java:136)
> [junit]
> [junit]
> [junit] Test 
> org.apache.cassandra.index.sasi.analyzer.StandardAnalyzerTest FAILED
> {noformat}
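
One way a platform "Unicode setting" can change test results is through the JVM default charset, which is derived from the environment on some setups. The sketch below only illustrates that mechanism; it is an assumption that it is relevant here, since the report does not confirm the cause. The class name and sample string are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: shows how decoding the same bytes depends on the charset used.
final class DefaultCharsetSketch {
    // Decodes UTF-8 encoded bytes with the given charset.
    static String decode(byte[] utf8Bytes, Charset charset) {
        return new String(utf8Bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u3053\u3093\u306b\u3061\u306f"; // short Japanese sample
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // The default charset may differ between a Mac and a Linux server.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("round trip with default charset: "
                + original.equals(decode(utf8, Charset.defaultCharset())));
        System.out.println("round trip with explicit UTF-8: "
                + original.equals(decode(utf8, StandardCharsets.UTF_8)));
        System.out.println("round trip with US-ASCII: "
                + original.equals(decode(utf8, StandardCharsets.US_ASCII)));
    }
}
```

If the first round trip prints false on the failing machine, passing an explicit charset where the test data is read (or setting a UTF-8 locale for the test JVM) would be the thing to try.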





[jira] [Commented] (CASSANDRA-13327) Pending endpoints size check for CAS doesn't play nicely with writes-on-replacement

2017-03-31 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950523#comment-15950523
 ] 

Sylvain Lebresne commented on CASSANDRA-13327:
--

bq. so bootstrap can be resumed

Forgot we supported resuming now :)

bq. by making the read phase use an extended (RF + P + 1) / 2 quorum?

Reading from pending nodes is a very bad idea since by definition those nodes 
don't have up-to-date data.

Well, I guess things work the way they do for decently good reasons here. That 
said, thinking about it, it could be that the solution from CASSANDRA-8346 is a 
bit of a big hammer: I believe it's enough to ensure that we read from at least 
one replica that responded to PREPARE 'in the same Paxos round'. But we have 
timeouts on the Paxos round, so it may be possible to drastically reduce 
the time we consider a node pending for CAS so that it's not a real 
problem in practice. Something like having a pending node move to an "almost 
there" state before becoming a true replica, and staying in that state for 
basically the max time of a Paxos round, and then Paxos might be able to 
replace "pending" nodes by those "almost there" for PREPARE.

With that said, anything Paxos-related is pretty subtle, so I'm not saying this 
would work; one would have to look at the idea a lot more closely. Also, this 
probably wouldn't be a trivial change at all. And to be upfront, I'm unlikely 
to personally have cycles to devote to this in the short term.


> Pending endpoints size check for CAS doesn't play nicely with 
> writes-on-replacement
> ---
>
> Key: CASSANDRA-13327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13327
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>
> Consider this ring:
> 127.0.0.1  MR  UP    JOINING  -7301836195843364181
> 127.0.0.2  MR  UP    NORMAL   -7263405479023135948
> 127.0.0.3  MR  UP    NORMAL   -7205759403792793599
> 127.0.0.4  MR  DOWN  NORMAL   -7148113328562451251
> where 127.0.0.1 was bootstrapping for cluster expansion. Note that, due to 
> the failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and 
> making no progress.
> Then the down node was replaced so we had:
> 127.0.0.1  MR  UP  JOINING  -7301836195843364181
> 127.0.0.2  MR  UP  NORMAL   -7263405479023135948
> 127.0.0.3  MR  UP  NORMAL   -7205759403792793599
> 127.0.0.5  MR  UP  JOINING  -7148113328562451251
> It’s confusing in the ring - the first JOINING is a genuine bootstrap, the 
> second is a replacement. We now had CAS unavailables (but no non-CAS 
> unavailables). I think it’s because the pending endpoints check thinks that 
> 127.0.0.5 is gaining a range when it’s just replacing.
> The workaround is to kill the stuck JOINING node, but Cassandra shouldn’t 
> unnecessarily fail these requests.
> It also appears that the number of required participants is bumped by 1 
> during a host replacement, so if the replacing host fails you will get 
> unavailables and timeouts.
> This is related to the check added in CASSANDRA-8346.
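
The arithmetic behind the report can be sketched as follows. This is a simplified model under stated assumptions, not the Cassandra code: it assumes the required participants are computed as a majority of natural plus pending replicas, that CAS is refused when more than one endpoint is pending for the range, and that both JOINING nodes count as pending for the same range. The class and method names are hypothetical.

```java
// Hypothetical model of the pending-endpoints check; not the actual implementation.
final class PaxosParticipantsSketch {
    // Assumed rule: majority of (natural replicas + pending replicas).
    static int requiredParticipants(int naturalReplicas, int pendingReplicas) {
        return (naturalReplicas + pendingReplicas) / 2 + 1;
    }

    // Assumed rule: CAS is refused when more than one endpoint is pending.
    static boolean rejectsCas(int pendingReplicas) {
        return pendingReplicas > 1;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor for the example

        // No pending nodes: a plain majority of the natural replicas.
        System.out.println("no pending:  required=" + requiredParticipants(rf, 0));

        // One replacing node counted as pending: the requirement grows by one.
        System.out.println("one pending: required=" + requiredParticipants(rf, 1));

        // Stuck bootstrap plus replacement both counted as pending: CAS is refused.
        System.out.println("two pending: rejected=" + rejectsCas(2));
    }
}
```

Under these assumptions, a replacement that gains no new range still raises the participant requirement from 2 to 3 for RF 3, and a second pending node makes every CAS request on that range unavailable, which matches the behavior described in the report.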





[jira] [Updated] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-31 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13392:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

OK, let's leave it as it is.

> Repaired status should be cleared on new sstables when issuing nodetool 
> refresh
> ---
>
> Key: CASSANDRA-13392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13392
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We can't assume that new sstables added when doing nodetool refresh 
> (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the 
> repairedAt flag set


