[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-07-05 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Fix Version/s: (was: 3.11.3)
   (was: 3.0.17)

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is that, in our deployment, one slow but alive 
> data node can slow down the whole cluster and even cause our requests to 
> time out. 
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local 
> data node's latency is too high.
> I added some debug logging and found that in many cases the score of the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data 
> node can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try a remote data node if the local one is slow. 
> I tested it in our test cluster, and it significantly improved client 
> latency in the single slow data node case.  
> I flag this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  
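The effect described in the issue can be illustrated with a small model. This is a minimal sketch, not the actual DynamicEndpointSnitch code: the class and method names are hypothetical, and the positional comparison against (1 + badness_threshold) is an assumption about how the badness check behaves, inferred from the description and logs above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class BadnessSketch {
    /**
     * Returns true if the score-based ordering should replace the
     * proximity (subsnitch) ordering. Hypothetical helper for illustration.
     */
    static boolean shouldResort(List<String> proximityOrder,
                                Map<String, Double> scores,
                                double badnessThreshold,
                                boolean defaultToZero) {
        // Collect scores in proximity order; hosts without a score are
        // either skipped (old behaviour) or treated as 0.0 (proposed).
        List<Double> inProximityOrder = new ArrayList<>();
        for (String host : proximityOrder) {
            Double score = scores.get(host);
            if (score == null) {
                if (!defaultToZero)
                    continue;
                score = 0.0;
            }
            inProximityOrder.add(score);
        }

        // Compare each score, position by position, with the best
        // achievable (sorted) score at that position.
        List<Double> sorted = new ArrayList<>(inProximityOrder);
        Collections.sort(sorted);
        for (int i = 0; i < sorted.size(); i++) {
            if (inProximityOrder.get(i) > sorted.get(i) * (1.0 + badnessThreshold))
                return true;
        }
        return false;
    }
}
```

With scores {ip1=1.0} and no score for ip2, the skipped-host variant compares [1.0] only with itself and never re-sorts, which matches the "with scores [1.0]" log lines; with a default of zero the comparison becomes [1.0, 0.0] against [0.0, 1.0] and the fallback triggers.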



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-9067) BloomFilter serialization format should not change byte ordering

2018-01-18 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331592#comment-16331592
 ] 

Jay Zhuang commented on CASSANDRA-9067:
---

Good points. Thanks [~jasobrown] for the review.

Removed the {{oldBfFormat}} serialization and moved it to unittest.
Removed {{serialize()}} and {{deserialize()}} in {{FilterFactory}}, updated in 
the same branch:
| Branch | uTest |
| [9067|https://github.com/cooldoger/cassandra/tree/9067] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/9067.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/9067]
 |

> BloomFilter serialization format should not change byte ordering
> 
>
> Key: CASSANDRA-9067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9067
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.x
>
>
> As a follow-up to CASSANDRA-9066 and CASSANDRA-9060, it appears we do some 
> unnecessary byte swapping during the serialization of bloom filters, which 
> makes the logic slower and harder to follow. We should either perform them 
> more efficiently (using Long.reverseBytes) or, preferably, eliminate the 
> conversion altogether since it does not appear to serve any purpose.
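For reference, a minimal sketch of what a manual byte swap does compared with Long.reverseBytes. The class and method names are hypothetical and this is not the BloomFilter serialization code; it only shows that the two produce the same result.

```java
class ByteSwapSketch {
    /** Reverses the byte order of a long one byte at a time. */
    static long swapManually(long value) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            // Move the lowest remaining byte of 'value' to the low end of 'result'.
            result = (result << 8) | (value & 0xFF);
            value >>>= 8;
        }
        return result;
    }

    /** The same operation using the JDK method, typically compiled to a single instruction. */
    static long swapWithJdk(long value) {
        return Long.reverseBytes(value);
    }
}
```

Applying either method twice returns the original value, so a format that swaps on write and swaps back on read gains nothing from the conversion.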






[jira] [Comment Edited] (CASSANDRA-14140) Add unittest for Schema migration change (CASSANDRA-14109)

2018-01-18 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331236#comment-16331236
 ] 

Jay Zhuang edited comment on CASSANDRA-14140 at 1/19/18 2:11 AM:
-

Thanks [~jasobrown] for the review. The JIRA title is updated.

I forgot to add user schema in the schema migration test, so the schema is 
always empty. The branch is updated:

| Branch | uTest |
| [14140-3.11|https://github.com/cooldoger/cassandra/tree/14140-3.11] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14140-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14140-3.11]
 |


was (Author: jay.zhuang):
Thanks [~jasobrown] for the review. The JIRA title is updated.

I forgot to add user schema in the schema migration test, so the schema is 
always empty. The branch is updated.

> Add unittest for Schema migration change (CASSANDRA-14109)
> --
>
> Key: CASSANDRA-14140
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14140
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: testing
>
> It's a fairly big change, so it would be better to have a few unit tests.






[jira] [Commented] (CASSANDRA-14061) trunk eclipse-warnings

2018-03-07 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390459#comment-16390459
 ] 

Jay Zhuang commented on CASSANDRA-14061:


CASSANDRA-14296 would fix the new ones.

The warnings in {{SSTableIdentityIterator.java}} cannot be reproduced 
consistently: [Cassandra-trunk-test-all: 
499|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test-all/498/console]
 vs. [Cassandra-trunk-test-all: 
499|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test-all/498/console].

Anyway, I rebased onto trunk to suppress the warnings:
| Branch | uTest |
| [14061|https://github.com/cooldoger/cassandra/tree/14061] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14061.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14061]
 |


> trunk eclipse-warnings
> --
>
> Key: CASSANDRA-14061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> {noformat}
> eclipse-warnings:
> [mkdir] Created dir: /home/ubuntu/cassandra/build/ecj
>  [echo] Running Eclipse Code Analysis.  Output logged to 
> /home/ubuntu/cassandra/build/ecj/eclipse_compiler_checks.txt
>  [java] --
>  [java] 1. ERROR in 
> /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
>  (at line 59)
>  [java]   return new SSTableIdentityIterator(sstable, key, 
> partitionLevelDeletion, file.getPath(), iterator);
>  [java]   
> ^^^
>  [java] Potential resource leak: 'iterator' may not be closed at this 
> location
>  [java] --
>  [java] 2. ERROR in 
> /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
>  (at line 79)
>  [java]   return new SSTableIdentityIterator(sstable, key, 
> partitionLevelDeletion, dfile.getPath(), iterator);
>  [java]   
> 
>  [java] Potential resource leak: 'iterator' may not be closed at this 
> location
>  [java] --
>  [java] 2 problems (2 errors)
> {noformat}
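The warning above fires when a method returns a new object that wraps a closeable resource. A minimal sketch of one common way to handle that pattern follows. All class names are hypothetical stand-ins, and this is an assumption about the general technique, not the change made in the branch.

```java
class ResourceHandoffSketch {
    /** Stand-in for the underlying iterator that must eventually be closed. */
    static final class Source implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Stand-in for the wrapping iterator; it owns the Source once constructed. */
    static final class Wrapper implements AutoCloseable {
        private final Source source;

        Wrapper(Source source, boolean failInConstructor) {
            if (failInConstructor)
                throw new IllegalStateException("simulated construction failure");
            this.source = source;
        }

        @Override
        public void close() {
            source.close();
        }
    }

    @SuppressWarnings("resource") // ownership of 'source' passes to the returned Wrapper
    static Wrapper create(Source source, boolean failInConstructor) {
        try {
            return new Wrapper(source, failInConstructor);
        } catch (RuntimeException e) {
            source.close(); // nothing else will close it if construction failed
            throw e;
        }
    }
}
```

The suppression only documents the hand-off; the catch block is what keeps the source from leaking when the wrapper cannot be built.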






[jira] [Updated] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-11 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14308:
---
Description: 
If the JVM crashes while compaction is in progress, the incomplete SSTable won't 
be cleaned up, which causes a "{{Stats component is missing for sstable}}" error 
in the startup log:
{noformat}
ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
Exception in thread Thread[SSTableBatchOpen:3,5,main]
java.lang.AssertionError: Stats component is missing for sstable 
/cassandra/data/keyspace/table-id/mc-12345-big
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.jar:3.0.14]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{noformat}

The accumulated incomplete SSTables can take up a lot of space, especially with 
STCS, which can have very large SSTables.
Here is the script we use to delete the SSTables after a node is restarted:
{noformat}
grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' > ~/invalid_sstables
for ss in $(cat ~/invalid_sstables); do
  echo "== $ss"
  ls -l "$ss"*
  sudo rm "$ss"*
done
{noformat}
I suggest removing these incomplete SSTables during startup.
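
The log-scanning step of the script can also be sketched in Java as a dry run that only lists the affected SSTable prefixes. This is a hypothetical helper, not part of Cassandra; it assumes each error appears on a single log line in the form shown in the stack trace above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InvalidSSTableSketch {
    // Captures the path prefix that follows the error message.
    private static final Pattern MISSING_STATS =
        Pattern.compile("Stats component is missing for sstable (\\S+)");

    /** Returns the distinct SSTable path prefixes reported in the log, in order of appearance. */
    static List<String> findPrefixes(List<String> logLines) {
        Set<String> prefixes = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher matcher = MISSING_STATS.matcher(line);
            if (matcher.find())
                prefixes.add(matcher.group(1));
        }
        return new ArrayList<>(prefixes);
    }
}
```

Listing the prefixes before deleting anything makes it possible to review which files would be removed.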


> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>

[jira] [Updated] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-11 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14308:
---
Description: 
If the JVM crashes while compaction is in progress, the incomplete SSTable won't 
be cleaned up, which causes a "{{Stats component is missing for sstable}}" error 
in the startup log:
{noformat}
ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
Exception in thread Thread[SSTableBatchOpen:3,5,main]
java.lang.AssertionError: Stats component is missing for sstable 
/cassandra/data/keyspace/table-id/mc-12345-big
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.jar:3.0.14]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{noformat}

The accumulated incomplete SSTables can take up a lot of space, especially with 
STCS which can have very large SSTables.
Here is the script we use to delete the SSTables after a node is restarted:
{noformat}
grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' > ~/invalid_sstables
for ss in $(cat ~/invalid_sstables); do
  echo "== $ss"
  ls -l "$ss"*
  sudo rm "$ss"*
done
{noformat}
I suggest removing these incomplete SSTables during startup.


> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>

[jira] [Updated] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-11 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14308:
---
Description: 
If the JVM crashes while compaction is in progress, the incomplete SSTable won't 
be cleaned up, which causes a "{{Stats component is missing for sstable}}" error 
in the start up log:
{noformat}
ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
Exception in thread Thread[SSTableBatchOpen:3,5,main]
java.lang.AssertionError: Stats component is missing for sstable 
/cassandra/data/keyspace/table-id/mc-12345-big
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.jar:3.0.14]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{noformat}

The accumulated incomplete SSTables can take up a lot of space, especially with 
STCS, which can have very large SSTables.
Here is the script we use to delete the SSTables after a node is restarted:
{noformat}
grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' > ~/invalid_sstables
for ss in $(cat ~/invalid_sstables); do
  echo "== $ss"
  ls -l "$ss"*
  sudo rm "$ss"*
done
{noformat}
I suggest removing these incomplete SSTables during startup.


> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>

[jira] [Created] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-11 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14308:
--

 Summary: Remove invalid SSTables from interrupted compaction
 Key: CASSANDRA-14308
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
Reporter: Jay Zhuang
Assignee: Jay Zhuang


If the JVM crashes while compaction is in progress, the incomplete SSTable won't 
be cleaned up, which causes a {{Stats component is missing for sstable}} error in 
the start up log:
{noformat}
ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
Exception in thread Thread[SSTableBatchOpen:3,5,main]
java.lang.AssertionError: Stats component is missing for sstable 
/cassandra/data/keyspace/table-id/mc-12345-big
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at 
org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
 ~[apache-cassandra-3.0.14.jar:3.0.14]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.jar:3.0.14]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{noformat}

The accumulated incomplete SSTables can take up a lot of space, especially with 
STCS, which can have very large SSTables.
Here is the script we use to delete the SSTables after a node is restarted:
{noformat}
grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' > ~/invalid_sstables
for ss in $(cat ~/invalid_sstables); do
  echo "== $ss"
  ls -l "$ss"*
  sudo rm "$ss"*
done
{noformat}
I suggest removing these incomplete SSTables during startup.






[jira] [Updated] (CASSANDRA-14061) trunk eclipse-warnings

2018-03-08 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14061:
---
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> trunk eclipse-warnings
> --
>
> Key: CASSANDRA-14061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>






[jira] [Commented] (CASSANDRA-14061) trunk eclipse-warnings

2018-03-08 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392113#comment-16392113
 ] 

Jay Zhuang commented on CASSANDRA-14061:


Thanks for the review.
Committed as 
[a7141e6|https://github.com/apache/cassandra/commit/a7141e6c9df03287567c22c76372e166fc83d18e].

> trunk eclipse-warnings
> --
>
> Key: CASSANDRA-14061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>






[jira] [Updated] (CASSANDRA-14061) trunk eclipse-warnings

2018-03-08 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14061:
---
Fix Version/s: 4.0

> trunk eclipse-warnings
> --
>
> Key: CASSANDRA-14061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.0
>
>






[jira] [Updated] (CASSANDRA-14379) Better handling of missing partition columns in system_schema.columns during startup

2018-04-12 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14379:
---
Status: Patch Available  (was: Open)

> Better handling of missing partition columns in system_schema.columns during 
> startup
> 
>
> Key: CASSANDRA-14379
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14379
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Distributed Metadata
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Major
>
> Follow-up to CASSANDRA-13180: during table deletion/creation, we saw one 
> table with partially deleted columns (no partition column, only a regular 
> column). It blocks the node from starting up:
> {noformat}
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.marshal.CompositeType.getInstance(CompositeType.java:103)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:308) 
> ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at org.apache.cassandra.config.CFMetaData.<init>(CFMetaData.java:288) 
> ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:363) 
> ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1028)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:987)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:945)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:922)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:910)
>  ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:138) 
> ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:128) 
> ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:241) 
> [apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  [apache-cassandra-3.0.14.x.jar:3.0.14.x]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
> [apache-cassandra-3.0.14.x.jar:3.0.14.x]
> {noformat}
> As the partition column is mandatory, it should throw 
> [{{MissingColumns}}|https://github.com/apache/cassandra/blob/60563f4e8910fb59af141fd24f1fc1f98f34f705/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L1351],
>  as in CASSANDRA-13180, so that the user is able to clean up the schema.
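
The request above, raising a descriptive error instead of tripping an assertion, could be sketched roughly as follows. This is only an illustration in Python, not the Cassandra code: `MissingColumnsError`, `ColumnRow`, and `check_partition_columns` are hypothetical names, and the rows stand in for what would be read from system_schema.columns.

```python
from dataclasses import dataclass
from typing import List


class MissingColumnsError(Exception):
    """Hypothetical stand-in for the MissingColumns exception named in the ticket."""


@dataclass(frozen=True)
class ColumnRow:
    """One hypothetical row of system_schema.columns."""
    name: str
    kind: str  # e.g. "partition_key", "clustering", "regular", "static"


def check_partition_columns(keyspace: str, table: str, rows: List[ColumnRow]) -> List[ColumnRow]:
    """Return the partition key columns, or raise a descriptive error if there are none."""
    partition_columns = [row for row in rows if row.kind == "partition_key"]
    if not partition_columns:
        # A named error tells the operator which table to clean up,
        # which a bare AssertionError does not.
        raise MissingColumnsError(
            f"No partition columns found for table {keyspace}.{table} in "
            "system_schema.columns; the schema may be only partially deleted."
        )
    return partition_columns
```

With such a check, a table that has only regular columns fails with a message naming the keyspace and table, which is the information needed to repair the schema.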






[jira] [Commented] (CASSANDRA-14379) Better handling of missing partition columns in system_schema.columns during startup

2018-04-12 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436378#comment-16436378
 ] 

Jay Zhuang commented on CASSANDRA-14379:


| Branch | uTest | dTest |
| [14379-3.0|https://github.com/cooldoger/cassandra/tree/14379-3.0] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14379-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14379-3.0] | [#522|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/522] |
| [14379-3.11|https://github.com/cooldoger/cassandra/tree/14379-3.11] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14379-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14379-3.11] | [#523|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/523] |
| [14379-trunk|https://github.com/cooldoger/cassandra/tree/14379-trunk] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14379-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14379-trunk] | [#524|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/524] |







[jira] [Commented] (CASSANDRA-14381) nodetool listsnapshots is missing snapshots

2018-04-12 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436463#comment-16436463
 ] 

Jay Zhuang commented on CASSANDRA-14381:


{{listsnapshots}} excludes the local system keyspaces ({{system}} and 
{{system_schema}}): 
[{{StorageService.java:3290}}|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/service/StorageService.java#L3290],
 but I am not sure why.
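
To illustrate the effect described in this comment, here is a minimal Python sketch of a listing that skips the local system keyspaces. It is not the StorageService code: `list_snapshots` and its `include_local_system` flag are hypothetical, and the input dictionary stands in for whatever snapshots exist on disk.

```python
from typing import Dict, List

# Keyspaces that the comment above says are excluded from the listing.
LOCAL_SYSTEM_KEYSPACES = frozenset({"system", "system_schema"})


def list_snapshots(
    snapshots_by_keyspace: Dict[str, List[str]],
    include_local_system: bool = False,
) -> Dict[str, List[str]]:
    """Return the snapshot tags per keyspace, optionally skipping local system keyspaces."""
    return {
        keyspace: sorted(tags)
        for keyspace, tags in snapshots_by_keyspace.items()
        if tags and (include_local_system or keyspace not in LOCAL_SYSTEM_KEYSPACES)
    }
```

With the filter in place, snapshots taken of the {{system}} keyspace exist on disk but never reach the output, which would match the report.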

> nodetool listsnapshots is missing snapshots
> ---
>
> Key: CASSANDRA-14381
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14381
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: MacOs 10.12.5
> Java 1.8.0_144
> Cassandra 3.11.2 (brew install)
>Reporter: Cyril Scetbon
>Priority: Major
>
> The output of *nodetool listsnapshots* is inconsistent with the snapshots 
> created:
> {code:java}
> $ nodetool listsnapshots
> Snapshot Details:
> There are no snapshots
> $ nodetool snapshot -t tag1 --table local system
> Requested creating snapshot(s) for [system] with snapshot name [tag1] and 
> options {skipFlush=false}
> Snapshot directory: tag1
> $ nodetool snapshot -t tag2 --table local system
> Requested creating snapshot(s) for [system] with snapshot name [tag2] and 
> options {skipFlush=false}
> Snapshot directory: tag2
> $ nodetool listsnapshots
> Snapshot Details:
> There are no snapshots
> $ ls 
> /usr/local/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/snapshots/
> tag1 tag2{code}
>  
>  
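
A quick way to cross-check the listing against the data directory, as the final `ls` in the report does by hand, might look like the sketch below. It assumes the directory layout visible in the report (`<data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/`); the function name `snapshots_on_disk` is hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def snapshots_on_disk(data_dir: Path) -> Dict[str, List[str]]:
    """Map 'keyspace/table-dir' to the snapshot tags found under its snapshots/ directory."""
    found: Dict[str, List[str]] = {}
    # Layout assumed from the report: <data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/
    for snapshots_dir in sorted(data_dir.glob("*/*/snapshots")):
        tags = sorted(entry.name for entry in snapshots_dir.iterdir() if entry.is_dir())
        if tags:
            table_dir = snapshots_dir.parent
            found[f"{table_dir.parent.name}/{table_dir.name}"] = tags
    return found
```

Pointing it at a data directory such as /usr/local/var/lib/cassandra/data would show which tags exist on disk regardless of what nodetool prints.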






[jira] [Updated] (CASSANDRA-14299) cqlsh: ssl setting not read from cqlshrc in 3.11

2018-04-13 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14299:
---
Reviewer: Jay Zhuang

> cqlsh: ssl setting not read from cqlshrc in 3.11 
> -
>
> Key: CASSANDRA-14299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14299
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Christian Becker
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 3.11.x, 4.x
>
>
> CASSANDRA-10458 added an option to read the {{--ssl}} flag from 
> cqlshrc; however, the commit seems to have been merged incorrectly, or the 
> changes were dropped somehow.
> Currently, adding the following has no effect:
> {code:java}
> [connection]
> ssl = true{code}
> When looking at the current tree it's obvious that the flag is not read: 
> [https://github.com/apache/cassandra/blame/cassandra-3.11/bin/cqlsh.py#L2247]
> However it should have been added with 
> [https://github.com/apache/cassandra/commit/70649a8d65825144fcdbde136d9b6354ef1fb911]
> Values such as {{DEFAULT_SSL = False}} are present, but the 
> {{option_with_default()}} call is missing.
> Git blame also shows no change to that line that would have reverted the 
> change.
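
For illustration, reading such a setting with a default could look like the sketch below. It uses Python's standard `configparser` rather than cqlsh's own `option_with_default()` helper, whose exact signature is not shown in this thread; `read_ssl_option` is a hypothetical name, and only `DEFAULT_SSL` is taken from the report.

```python
import configparser

DEFAULT_SSL = False  # default named in the report


def read_ssl_option(cqlshrc_text: str) -> bool:
    """Return the [connection] ssl setting, falling back to DEFAULT_SSL."""
    parser = configparser.ConfigParser()
    parser.read_string(cqlshrc_text)
    # The fallback covers both a missing [connection] section and a missing ssl key.
    return parser.getboolean("connection", "ssl", fallback=DEFAULT_SSL)
```

If the lookup is never performed, the default is always used, which is consistent with the setting having no effect.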






[jira] [Commented] (CASSANDRA-14299) cqlsh: ssl setting not read from cqlshrc in 3.11

2018-04-13 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437857#comment-16437857
 ] 

Jay Zhuang commented on CASSANDRA-14299:


+1







[jira] [Commented] (CASSANDRA-14381) nodetool listsnapshots is missing snapshots

2018-04-13 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437673#comment-16437673
 ] 

Jay Zhuang commented on CASSANDRA-14381:


[~cscetbon], are you interested in putting together a quick patch for it? I think it 
should be {{trunk}} only, as it's just a nodetool display problem and we don't 
want to change the behavior for {{3.0}} and {{3.11}}.







[jira] [Assigned] (CASSANDRA-14370) Reduce level of log from debug to trace in CommitLogSegmentManager.java

2018-04-09 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14370:
--

Assignee: Nicolas Guyomar

> Reduce level of log from debug to trace in CommitLogSegmentManager.java  
> -
>
> Key: CASSANDRA-14370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14370
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jay Zhuang
>Assignee: Nicolas Guyomar
>Priority: Trivial
>  Labels: lhf
>
> [{{AbstractCommitLogSegmentManager.java:112}}|https://github.com/apache/cassandra/blob/2402acd47e3bb514981cde742b7330666c564869/src/java/org/apache/cassandra/db/commitlog/AbstractCommitLogSegmentManager.java#L112]
> It was changed to {{trace()}} in cassandra-3.0 with 
> CASSANDRA-10241: https://github.com/pauloricardomg/cassandra/commit/3ef1b18fa76dce7cd65b73977fc30e51301f3fed#diff-d07279710c482983e537aed26df80400
> but not in cassandra-3.11 or trunk. I think it makes sense to make them 
> consistent and downgrade to {{trace()}}.
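
The practical effect of such a downgrade can be shown with a small Python sketch. Python's `logging` module has no built-in TRACE level, so the `TRACE` constant, the handler, the logger names, and the messages here are all hypothetical; the point is only that a message moved below DEBUG disappears from a DEBUG-level log.

```python
import logging

TRACE = 5  # hypothetical level below DEBUG (10)
logging.addLevelName(TRACE, "TRACE")


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__(level=TRACE)
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def emitted_messages(logger_level: int) -> list:
    """Log one DEBUG and one TRACE message and return what was actually emitted."""
    logger = logging.getLogger(f"commitlog.sketch.{logger_level}")
    logger.propagate = False
    logger.setLevel(logger_level)
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        logger.debug("occasional message")
        logger.log(TRACE, "frequent message")
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

Calling `emitted_messages(logging.DEBUG)` yields only the DEBUG message, while `emitted_messages(TRACE)` yields both.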






[jira] [Commented] (CASSANDRA-14370) Reduce level of log from debug to trace in CommitLogSegmentManager.java

2018-04-09 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431412#comment-16431412
 ] 

Jay Zhuang commented on CASSANDRA-14370:


+1

Thanks [~nicolas.guyomar] for the patch, committed as 
[{{b3e9908}}|https://github.com/apache/cassandra/commit/b3e99085a5c34754fbbc2350f0e69c1691b06a11]

> Reduce level of log from debug to trace in CommitLogSegmentManager.java  
> -
>
> Key: CASSANDRA-14370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14370
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jay Zhuang
>Assignee: Nicolas Guyomar
>Priority: Trivial
>  Labels: lhf
> Fix For: 4.0, 3.11.3
>
>
> [{{AbstractCommitLogSegmentManager.java:112}}|https://github.com/apache/cassandra/blob/2402acd47e3bb514981cde742b7330666c564869/src/java/org/apache/cassandra/db/commitlog/AbstractCommitLogSegmentManager.java#L112]
> It was changed to {{trace()}} in cassandra-3.0 with 
> CASSANDRA-10241: https://github.com/pauloricardomg/cassandra/commit/3ef1b18fa76dce7cd65b73977fc30e51301f3fed#diff-d07279710c482983e537aed26df80400
> but not in cassandra-3.11 or trunk. I think it makes sense to make them 
> consistent and downgrade to {{trace()}}.






[jira] [Updated] (CASSANDRA-14370) Reduce level of log from debug to trace in CommitLogSegmentManager.java

2018-04-09 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14370:
---
   Resolution: Fixed
Fix Version/s: 3.11.3
   4.0
   Status: Resolved  (was: Patch Available)







[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2018-04-09 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431515#comment-16431515
 ] 

Jay Zhuang commented on CASSANDRA-13696:


Hi [~vinegh], this should be a different issue: 
[{{HintsDispatcher.java:128}}|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L128]
 sends hints as {{buffer}}s, whereas this patch only fixes the digest mismatch 
for 
[{{HintsDispatcher.java:129}}|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L129],
 which sends hints one by one.

Do you see the digest mismatch on one node or on multiple nodes? Was there a 
schema change at the time? It may be best to file a separate ticket for it.

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
> ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the steps to reproduce on a 3-node cluster, RF=3:
> 1. stop node1
> 2. send some data with quorum (or one); this generates hints files on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster: when it happens, almost all the nodes go down at 
> the same time, and we have to remove all the hints files (which contain the 
> dropped table) to bring the nodes back.
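
As background on what a digest mismatch means in general, the Python sketch below frames each record as a length, a payload, and a CRC32 of the payload, and rejects any record whose stored checksum does not match. This framing is an assumption made for illustration and is not Cassandra's on-disk hints format; `encode_record`, `read_records`, and `DigestMismatchError` are hypothetical names.

```python
import struct
import zlib
from typing import Iterator


class DigestMismatchError(IOError):
    """Raised when a record's stored checksum does not match its payload."""


def encode_record(payload: bytes) -> bytes:
    """Frame a payload as: 4-byte length, payload, 4-byte CRC32 of the payload."""
    return struct.pack(">I", len(payload)) + payload + struct.pack(">I", zlib.crc32(payload))


def read_records(data: bytes) -> Iterator[bytes]:
    """Yield each payload in turn, verifying its checksum before handing it out."""
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        payload = data[offset:offset + length]
        offset += length
        (stored,) = struct.unpack_from(">I", data, offset)
        offset += 4
        if zlib.crc32(payload) != stored:
            raise DigestMismatchError("Digest mismatch exception")
        yield payload
```

In a scheme like this, a reader that skips a record without consuming exactly its bytes would misread the records that follow and report a mismatch even though the file itself is intact.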




[jira] [Updated] (CASSANDRA-14370) Reduce level of log from debug to trace in CommitLogSegmentManager.java

2018-04-07 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14370:
---
Labels: lhf  (was: )

> Reduce level of log from debug to trace in CommitLogSegmentManager.java  
> -
>
> Key: CASSANDRA-14370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14370
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jay Zhuang
>Priority: Trivial
>  Labels: lhf
>
> [{{AbstractCommitLogSegmentManager.java:112}}|https://github.com/apache/cassandra/blob/2402acd47e3bb514981cde742b7330666c564869/src/java/org/apache/cassandra/db/commitlog/AbstractCommitLogSegmentManager.java#L112]
> It was changed to {{trace()}} in cassandra-3.0 with 
> CASSANDRA-10241: https://github.com/pauloricardomg/cassandra/commit/3ef1b18fa76dce7cd65b73977fc30e51301f3fed#diff-d07279710c482983e537aed26df80400
> but not in cassandra-3.11 or trunk. I think it makes sense to make them 
> consistent and downgrade to {{trace()}}.






[jira] [Created] (CASSANDRA-14370) Reduce level of log from debug to trace in CommitLogSegmentManager.java

2018-04-07 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14370:
--

 Summary: Reduce level of log from debug to trace in 
CommitLogSegmentManager.java  
 Key: CASSANDRA-14370
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14370
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jay Zhuang


[{{AbstractCommitLogSegmentManager.java:112}}|https://github.com/apache/cassandra/blob/2402acd47e3bb514981cde742b7330666c564869/src/java/org/apache/cassandra/db/commitlog/AbstractCommitLogSegmentManager.java#L112]
It was changed to {{trace()}} in cassandra-3.0 with 
CASSANDRA-10241: https://github.com/pauloricardomg/cassandra/commit/3ef1b18fa76dce7cd65b73977fc30e51301f3fed#diff-d07279710c482983e537aed26df80400
but not in cassandra-3.11 or trunk. I think it makes sense to make them 
consistent and downgrade to {{trace()}}.






[jira] [Updated] (CASSANDRA-14375) Digest mismatch Exception when sending raw hints in cluster

2018-04-12 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14375:
---
Description: 
We have a 14-node cluster where we have seen hints files getting corrupted, 
resulting in the following error:
{noformat}
ERROR [HintsDispatcher:1] 2018-04-06 16:26:44,423 CassandraDaemon.java:228 - 
Exception in thread Thread[HintsDispatcher:1,1,main]
 org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:298)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:263)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:169) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:128)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:113) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:94) 
~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:278)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:260)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:238)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:217)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_141]
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_141]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[na:1.8.0_141]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_141]
 at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
 [apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]
 Caused by: java.io.IOException: Digest mismatch exception
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:315)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:289)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 ... 16 common frames omitted
{noformat}
Notes on the cluster and the investigation so far:
1. The Cassandra used here was built locally from the 3.11.1 branch, with the 
following patch from CASSANDRA-14080 applied:
 
[https://github.com/apache/cassandra/commit/68079e4b2ed4e58dbede70af45414b3d4214e195]
2. The 14 nodes are bootstrapped as follows:
 - Of the 14 nodes, only 3 are picked as seed nodes.
 - Only 1 of the 3 seed nodes is started, and the schema is created if it was 
not created previously.
 - After this, the rest of the nodes are bootstrapped.
 - In the failure scenario, only 5 of the 14 nodes successfully formed the 
Cassandra cluster. The failed nodes include two seed nodes.
3. We confirmed that the patch from CASSANDRA-13696 has been applied. Jay Zhuang 
confirmed that this is a different issue from the one that was fixed 
previously:
"this should be a different issue, as HintsDispatcher.java:128 sends hints with 
\{{buffer}}s, this patch is only to fix the digest mismatch for 
HintsDispatcher.java:129, which sends hints one by one."
4. The application uses the Java driver with the quorum setting for Cassandra.
5. We saw this issue on a 7-node cluster too (separate from the 14-node cluster).
6. We are able to work around it by running nodetool truncatehints on the failed 
nodes and restarting Cassandra.

  was:
We have 14 nodes cluster where we seen hints file getting corrupted and 
resulting in the following error

ERROR [HintsDispatcher:1] 2018-04-06 16:26:44,423 CassandraDaemon.java:228 - 
Exception in thread Thread[HintsDispatcher:1,1,main]
 org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:298)
 ~[apache-cassandra-3.11.1.jar:3.11.1-SNAPSHOT]
 at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:263)
 

[jira] [Commented] (CASSANDRA-14381) nodetool listsnapshots is missing snapshots

2018-04-12 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436769#comment-16436769
 ] 

Jay Zhuang commented on CASSANDRA-14381:


The snapshot is taken, and the user is still able to restore data for system 
keyspaces; the snapshot is just missing from the {{listsnapshots}} output. Sometimes 
we do need to restore data for system tables, for example, if the schema is 
corrupted: 
[{{SchemaKeyspace.java:951}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L951]
@[~aweisberg], @[~stefania_alborghetti], what do you think about listing 
system/system_schema snapshots?







[jira] [Assigned] (CASSANDRA-14298) cqlshlib tests broken on b.a.o

2018-04-12 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14298:
--

Assignee: Patrick Bannister

> cqlshlib tests broken on b.a.o
> --
>
> Key: CASSANDRA-14298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build, Testing
>Reporter: Stefan Podkowinski
>Assignee: Patrick Bannister
>Priority: Major
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 






[jira] [Commented] (CASSANDRA-14375) Digest mismatch Exception when sending raw hints in cluster

2018-04-12 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436714#comment-16436714
 ] 

Jay Zhuang commented on CASSANDRA-14375:


We saw the same issue in {{3.0.14}} twice in the last week:
{noformat}
ERROR [HintsDispatcher:1] 2018-04-10 23:43:47,930 
HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
d921cf74-c064-465d-82b4-aa964cb3b8f6-1523401451406-1.hints: file is corrupted 
({})
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:296)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:261)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:138)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_121]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [apache-cassandra-3.0.14.x.jar:3.0.14.x]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
Caused by: java.io.IOException: Digest mismatch exception
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:313)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:287)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
... 16 common frames omitted
ERROR [HintsDispatcher:1] 2018-04-10 23:43:47,931 CassandraDaemon.java:207 - 
Exception in thread Thread[HintsDispatcher:1,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
exception
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:296)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:261)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:138)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 

[jira] [Created] (CASSANDRA-14379) Better handling of missing partition columns in system_schema.columns during startup

2018-04-11 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14379:
--

 Summary: Better handling of missing partition columns in 
system_schema.columns during startup
 Key: CASSANDRA-14379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14379
 Project: Cassandra
  Issue Type: Improvement
  Components: Distributed Metadata
Reporter: Jay Zhuang
Assignee: Jay Zhuang


Follow-up to CASSANDRA-13180: during table deletion/creation, we saw one table 
with partially deleted columns (no partition column, only a regular column). 
This blocks the node from starting up:
{noformat}
java.lang.AssertionError: null
at 
org.apache.cassandra.db.marshal.CompositeType.getInstance(CompositeType.java:103)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:308) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.<init>(CFMetaData.java:288) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:363) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1028) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:987) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:945)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:922)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:910)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:138) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:128) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:241) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
{noformat}

Since the partition column is mandatory, it should throw 
[{{MissingColumns}}|https://github.com/apache/cassandra/blob/60563f4e8910fb59af141fd24f1fc1f98f34f705/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L1351],
 as in CASSANDRA-13180, so that the user is able to clean up the schema.
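
The requested behavior can be sketched as an explicit validation step before the table metadata is built. This is an illustrative Python model, not Cassandra's Java code: `MissingColumns` mirrors the exception name linked above, while `build_table`, the `kind` strings, and the row layout are assumptions made for the example.

```python
class MissingColumns(Exception):
    """Raised when a table's schema rows lack a mandatory column kind."""


def build_table(keyspace, table, column_rows):
    """Validate column rows and return a simple table description.

    column_rows is a list of (column_name, kind) tuples, where kind is
    'partition_key', 'clustering', or 'regular' (hypothetical labels).
    """
    partition_keys = [name for name, kind in column_rows if kind == "partition_key"]
    if not partition_keys:
        # Fail with a descriptive error instead of a bare AssertionError,
        # so the operator knows which table's schema needs cleaning up.
        raise MissingColumns(
            "No partition columns found for table %s.%s" % (keyspace, table)
        )
    return {"keyspace": keyspace, "table": table, "partition_keys": partition_keys}
```

The point of the design is that the error names the affected table, which a `java.lang.AssertionError: null` does not.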






[jira] [Commented] (CASSANDRA-14381) nodetool listsnapshots is missing snapshots

2018-04-17 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441171#comment-16441171
 ] 

Jay Zhuang commented on CASSANDRA-14381:


+1

LGTM. Is the circleci configuration change needed?

> nodetool listsnapshots is missing snapshots
> ---
>
> Key: CASSANDRA-14381
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14381
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: MacOs 10.12.5
> Java 1.8.0_144
> Cassandra 3.11.2 (brew install)
>Reporter: Cyril Scetbon
>Assignee: Ariel Weisberg
>Priority: Major
>
> The output of *nodetool listsnapshots* is inconsistent with the snapshots 
> that were created:
> {code:java}
> $ nodetool listsnapshots
> Snapshot Details:
> There are no snapshots
> $ nodetool snapshot -t tag1 --table local system
> Requested creating snapshot(s) for [system] with snapshot name [tag1] and 
> options {skipFlush=false}
> Snapshot directory: tag1
> $ nodetool snapshot -t tag2 --table local system
> Requested creating snapshot(s) for [system] with snapshot name [tag2] and 
> options {skipFlush=false}
> Snapshot directory: tag2
> $ nodetool listsnapshots
> Snapshot Details:
> There are no snapshots
> $ ls 
> /usr/local/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/snapshots/
> tag1 tag2{code}
>  
>  






[jira] [Updated] (CASSANDRA-12743) Assertion error while running compaction

2018-04-24 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-12743:
---
Reproduced In: 3.0.14, 2.2.7  (was: 3.0.14)
   Status: Patch Available  (was: Reopened)

> Assertion error while running compaction 
> -
>
> Key: CASSANDRA-12743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: unix
>Reporter: Jean-Baptiste Le Duigou
>Assignee: Jay Zhuang
>Priority: Major
>
> While running compaction I sometimes run into an error:
> {noformat}
> nodetool compact
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:463)
> at 
> org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
> at 
> org.apache.cassandra.io.util.CompressedSegmentedFile.createMappedSegments(CompressedSegmentedFile.java:80)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.<init>(CompressedPoolingSegmentedFile.java:38)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:101)
> at 
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:198)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:315)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:171)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:116)
> at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.append(DefaultCompactionWriter.java:64)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:184)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:599)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Why is that happening?
> Is there any way to get more details (e.g. which SSTable cannot be 
> compacted)?
> We are using Cassandra 2.2.7.






[jira] [Commented] (CASSANDRA-12743) Assertion error while running compaction

2018-04-24 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451469#comment-16451469
 ] 

Jay Zhuang commented on CASSANDRA-12743:


The problem is that 
[{{dataSyncPosition}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L64]
 holds the compressed file size (set here: 
[{{BigTableWriter.java:445}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L445]),
 whereas 
[{{lastReadableByData}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L58]
 holds the [uncompressed data 
size|https://github.com/apache/cassandra/blob/5dc55e715eba6667c388da9f8f1eb7a46489b35c/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L185]
 (see 
[{{IndexSummaryBuilder.java:222}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L222]).
So if the compression ratio is around or above {{1.0}} and [the index file is 
synced faster than the data 
file|https://github.com/apache/cassandra/blob/5dc55e715eba6667c388da9f8f1eb7a46489b35c/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L174],
 
[{{openEarly()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableWriter.java#L287]
 may open data that hasn't been synced yet.
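
The unit mismatch described above can be shown with a small model. This is a simplified sketch under stated assumptions, not the Cassandra code: `Entry`, `last_readable`, and `simulate` are hypothetical names, the "compression ratio" is taken as compressed size divided by uncompressed size, and the readable boundary is modeled as the last entry whose uncompressed end offset does not exceed the sync position.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    uncompressed_end: int  # end offset of the partition, in uncompressed bytes


def last_readable(entries, sync_position):
    """Return the last entry whose end offset is <= sync_position, or None."""
    readable = None
    for entry in entries:
        if entry.uncompressed_end <= sync_position:
            readable = entry
    return readable


def simulate(ratio):
    """Compare the boundary chosen with compressed vs. uncompressed positions."""
    # Three partitions of 100 uncompressed bytes each have been appended.
    entries = [Entry("a", 100), Entry("b", 200), Entry("c", 300)]
    # Only the first two partitions have been synced to disk.
    synced_uncompressed = 200
    # Mismatched boundary: the size of the compressed file at sync time.
    synced_compressed = int(synced_uncompressed * ratio)
    mismatched = last_readable(entries, synced_compressed)
    consistent = last_readable(entries, synced_uncompressed)
    return mismatched.key, consistent.key
```

With a ratio of 0.5 the mismatched comparison is merely conservative (it stops at "a" instead of "b"). With a ratio of 1.5 it selects "c", a partition that has not been synced, which is the situation in which an early open can read past the synced data.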

Here is the patch, please review:
| Branch | uTest | dTest |
| [12743-2.2|https://github.com/cooldoger/cassandra/tree/12743-2.2] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/12743-2.2.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/12743-2.2]
 | 
[#526|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/522]
| [12743-3.0|https://github.com/cooldoger/cassandra/tree/12743-3.0] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/12743-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/12743-3.0]
 | 
[#527|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/522]
| [12743-3.11|https://github.com/cooldoger/cassandra/tree/12743-3.11] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/12743-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/12743-3.11]
 | 
[#528|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/523]
| [12743-trunk|https://github.com/cooldoger/cassandra/tree/12743-trunk] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/12743-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/12743-trunk]
 | 
[#529|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/524]

> Assertion error while running compaction 
> -
>
> Key: CASSANDRA-12743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: unix
>Reporter: Jean-Baptiste Le Duigou
>Assignee: Jay Zhuang
>Priority: Major
>
> While running compaction I sometimes run into an error:
> {noformat}
> nodetool compact
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:463)
> at 
> org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
> at 
> org.apache.cassandra.io.util.CompressedSegmentedFile.createMappedSegments(CompressedSegmentedFile.java:80)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.<init>(CompressedPoolingSegmentedFile.java:38)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:101)
> at 
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:198)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:315)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:171)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:116)
> at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.append(DefaultCompactionWriter.java:64)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:184)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> 

[jira] [Assigned] (CASSANDRA-12743) Assertion error while running compaction

2018-04-24 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-12743:
--

Assignee: Jay Zhuang

> Assertion error while running compaction 
> -
>
> Key: CASSANDRA-12743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: unix
>Reporter: Jean-Baptiste Le Duigou
>Assignee: Jay Zhuang
>Priority: Major
>
> While running compaction I sometimes run into an error:
> {noformat}
> nodetool compact
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:463)
> at 
> org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
> at 
> org.apache.cassandra.io.util.CompressedSegmentedFile.createMappedSegments(CompressedSegmentedFile.java:80)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.<init>(CompressedPoolingSegmentedFile.java:38)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:101)
> at 
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:198)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:315)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:171)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:116)
> at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.append(DefaultCompactionWriter.java:64)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:184)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:599)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Why is that happening?
> Is there any way to get more details (e.g. which SSTable cannot be 
> compacted)?
> We are using Cassandra 2.2.7.






[jira] [Commented] (CASSANDRA-14349) Untracked CDC segment files are not deleted after replay

2018-03-30 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420955#comment-16420955
 ] 

Jay Zhuang commented on CASSANDRA-14349:


Hi [~shichao.an], nice find. It would be great to have a dTest for that: 
just restart the node a few times and check whether there are any orphaned commitlogs in 
the {{cdc_raw}} directory. Any non-active commitlog that doesn't have an idx file 
should be considered orphaned.

cc. [~JoshuaMcKenzie]
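
The orphan check suggested above could look roughly like the following helper. It is a sketch under assumptions, not an existing dtest utility: the function names are hypothetical, and the file naming (`<segment>.log` paired with `<segment>_cdc.idx`) is assumed rather than taken from this thread.

```python
import os
import tempfile


def find_orphaned_commitlogs(cdc_raw_dir, active_segments=()):
    """Return commitlog file names in cdc_raw_dir that have no idx file.

    active_segments lists file names that are still being written and
    therefore must not be reported as orphaned.
    """
    names = set(os.listdir(cdc_raw_dir))
    orphaned = []
    for name in sorted(names):
        if not name.endswith(".log") or name in active_segments:
            continue
        # Assumed naming convention for the CDC index file of a segment.
        idx_name = name[: -len(".log")] + "_cdc.idx"
        if idx_name not in names:
            orphaned.append(name)
    return orphaned


def demo():
    """Build a small cdc_raw directory and report its orphaned segments."""
    with tempfile.TemporaryDirectory() as cdc_raw:
        for name in (
            "CommitLog-7-1.log",      # tracked: has an idx file
            "CommitLog-7-1_cdc.idx",
            "CommitLog-7-2.log",      # orphaned: no idx file, not active
            "CommitLog-7-3.log",      # active segment, ignored
        ):
            open(os.path.join(cdc_raw, name), "w").close()
        return find_orphaned_commitlogs(cdc_raw, active_segments=("CommitLog-7-3.log",))
```

A dtest built on this idea would restart the node several times and assert that the returned list stays empty.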

> Untracked CDC segment files are not deleted after replay
> 
>
> Key: CASSANDRA-14349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14349
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Shichao An
>Assignee: Shichao An
>Priority: Minor
>
> When CDC is enabled, a hard link to each commit log file is created in the 
> cdc_raw directory. Commit logs that contain CDC mutations also get cdc 
> index files created along with the hard links; the consumer is expected to 
> process them and clean them up.
> However, if we don't produce any CDC traffic, the hard links in cdc_raw 
> are never cleaned up (because hard links are still created, just without 
> the index files), whereas the original commit logs are correctly deleted 
> after replay during process startup. This results in many untracked hard 
> links in cdc_raw if we restart the Cassandra process many times. I am able to 
> reproduce it with CCM on a trunk version that has the CASSANDRA-12148 
> changes.
> This seems to be a bug in handleReplayedSegment of the commit log segment manager, 
> which neglects to take care of CDC commit logs. I will attach a patch here.






[jira] [Assigned] (CASSANDRA-14349) Untracked CDC segment files are not deleted after replay

2018-03-28 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14349:
--

Assignee: Shichao An

> Untracked CDC segment files are not deleted after replay
> 
>
> Key: CASSANDRA-14349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14349
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Shichao An
>Assignee: Shichao An
>Priority: Minor
>
> When CDC is enabled, a hard link to each commit log file is created in the 
> cdc_raw directory. Commit logs that contain CDC mutations also get cdc 
> index files created along with the hard links; the consumer is expected to 
> process them and clean them up.
> However, if we don't produce any CDC traffic, the hard links in cdc_raw 
> are never cleaned up (because hard links are still created, just without 
> the index files), whereas the original commit logs are correctly deleted 
> after replay during process startup. This results in many untracked hard 
> links in cdc_raw if we restart the Cassandra process many times. I am able to 
> reproduce it with CCM on a trunk version that has the CASSANDRA-12148 
> changes.
> This seems to be a bug in handleReplayedSegment of the commit log segment manager, 
> which neglects to take care of CDC commit logs. I will attach a patch here.






[jira] [Updated] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-29 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13884:
---
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.x
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder name as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Updated] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-29 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13884:
---
Fix Version/s: (was: 4.x)
   4.0

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.0
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder name as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Commented] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-29 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419471#comment-16419471
 ] 

Jay Zhuang commented on CASSANDRA-13884:


Thanks [~chovatia.jayd...@gmail.com] for the fix.
Committed as 
[c22ee2b|https://github.com/apache/cassandra/commit/c22ee2bd451d030e99cfb65be839bbc735a5352f].

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.x
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder name as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Updated] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-27 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13884:
---
Status: Patch Available  (was: Open)

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.x
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder name as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Updated] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-27 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13884:
---
Status: Ready to Commit  (was: Patch Available)

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.x
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder name as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Commented] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-27 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416282#comment-16416282
 ] 

Jay Zhuang commented on CASSANDRA-13884:


+1

I added a check for empty target keyspace: [Check if target keyspace is empty 
and print error 
message|https://github.com/cooldoger/cassandra/commit/9536d99b86dd37984f056cc525618e9a2e391a75]
|dTest|
|[519|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/519/]
 |
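
The keyspace selection discussed in this ticket, including the empty-name check, can be modeled as follows. This is a hedged Python sketch, not the committed Java change: `resolve_keyspace` and the error message text are assumptions for illustration, and the path is assumed to end in `<keyspace>/<table>`.

```python
import os


def resolve_keyspace(sstable_dir, target_keyspace=None):
    """Return the keyspace that sstables in sstable_dir should be loaded into.

    sstable_dir is expected to look like .../<keyspace>/<table>.
    """
    if target_keyspace is not None:
        if not target_keyspace.strip():
            # Reject an empty override instead of silently falling back.
            raise ValueError("Target keyspace must not be empty")
        return target_keyspace
    # Default behavior: the parent folder of the table directory names the keyspace.
    table_dir = os.path.normpath(sstable_dir)
    return os.path.basename(os.path.dirname(table_dir))
```

For example, `resolve_keyspace("/backup/ks1/users")` yields `ks1`, while passing `target_keyspace="ks1_restore"` redirects the load without renaming any directories.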




[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Reviewer: Jay Zhuang

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is that, in our deployment, one slow but alive 
> data node can slow down the whole cluster and even cause our requests to 
> time out. 
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that, in a lot of cases, the score 
> for the remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow. 
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case. 
> I flag this as a Bug because it has caused problems for our use cases 
> multiple times.
> === logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_
> _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [0.0]_
> _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_
> _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_
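
The effect of a missing score can be illustrated with a simplified model. This is a sketch under stated assumptions, not the DynamicEndpointSnitch source: the class {{BadnessCheckSketch}} and its method are hypothetical, endpoints are plain strings, and the check is assumed to compare each score in subsnitch order against the sorted scores scaled by (1 + badness_threshold). The {{defaultToZero}} flag switches between skipping unscored endpoints and treating them as 0.0, which is the change proposed in the ticket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the badness check; not the Cassandra implementation.
final class BadnessCheckSketch {
    private BadnessCheckSketch() {}

    /** Returns true when the subsnitch ordering is bad enough to fall back to score-based ordering. */
    static boolean shouldSortByScore(List<String> subsnitchOrder, Map<String, Double> scores,
                                     double badnessThreshold, boolean defaultToZero) {
        List<Double> ordered = new ArrayList<>();
        for (String endpoint : subsnitchOrder) {
            Double score = scores.get(endpoint);
            if (score == null) {
                if (!defaultToZero) {
                    continue; // unscored endpoint is ignored, so it can never look better
                }
                score = 0.0;  // proposed behaviour: an unscored endpoint counts as the best score
            }
            ordered.add(score);
        }
        List<Double> sorted = new ArrayList<>(ordered);
        Collections.sort(sorted);
        // Compare positionally: is any endpoint worse than the ideal ordering by more than the threshold?
        for (int i = 0; i < ordered.size(); i++) {
            if (ordered.get(i) > sorted.get(i) * (1.0 + badnessThreshold)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("ip1", 1.0); // slow local node; ip2 has no score yet, as in the logs above
        List<String> order = Arrays.asList("ip1", "ip2");
        System.out.println(shouldSortByScore(order, scores, 0.1, false));
        System.out.println(shouldSortByScore(order, scores, 0.1, true));
    }
}
```

In this model a single populated score is always equal to its own sorted position, so the fallback cannot trigger until the second endpoint has a score of some kind.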






[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407165#comment-16407165
 ] 

Jay Zhuang commented on CASSANDRA-14252:


Hi [~dikanggu], instead of committing the change to each branch separately, I 
think it would be better to merge up to later branches:
https://cassandra.apache.org/doc/latest/development/how_to_commit.html




[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406947#comment-16406947
 ] 

Jay Zhuang commented on CASSANDRA-14252:


+1

Good catch. I created a dtest to reproduce the problem: 
[14252|https://github.com/cooldoger/cassandra-dtest/tree/14252]
Also when comparing 2 versions, the existing code uses {{0.0}} as default 
value: 
[{{DynamicEndpointSnitch.java:267}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java#L267]




[jira] [Assigned] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters

2018-03-20 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14319:
--

Assignee: Vinay Chella

> nodetool rebuild from DC lets you pass invalid datacenters 
> ---
>
> Key: CASSANDRA-14319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14319
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad
> Assignee: Vinay Chella
> Priority: Major
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> If you pass an invalid datacenter to nodetool rebuild, you'll get an error 
> like this:
> {code}
> Unable to find sufficient sources for streaming range 
> (3074457345618258602,-9223372036854775808] in keyspace system_distributed
> {code}
> Unfortunately, this is a rabbit hole of frustration if you are using caps for 
> your DC names and you pass in a lowercase DC name, or you just typo the DC.  
> Let's do the following:
> # Check the DC name that's passed in against the list of DCs we know about
> # If we don't find it, let's output a reasonable error, and list all the DCs 
> someone could put in.
> # Ideally we indicate which keyspaces are set to replicate to this DC and 
> which aren't
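
The first two steps above can be sketched as follows. This is a minimal illustration, not the nodetool code: the class {{DatacenterCheck}}, the method {{validate}}, and the way the known datacenter names are obtained are all assumptions. The comparison is deliberately case-sensitive, so a lowercase name for an uppercase DC is reported instead of being silently accepted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical validation helper; how the known datacenters are looked up is out of scope here.
final class DatacenterCheck {
    private DatacenterCheck() {}

    /** Throws with a message listing the valid names when the requested datacenter is unknown. */
    static void validate(String requested, Set<String> knownDatacenters) {
        if (knownDatacenters.contains(requested)) {
            return;
        }
        // TreeSet gives a stable, sorted listing in the error message.
        throw new IllegalArgumentException(
            "Unknown datacenter '" + requested + "'. Known datacenters: " + new TreeSet<>(knownDatacenters));
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("DC1", "DC2"));
        validate("DC1", known); // accepted
        try {
            validate("dc1", known); // wrong case, rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```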






[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-19 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Attachment: IMG_3180.jpg




[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-19 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405172#comment-16405172
 ] 

Jay Zhuang commented on CASSANDRA-14252:


Hi [~dikanggu], would you please help me understand the scenario?
Assume there are 3 nodes: a coordinator node, a degraded node, and a healthy node:

!IMG_3180.jpg!

When the issue happens, the coordinator node doesn't have a score for either 
the degraded node or the healthy node, so it follows the subsnitch ordering 
and always talks to the degraded node. Is that right? Or are the coordinator 
node and the degraded node the same node?




[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-19 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Attachment: (was: IMG_3180.jpg)




[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-19 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Attachment: IMG_3180.jpg




[jira] [Commented] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-18 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404233#comment-16404233
 ] 

Jay Zhuang commented on CASSANDRA-13884:


Hi [~chovatia.jayd...@gmail.com], it seems the unit test ({{SSTableLoaderTest}}) 
is failing; would you please take a look? It would also be better to add a few 
tests. The failure:
{noformat}
$ ant test -Dtest.name=SSTableLoaderTest
...
[junit] Testcase: testLoadingSSTable(org.apache.cassandra.io.sstable.SSTableLoaderTest): FAILED
[junit] expected:<1> but was:<0>
[junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
[junit] at org.apache.cassandra.io.sstable.SSTableLoaderTest.testLoadingSSTable(SSTableLoaderTest.java:144)
[junit]
[junit]
[junit] Testcase: testLoadingIncompleteSSTable(org.apache.cassandra.io.sstable.SSTableLoaderTest): FAILED
[junit] null
[junit] junit.framework.AssertionFailedError
[junit] at org.apache.cassandra.io.sstable.SSTableLoaderTest.testLoadingIncompleteSSTable(SSTableLoaderTest.java:195)
[junit]
[junit]
[junit] Test org.apache.cassandra.io.sstable.SSTableLoaderTest FAILED
{noformat}




[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-19 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405519#comment-16405519
 ] 

Jay Zhuang commented on CASSANDRA-14252:


Should "Speculative Retry" help in this situation?




[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407273#comment-16407273
 ] 

Jay Zhuang commented on CASSANDRA-14308:


Here is one proposal, Option 1: delete SSTables that have no stats component: 
[14308-wip|https://github.com/cooldoger/cassandra/tree/14308-wip]. I'm not sure 
whether it's safe to delete these SSTables, though; it may cause data loss if 
there's a bug.

Option 2: as the incomplete SSTables are generated by compaction, how about 
storing the new SSTable in a temporary directory (e.g. 
{{data/keyspace/table-id/compaction/}})? Once the compaction is done, move the 
new SSTable back to the data directory. Then, during startup, Cassandra can 
clean up any SSTables left in the temporary directory by an interrupted 
compaction. I'm not familiar with compaction, so I'm not sure whether this is 
possible or the right way to do it. Any suggestions are welcome.
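
Option 2 could look roughly like the sketch below. It is an illustration only and makes several assumptions: the class {{CompactionStagingSketch}} and its methods are hypothetical, the staging directory is a subdirectory of the table directory (so both are on the same filesystem), and the staging directory contains only regular files. It is not how Cassandra's compaction writes SSTables today.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical staging-directory handling for compaction output; not Cassandra code.
final class CompactionStagingSketch {
    private CompactionStagingSketch() {}

    // Snapshot the directory listing first so files are not moved or deleted while iterating.
    private static List<Path> list(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        return files;
    }

    /** Called once compaction has finished: move every component file into the data directory. */
    static void publish(Path stagingDir, Path dataDir) throws IOException {
        for (Path file : list(stagingDir)) {
            // Each file is moved atomically, but the SSTable as a whole (several components) is not.
            Files.move(file, dataDir.resolve(file.getFileName()), StandardCopyOption.ATOMIC_MOVE);
        }
    }

    /** Called at startup: anything still in staging belongs to an interrupted compaction. */
    static int cleanUpOnStartup(Path stagingDir) throws IOException {
        if (!Files.isDirectory(stagingDir)) {
            return 0;
        }
        int deleted = 0;
        for (Path file : list(stagingDir)) {
            Files.delete(file);
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dataDir = Files.createTempDirectory("table-id");
        Path staging = Files.createDirectory(dataDir.resolve("compaction"));
        Files.createFile(staging.resolve("mc-1-big-Data.db"));
        System.out.println("leftovers removed: " + cleanUpOnStartup(staging));
    }
}
```

One open point in such a design is a crash in the middle of {{publish}}, which would leave some components in each directory; the sketch does not handle that case.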

cc @[~jasobrown], @[~jjirsa]

> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Minor
>
> If the JVM crashes while a compaction is in progress, the incomplete SSTable 
> won't be cleaned up, which causes a "{{Stats component is missing for 
> sstable}}" error in the startup log:
> {noformat}
> ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - Exception in thread Thread[SSTableBatchOpen:3,5,main]
> java.lang.AssertionError: Stats component is missing for sstable /cassandra/data/keyspace/table-id/mc-12345-big
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.14.jar:3.0.14]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
> The accumulated incomplete SSTables can take up a lot of space, especially 
> with STCS, which can have very large SSTables.
> Here is the script we use to delete the SSTables after a node is restarted:
> {noformat}
> grep 'Stats component is missing for sstable' $SYSTEM_LOG | awk '{print $8}' > ~/invalid_sstables
> for ss in `cat ~/invalid_sstables`; do
>   echo == $ss
>   ll $ss*
>   sudo rm $ss*
> done
> {noformat}
> I would suggest removing these incomplete SSTables during startup.






[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-21 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408844#comment-16408844
 ] 

Jay Zhuang commented on CASSANDRA-14308:


Thanks for the information. You're right, I'm unable to reproduce the problem 
locally with ccm if I just kill the process while it is compacting.
I'm trying to find more logs related to this. Several clusters have this 
orphaned SSTables issue, but maybe the SSTables have accumulated over a long 
time.

On the other hand, should we just remove these SSTables without {{Stats}} (or 
have an option to do that)? That would be just like removing SSTables without 
{{Data}}: 
[{{ColumnFamilyStore.java:648}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L648]
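
A first step toward such an option could be a scan that only reports the affected SSTables, as sketched below. This is a hypothetical helper, not the code linked above: the class {{OrphanSSTableFinder}} is made up, and the component file suffixes {{-Data.db}} and {{-Statistics.db}} are assumptions about the on-disk naming. It lists candidates and deliberately deletes nothing, since whether deletion is safe is the open question here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical scan for SSTables that have a data file but no stats component; reports only.
final class OrphanSSTableFinder {
    private static final String DATA_SUFFIX = "-Data.db";        // assumed component name
    private static final String STATS_SUFFIX = "-Statistics.db"; // assumed component name

    private OrphanSSTableFinder() {}

    /** Returns the SSTable prefixes (e.g. mc-12345-big) in tableDir that lack a stats file, sorted. */
    static List<String> withoutStats(Path tableDir) throws IOException {
        List<String> orphans = new ArrayList<>();
        try (DirectoryStream<Path> dataFiles = Files.newDirectoryStream(tableDir, "*" + DATA_SUFFIX)) {
            for (Path dataFile : dataFiles) {
                String name = dataFile.getFileName().toString();
                String prefix = name.substring(0, name.length() - DATA_SUFFIX.length());
                if (!Files.exists(tableDir.resolve(prefix + STATS_SUFFIX))) {
                    orphans.add(prefix);
                }
            }
        }
        Collections.sort(orphans);
        return orphans;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("table-id");
        Files.createFile(dir.resolve("mc-1-big-Data.db"));
        Files.createFile(dir.resolve("mc-1-big-Statistics.db"));
        Files.createFile(dir.resolve("mc-2-big-Data.db"));
        System.out.println(withoutStats(dir));
    }
}
```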

> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> If the JVM crash while compaction is in progress, the incompleted SSTable 
> won't be cleaned up, which causes "{{Stats component is missing for 
> sstable}}" error in the startup log:
> {noformat}
> ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
> Exception in thread Thread[SSTableBatchOpen:3,5,main]
> java.lang.AssertionError: Stats component is missing for sstable 
> /cassandra/data/keyspace/table-id/mc-12345-big
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
> The accumulated incomplete SSTables could take lots of space, especially for 
> STCS, which could have very large SSTables.
> Here is the script we use to delete the SSTables after a node is restarted:
> {noformat}
> grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' \
>   > ~/invalid_sstables
> # Review ~/invalid_sstables before deleting anything.
> for ss in $(cat ~/invalid_sstables); do echo "== $ss"; ls -l "$ss"*; sudo rm "$ss"*; done
> {noformat}
> I would suggest removing these incomplete SSTables at startup.
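
The startup cleanup suggested above could begin with a detection pass. The following is a minimal sketch, not Cassandra's implementation: the class {{IncompleteSSTableFinder}} and its method are hypothetical, and it assumes component files are named {{<prefix>-Data.db}} and {{<prefix>-Statistics.db}}, as the {{mc-12345-big}} prefix in the log suggests. It only reports candidates and deletes nothing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Cassandra.
public class IncompleteSSTableFinder
{
    // Assumed component suffixes for the "big" SSTable format.
    private static final String DATA_SUFFIX = "-Data.db";
    private static final String STATS_SUFFIX = "-Statistics.db";

    /**
     * Returns the sorted SSTable prefixes (e.g. "mc-12345-big") in tableDir
     * that have a Data component but no Statistics component.
     */
    public static List<String> findMissingStats(Path tableDir) throws IOException
    {
        List<String> incomplete = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(tableDir, "*" + DATA_SUFFIX))
        {
            for (Path data : files)
            {
                String name = data.getFileName().toString();
                String prefix = name.substring(0, name.length() - DATA_SUFFIX.length());
                if (!Files.exists(tableDir.resolve(prefix + STATS_SUFFIX)))
                    incomplete.add(prefix);
            }
        }
        Collections.sort(incomplete);
        return incomplete;
    }
}
```

A real cleanup would also have to make sure the SSTable is not still being written by a live compaction before removing anything, which this sketch does not attempt.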



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory

2018-03-04 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385610#comment-16385610
 ] 

Jay Zhuang commented on CASSANDRA-13929:


Hi [~tjake], are you interested in reviewing the patch? The trunk uTest failure 
is because of CASSANDRA-14119.

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> ---
>
> Key: CASSANDRA-13929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Thomas Steinmaurer
>Assignee: Jay Zhuang
>Priority: Major
> Fix For: 3.11.x
>
> Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, 
> cassandra_3.11.1_vs_3.11.2recyclernullingpatch.png, 
> cassandra_heapcpu_memleak_patching_test_30d.png, 
> dtest_example_80_request.png, dtest_example_80_request_fix.png, 
> dtest_example_heap.png, memleak_heapdump_recyclerstack.png
>
>
> Different to CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * heap utilization increase after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing 
> CASSANDRA-13754, more visible now
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final). In 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and after ~ 10 
> hours uptime, no sign of the previously offending class in MAT anymore => 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects etc., but I doubt it.
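
The effect of clearing the handle can be illustrated in isolation. This is a rough sketch under stated assumptions: {{RecycleSketch}}, its {{Handle}} and its {{Builder}} are hypothetical stand-ins and do not use Netty's {{Recycler}} or Cassandra's {{BTree.Builder}}. It only shows that, once the field is non-final and nulled, a recycled object no longer references its handle and a second {{recycle()}} call is a no-op.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical stand-ins for illustration; not Netty's Recycler or Cassandra's BTree.Builder.
public class RecycleSketch
{
    /** Stand-in for a recycler handle: it just knows which pool to return objects to. */
    public static final class Handle
    {
        final Deque<Builder> pool;

        public Handle(Deque<Builder> pool)
        {
            this.pool = pool;
        }
    }

    public static final class Builder
    {
        private Handle recycleHandle; // non-final so that it can be cleared
        private final Object[] values = new Object[16];

        public Builder(Handle handle)
        {
            this.recycleHandle = handle;
        }

        public void recycle()
        {
            if (recycleHandle != null)
            {
                Arrays.fill(values, null);     // drop references held by the builder
                recycleHandle.pool.push(this); // hand the builder back to the pool
                recycleHandle = null;          // the added line: unreference the handle
            }
        }

        public boolean holdsHandle()
        {
            return recycleHandle != null;
        }
    }
}
```

Calling {{recycle()}} twice on the same builder pushes it to the pool only once. Whether a builder without a handle can be recycled again later is a separate question that this sketch does not answer.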






[jira] [Commented] (CASSANDRA-14280) Fix timeout test - org.apache.cassandra.cql3.ViewTest

2018-03-04 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385491#comment-16385491
 ] 

Jay Zhuang commented on CASSANDRA-14280:


Seems to duplicate CASSANDRA-14119; we should merge them.

> Fix timeout test - org.apache.cassandra.cql3.ViewTest
> -
>
> Key: CASSANDRA-14280
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14280
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0
>
>
> The test times out very often. It seems too big, so try to split it into multiple 
> tests.






[jira] [Updated] (CASSANDRA-13884) sstableloader option to accept target keyspace name

2018-03-04 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13884:
---
Reviewer: Jay Zhuang

> sstableloader option to accept target keyspace name
> ---
>
> Key: CASSANDRA-13884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13884
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 4.x
>
>
> Often, as part of a backup, people store the entire {{data}} directory. When they see 
> some corruption in the data, they would like to restore it into the same cluster 
> (for large clusters, 200 nodes) but under a different keyspace name. 
> Currently {{sstableloader}} uses the parent folder as the {{keyspace}}; it would be 
> nice to have an option to specify the target keyspace name as part of 
> {{sstableloader}}.






[jira] [Comment Edited] (CASSANDRA-14280) Fix timeout test - org.apache.cassandra.cql3.ViewTest

2018-03-04 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385491#comment-16385491
 ] 

Jay Zhuang edited comment on CASSANDRA-14280 at 3/5/18 2:20 AM:


Seems to duplicate CASSANDRA-14119; please merge them.


was (Author: jay.zhuang):
Seems to duplicate CASSANDRA-14119; we should merge them.







[jira] [Commented] (CASSANDRA-12743) Assertion error while running compaction

2018-04-25 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453166#comment-16453166
 ] 

Jay Zhuang commented on CASSANDRA-12743:


Thanks [~krummas] for the review. I think it makes sense to backport the 
{{truncate()}} part to {{2.2}} and {{3.0}}. The branch is updated, please 
review.

> Assertion error while running compaction 
> -
>
> Key: CASSANDRA-12743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: unix
>Reporter: Jean-Baptiste Le Duigou
>Assignee: Jay Zhuang
>Priority: Major
>
> While running compaction I sometimes run into an error:
> {noformat}
> nodetool compact
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:463)
> at 
> org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
> at 
> org.apache.cassandra.io.util.CompressedSegmentedFile.createMappedSegments(CompressedSegmentedFile.java:80)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.<init>(CompressedPoolingSegmentedFile.java:38)
> at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:101)
> at 
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:198)
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:315)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:171)
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:116)
> at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.append(DefaultCompactionWriter.java:64)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:184)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:599)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Why is that happening?
> Is there any way to provide more details (e.g. which SSTable cannot be 
> compacted)?
> We are using Cassandra 2.2.7






[jira] [Comment Edited] (CASSANDRA-12743) Assertion error while running compaction

2018-04-25 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453166#comment-16453166
 ] 

Jay Zhuang edited comment on CASSANDRA-12743 at 4/25/18 11:30 PM:
--

Thanks [~krummas] for the review. It makes sense to backport the {{truncate()}} 
part to {{2.2}} and {{3.0}}. The branch is updated, please review.

It's really hard to reproduce locally, because most of the time the index file is 
synced more slowly than the data file, so the index file's synced position is used (even 
if the data synced position is wrong, it won't cause the problem): 
[{{IndexSummaryBuilder.java:174}}|https://github.com/apache/cassandra/blob/5dc55e715eba6667c388da9f8f1eb7a46489b35c/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L174].
But in one of our clusters, it happens once every few dozen hours per node. I applied 
the fix there and no longer see the issue.


was (Author: jay.zhuang):
Thanks [~krummas] for the review. I think it makes sense to backport the 
{{truncate()}} part to {{2.2}} and {{3.0}}. The branch is updated, please 
review.







[jira] [Assigned] (CASSANDRA-14422) Missing dependencies airline and ohc-core-j8 for pom-all

2018-04-26 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14422:
--

Assignee: Shichao An

> Missing dependencies airline and ohc-core-j8 for pom-all
> 
>
> Key: CASSANDRA-14422
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14422
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Shichao An
>Assignee: Shichao An
>Priority: Minor
>
> I found two missing dependencies for pom-all (cassandra-all):
>  * airline
>  * ohc-core-j8
>  
> This doesn't affect the current build scheme because their jars are hardcoded in 
> the lib directory. However, if we depend on cassandra-all in our downstream 
> projects to resolve and fetch dependencies (instead of using the official 
> tarball), Cassandra will have problems, e.g. airline is required by nodetool, 
> and its absence will fail our dtests.
> I will attach the patch shortly.






[jira] [Updated] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-10-07 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14791:
---
Assignee: Jay Zhuang
  Status: Patch Available  (was: Open)

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> Some tests are failing from time to time because they cannot write to the directory 
> {{/tmp/}}:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/
> {noformat}
> java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
> /tmp/na-1-big-Data.db
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:152)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:141)
>   at 
> org.apache.cassandra.io.compress.CompressedSequentialWriter.<init>(CompressedSequentialWriter.java:82)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:119)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testException(CompressedInputStreamTest.java:78)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100)
> {noformat}
>  I guess it's because some Jenkins slaves don't have the proper permissions set. 
> For slave {{cassandra16}}, the tests are fine:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/723/testReport/junit/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/history/






[jira] [Commented] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-10-07 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641314#comment-16641314
 ] 

Jay Zhuang commented on CASSANDRA-14791:


The root cause of this test failure is not that the {{/tmp/}} directory is 
unwritable, but that the tmp files generated by the unit test, 
{{/tmp/na-1-big-Data.db}} and {{/tmp/na-1-big-CompressionInfo.db}}, are not 
deleted after the test. So I guess that on these nodes the test was previously run by 
another user, which left tmp files that the current user cannot overwrite. I'm able 
to reproduce the same error message with:
{noformat}
sudo chown root:root /tmp/na-1-big-Data.db
{noformat}
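
One general way to avoid this kind of collision is to give each run a uniquely named tmp file and always delete it afterwards. The following is a minimal sketch of that idea, not the patch itself: {{TempFileSketch}} and {{roundTrip}} are hypothetical names, and the file prefix and suffix are only chosen to resemble the files above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example for illustration; not taken from the Cassandra test code.
public class TempFileSketch
{
    /**
     * Writes the payload to a uniquely named file in the system tmp directory,
     * reads it back, and deletes the file even if reading or writing fails.
     */
    public static byte[] roundTrip(byte[] payload) throws IOException
    {
        // createTempFile picks a fresh name, so leftovers from another user cannot collide.
        Path tmp = Files.createTempFile("na-1-big-", "-Data.db");
        try
        {
            Files.write(tmp, payload);
            return Files.readAllBytes(tmp);
        }
        finally
        {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With a fixed name such as {{/tmp/na-1-big-Data.db}}, a single leftover file owned by another user is enough to fail every later run on that node.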

Here is a patch for trunk:
| Branch | uTest |
| [14791|https://github.com/cooldoger/cassandra/tree/14791] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14791.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14791]
 |

Passed the tests in Jenkins:
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/36/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing
>Reporter: Jay Zhuang
>Priority: Minor
>






[jira] [Updated] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-10-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14791:
---
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Thanks [~krummas] for the review. Committed as 
[{{73ebd20}}|https://github.com/apache/cassandra/commit/73ebd200c04335624f956e79624cf8494d872f19].

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>






[jira] [Updated] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-10-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14791:
---
Issue Type: Bug  (was: Task)

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.0
>
>






[jira] [Updated] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-10-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14791:
---
Fix Version/s: 4.0

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.0
>
>






[jira] [Commented] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2018-10-27 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1241#comment-1241
 ] 

Jay Zhuang commented on CASSANDRA-14526:


Hi [~chovatia.jayd...@gmail.com], the function name is duplicated 
(https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk#diff-0b30b9f097df89d74be1d1af8205ac7eR707),
 I assume the first one could be removed.

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-10-27 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1220#comment-1220
 ] 

Jay Zhuang commented on CASSANDRA-14525:


Sure, I'll kick off the tests.

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 4.0, 2.2.x, 3.0.x
>
>
> If bootstrap fails for a newly joining node (the most common reason is a 
> streaming failure), Cassandra remains in the {{joining}} state, which is 
> fine, but it also enables Native transport, which makes the overall state 
> inconsistent. This further causes a NullPointerException if auth is enabled 
> on the new node. Please find reproducible steps here:
> For example if bootstrap fails due to streaming errors like
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:660)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:573)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
> ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>  ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
> ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then variable [StorageService.java::dataAvailable 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
>  will be {{false}}. Since {{dataAvailable}} is {{false}}, it will not 
> call [StorageService.java::finishJoiningRing 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
>  and as a result 
> [StorageService.java::doAuthSetup|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L999]
>  will not be invoked.
> API 
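
The control flow described in the quoted text can be sketched as a small model. This is a hypothetical simplification, not Cassandra code: the class, its attribute names, and the `is_inconsistent` check are assumptions made for illustration, and only the ordering (no `finishJoiningRing` means no `doAuthSetup`, while Native transport is still enabled) comes from the description.

```python
class JoiningNode:
    """Hypothetical model of the startup flow described in the issue."""

    def __init__(self):
        self.state = "starting"
        self.auth_setup_done = False
        self.native_transport_enabled = False

    def join_token_ring(self, bootstrap_succeeded):
        self.state = "joining"
        # Per the description, dataAvailable is false when streaming fails.
        data_available = bootstrap_succeeded
        if data_available:
            self.finish_joining_ring()

    def finish_joining_ring(self):
        # Auth setup is reached only through this path.
        self.auth_setup_done = True
        self.state = "normal"

    def start_native_transport(self):
        # Per the description, this happens even after a failed bootstrap.
        self.native_transport_enabled = True

    def is_inconsistent(self):
        # Clients can connect, but auth was never initialised.
        return self.native_transport_enabled and not self.auth_setup_done


failed = JoiningNode()
failed.join_token_ring(bootstrap_succeeded=False)
failed.start_native_transport()

healthy = JoiningNode()
healthy.join_token_ring(bootstrap_succeeded=True)
healthy.start_native_transport()
```

In this model the failed node stays in `joining` with Native transport on and auth setup skipped, which is the combination the issue reports as inconsistent.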

[jira] [Created] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if n is not specified

2018-11-13 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14890:
--

 Summary: cassandra-stress hang for 200 seconds if n is not 
specified
 Key: CASSANDRA-14890
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
 Project: Cassandra
  Issue Type: Bug
  Components: Stress
Reporter: Jay Zhuang
Assignee: Jay Zhuang


if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 200 
seconds between warm-up and sending traffic.
For example, the following command will hang 200 seconds before sending the 
traffic:
{noformat}
$ ./tools/bin/cassandra-stress write
...
Created keyspaces. Sleeping 1s for propagation.
Sleeping 2s...
Warming up WRITE with 0 iterations...
Failed to connect over JMX; not collecting these stats
{noformat}

It's waiting for this: 
[https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
{noformat}
(measurements >= waiter.maxMeasurements)
{noformat}
{{maxMeasurements}} is 200 by default:
[https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]
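
A rough sketch of why the wait lasts for the full {{maxMeasurements}}: the function below is a hypothetical simplification of the wait condition, not the code in {{Uncertainty.java}}. The `target` and `min_measurements` values are illustrative assumptions; only the limit of 200 comes from this issue. With no warm-up traffic the uncertainty is assumed never to converge, so the only way out is the measurement limit.

```python
import itertools


def measurements_until_release(uncertainties, target, min_measurements, max_measurements):
    """Count the measurements a waiter sees before it is released.

    The waiter is released once the uncertainty has converged after a
    minimum number of measurements, or unconditionally once
    max_measurements is reached.
    """
    measurements = 0
    for uncertainty in uncertainties:
        measurements += 1
        converged = measurements >= min_measurements and uncertainty <= target
        if converged or measurements >= max_measurements:
            break
    return measurements


# No traffic: the uncertainty is modelled as never converging.
no_traffic = measurements_until_release(
    itertools.repeat(float("inf")), target=0.02, min_measurements=30, max_measurements=200
)

# With traffic, the waiter can be released much earlier.
with_traffic = measurements_until_release(
    [0.5, 0.1, 0.01], target=0.02, min_measurements=2, max_measurements=200
)
```

If measurements arrive about once per second, which is what the 200-second figure suggests, the no-traffic case accounts for the observed hang.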






[jira] [Updated] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if `n` is not specified

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14890:
---
Summary: cassandra-stress hang for 200 seconds if `n` is not specified  
(was: cassandra-stress hang for 200 seconds if n is not specified)

> cassandra-stress hang for 200 seconds if `n` is not specified
> -
>
> Key: CASSANDRA-14890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 
> 200 seconds between warm-up and sending traffic.
> For example, the following command will hang 200 seconds before sending the 
> traffic:
> {noformat}
> $ ./tools/bin/cassandra-stress write
> ...
> Created keyspaces. Sleeping 1s for propagation.
> Sleeping 2s...
> Warming up WRITE with 0 iterations...
> Failed to connect over JMX; not collecting these stats
> {noformat}
> It's waiting for this: 
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
> As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
> {noformat}
> (measurements >= waiter.maxMeasurements)
> {noformat}
> {{maxMeasurements}} is 200 by default:
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]






[jira] [Commented] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if n is not specified

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687254#comment-16687254
 ] 

Jay Zhuang commented on CASSANDRA-14890:


Here is a patch to re-enable {{warm-up}} if `n` is not set 
([{{StressAction.java}}|https://github.com/apache/cassandra/commit/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117#diff-fd2f2d2364937fcb1c0d73c8314f1418L90]):
| Branch | uTest | dTest |
| [14890-3.0|https://github.com/cooldoger/cassandra/tree/14890-3.0] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/]
 |
| [14890-3.11|https://github.com/cooldoger/cassandra/tree/14890-3.11] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/]
 |
| [14890-trunk|https://github.com/cooldoger/cassandra/tree/14890-trunk] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/]
 |

Here is the dTest to reproduce the problem:
|[14890|https://github.com/cooldoger/cassandra-dtest/tree/14890]|

[~Stefania] would you please review?

> cassandra-stress hang for 200 seconds if n is not specified
> ---
>
> Key: CASSANDRA-14890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 
> 200 seconds between warm-up and sending traffic.
> For example, the following command will hang 200 seconds before sending the 
> traffic:
> {noformat}
> $ ./tools/bin/cassandra-stress write
> ...
> Created keyspaces. Sleeping 1s for propagation.
> Sleeping 2s...
> Warming up WRITE with 0 iterations...
> Failed to connect over JMX; not collecting these stats
> {noformat}
> It's waiting for this: 
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
> As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
> {noformat}
> (measurements >= waiter.maxMeasurements)
> {noformat}
> {{maxMeasurements}} is 200 by default:
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]






[jira] [Updated] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14616:
---
Reproduced In: 3.11.0, 4.0
   Status: Patch Available  (was: Open)

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Assignee: Jeremy
>Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n it's not as bad, which is why dtests etc. don't seem to be 
> impacted. Does not occur in the 3.0 branch but does in 3.11 and trunk.






[jira] [Comment Edited] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if n is not specified

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687254#comment-16687254
 ] 

Jay Zhuang edited comment on CASSANDRA-14890 at 11/14/18 10:51 PM:
---

Here is a patch to re-enable {{warm-up}} if `n` is not set 
([{{StressAction.java}}|https://github.com/apache/cassandra/commit/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117#diff-fd2f2d2364937fcb1c0d73c8314f1418L90]):
|Branch|uTest|dTest|
|[14890-3.0|https://github.com/cooldoger/cassandra/tree/14890-3.0]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/]|
|[14890-3.11|https://github.com/cooldoger/cassandra/tree/14890-3.11]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/]|
|[14890-trunk|https://github.com/cooldoger/cassandra/tree/14890-trunk]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/]|

Here is a dTest to reproduce the problem:
|[14890|https://github.com/cooldoger/cassandra-dtest/tree/14890]|

[~Stefania] would you please review?


was (Author: jay.zhuang):
Here is a patch to re-enable {{warm-up}} if `n` is not set 
([{{StressAction.java}}|https://github.com/apache/cassandra/commit/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117#diff-fd2f2d2364937fcb1c0d73c8314f1418L90]):
| Branch | uTest | dTest |
| [14890-3.0|https://github.com/cooldoger/cassandra/tree/14890-3.0] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/]
 |
| [14890-3.11|https://github.com/cooldoger/cassandra/tree/14890-3.11] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/]
 |
| [14890-trunk|https://github.com/cooldoger/cassandra/tree/14890-trunk] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/]
 |

Here is the dTest to reproduce the problem:
|[14890|https://github.com/cooldoger/cassandra-dtest/tree/14890]|

[~Stefania] would you please review?

> cassandra-stress hang for 200 seconds if n is not specified
> ---
>
> Key: CASSANDRA-14890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 
> 200 seconds between warm-up and sending traffic.
> For example, the following command will hang 200 seconds before sending the 
> traffic:
> {noformat}
> $ ./tools/bin/cassandra-stress write
> ...
> Created keyspaces. Sleeping 1s for propagation.
> Sleeping 2s...
> Warming up WRITE with 0 iterations...
> Failed to connect over JMX; not collecting these stats
> {noformat}
> It's waiting for this: 
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
> As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
> {noformat}
> (measurements >= waiter.maxMeasurements)
> {noformat}
> {{maxMeasurements}} is 200 by default:
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]





[jira] [Updated] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if n is not specified

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14890:
---
Status: Patch Available  (was: Open)

> cassandra-stress hang for 200 seconds if n is not specified
> ---
>
> Key: CASSANDRA-14890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 
> 200 seconds between warm-up and sending traffic.
> For example, the following command will hang 200 seconds before sending the 
> traffic:
> {noformat}
> $ ./tools/bin/cassandra-stress write
> ...
> Created keyspaces. Sleeping 1s for propagation.
> Sleeping 2s...
> Warming up WRITE with 0 iterations...
> Failed to connect over JMX; not collecting these stats
> {noformat}
> It's waiting for this: 
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
> As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
> {noformat}
> (measurements >= waiter.maxMeasurements)
> {noformat}
> {{maxMeasurements}} is 200 by default:
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]






[jira] [Assigned] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14616:
--

Assignee: Jeremy Quinn

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Assignee: Jeremy Quinn
>Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n it's not as bad, which is why dtests etc. don't seem to be 
> impacted. Does not occur in the 3.0 branch but does in 3.11 and trunk.






[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687347#comment-16687347
 ] 

Jay Zhuang commented on CASSANDRA-14616:


The failed utest is because of CASSANDRA-14891.

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Assignee: Jeremy
>Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n it's not as bad, which is why dtests etc. don't seem to be 
> impacted. Does not occur in the 3.0 branch but does in 3.11 and trunk.






[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687300#comment-16687300
 ] 

Jay Zhuang commented on CASSANDRA-14616:


Hi [~Yarnspinner], the fix looks good. I had a similar fix, which re-enables 
{{warm-up}} at 50k as before 
([{{StressAction.java}}|https://github.com/apache/cassandra/commit/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117#diff-fd2f2d2364937fcb1c0d73c8314f1418L90])

|Branch|uTest|dTest|
|[14890-3.0|https://github.com/cooldoger/cassandra/tree/14890-3.0]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/]|
|[14890-3.11|https://github.com/cooldoger/cassandra/tree/14890-3.11]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/]|
|[14890-trunk|https://github.com/cooldoger/cassandra/tree/14890-trunk]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/]|

Here is a dTest to reproduce the problem:
|[14890|https://github.com/cooldoger/cassandra-dtest/tree/14890]|

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n it's not as bad, which is why dtests etc. don't seem to be 
> impacted. Does not occur in the 3.0 branch but does in 3.11 and trunk.






[jira] [Assigned] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14616:
--

Assignee: Jeremy  (was: Jeremy Quinn)

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Assignee: Jeremy
>Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n it's not as bad, which is why dtests etc. don't seem to be 
> impacted. Does not occur in the 3.0 branch but does in 3.11 and trunk.






[jira] [Updated] (CASSANDRA-14890) cassandra-stress hang for 200 seconds if `n` is not specified

2018-11-14 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14890:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Resolved as a duplicate of CASSANDRA-14616.

> cassandra-stress hang for 200 seconds if `n` is not specified
> -
>
> Key: CASSANDRA-14890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14890
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> if parameter {{n}} is not specified, cassandra-stress will hang (wait) for 
> 200 seconds between warm-up and sending traffic.
> For example, the following command will hang 200 seconds before sending the 
> traffic:
> {noformat}
> $ ./tools/bin/cassandra-stress write
> ...
> Created keyspaces. Sleeping 1s for propagation.
> Sleeping 2s...
> Warming up WRITE with 0 iterations...
> Failed to connect over JMX; not collecting these stats
> {noformat}
> It's waiting for this: 
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/util/Uncertainty.java#L72]
> As there's no warm-up traffic (CASSANDRA-13773), it will wait until:
> {noformat}
> (measurements >= waiter.maxMeasurements)
> {noformat}
> {{maxMeasurements}} is 200 by default:
> [https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/tools/stress/src/org/apache/cassandra/stress/settings/SettingsCommand.java#L153]






[jira] [Created] (CASSANDRA-14891) [utest] LegacySSTableTest.testInaccurateSSTableMinMax test failed

2018-11-14 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14891:
--

 Summary: [utest] LegacySSTableTest.testInaccurateSSTableMinMax 
test failed
 Key: CASSANDRA-14891
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14891
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Jay Zhuang


{noformat}
junit.framework.AssertionFailedError
at 
org.apache.cassandra.db.SinglePartitionSliceCommandTest.getUnfilteredsFromSinglePartition(SinglePartitionSliceCommandTest.java:404)
at 
org.apache.cassandra.io.sstable.LegacySSTableTest.testInaccurateSSTableMinMax(LegacySSTableTest.java:323)
{noformat}

Related to CASSANDRA-14861






[jira] [Updated] (CASSANDRA-14610) Flaky dtest: nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters

2018-10-06 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14610:
---
Reviewer: Jay Zhuang

> Flaky dtest: 
> nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters
> ---
>
> Key: CASSANDRA-14610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14610
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing, Tools
>Reporter: Jason Brown
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: dtest
>
> @jay zhuang observed 
> nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters
>  being flaky in Apache Jenkins. I ran locally and got a different flaky 
> behavior:
> {noformat}
> out_node1_dc3, err, _ = node1_dc3.nodetool('describecluster')
> assert 0 == len(err), err
> >   assert out_node1_dc1 == out_node1_dc3
> E   AssertionError: assert 'Cluster Info...1=3, dc3=1}\n' == 'Cluster 
> Infor...1=3, dc3=1}\n'
> E   Cluster Information:
> E Name: test
> E Snitch: org.apache.cassandra.locator.PropertyFileSnitch
> E DynamicEndPointSnitch: enabled
> E Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> E Schema versions:
> E fc9ec7cd-80ba-3f27-87af-fc0bafcf7a03: [127.0.0.6, 
> 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]...
> E 
> E ...Full output truncated (26 lines hidden), use '-vv' to show
> 09:58:14,357 ccm DEBUG Log-watching thread exiting.
> ===Flaky Test Report===
> test_describecluster_more_information_three_datacenters failed and was not 
> selected for rerun.
>   
>   assert 'Cluster Info...1=3, dc3=1}\n' == 'Cluster Infor...1=3, dc3=1}\n'
> Cluster Information:
>   Name: test
>   Snitch: org.apache.cassandra.locator.PropertyFileSnitch
>   DynamicEndPointSnitch: enabled
>   Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>   Schema versions:
>   fc9ec7cd-80ba-3f27-87af-fc0bafcf7a03: [127.0.0.6, 127.0.0.5, 
> 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]...
>   
>   ...Full output truncated (26 lines hidden), use '-vv' to show
>   [ /opt/orig/1/opt/dev/cassandra-dtest/nodetool_test.py:373>]
> ===End Flaky Test Report===
> {noformat}
> As this test is for a patch that was introduced for 4.0, this dtest (should) 
> only be failing on trunk.






[jira] [Commented] (CASSANDRA-14610) Flaky dtest: nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters

2018-10-06 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640961#comment-16640961
 ] 

Jay Zhuang commented on CASSANDRA-14610:


I'm unable to reproduce the problem locally. For the failed job in Jenkins, 
it seems to be mostly caused by a timeout while populating 6 nodes:
{noformat}
Error Message
ccmlib.node.NodeError: Error starting node1.
Stacktrace
self = 

@since('4.0')
def test_describecluster_more_information_three_datacenters(self):
"""
nodetool describecluster should be more informative. It should 
include detailes
for total node count, list of datacenters, RF, number of nodes per 
dc, how many
are down and version(s).
@jira_ticket CASSANDRA-13853
@expected_result This test invokes nodetool describecluster and 
matches the output with the expected one
"""
cluster = self.cluster
>   cluster.populate([2, 3, 1]).start(wait_for_binary_proto=True)
{noformat}

Other tests that require 6 nodes are all marked as 
{{@pytest.mark.resource_intensive}} (so those tests are skipped). I think 
reducing the node count from 6 to 4 should help.

+1 for the patch (it also passed 100 local runs).

> Flaky dtest: 
> nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters
> ---
>
> Key: CASSANDRA-14610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14610
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing, Tools
>Reporter: Jason Brown
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: dtest
>
> @jay zhuang observed 
> nodetool_test.TestNodetool.test_describecluster_more_information_three_datacenters
>  being flaky in Apache Jenkins. I ran locally and got a different flaky 
> behavior:
> {noformat}
> out_node1_dc3, err, _ = node1_dc3.nodetool('describecluster')
> assert 0 == len(err), err
> >   assert out_node1_dc1 == out_node1_dc3
> E   AssertionError: assert 'Cluster Info...1=3, dc3=1}\n' == 'Cluster 
> Infor...1=3, dc3=1}\n'
> E   Cluster Information:
> E Name: test
> E Snitch: org.apache.cassandra.locator.PropertyFileSnitch
> E DynamicEndPointSnitch: enabled
> E Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> E Schema versions:
> E fc9ec7cd-80ba-3f27-87af-fc0bafcf7a03: [127.0.0.6, 
> 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]...
> E 
> E ...Full output truncated (26 lines hidden), use '-vv' to show
> 09:58:14,357 ccm DEBUG Log-watching thread exiting.
> ===Flaky Test Report===
> test_describecluster_more_information_three_datacenters failed and was not 
> selected for rerun.
>   
>   assert 'Cluster Info...1=3, dc3=1}\n' == 'Cluster Infor...1=3, dc3=1}\n'
> Cluster Information:
>   Name: test
>   Snitch: org.apache.cassandra.locator.PropertyFileSnitch
>   DynamicEndPointSnitch: enabled
>   Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>   Schema versions:
>   fc9ec7cd-80ba-3f27-87af-fc0bafcf7a03: [127.0.0.6, 127.0.0.5, 
> 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]...
>   
>   ...Full output truncated (26 lines hidden), use '-vv' to show
>   [ /opt/orig/1/opt/dev/cassandra-dtest/nodetool_test.py:373>]
> ===End Flaky Test Report===
> {noformat}
> As this test is for a patch that was introduced for 4.0, this dtest (should) 
> only be failing on trunk.






[jira] [Updated] (CASSANDRA-9989) Optimise BTree.build

2018-08-30 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-9989:
--
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Thanks [~benedict] for the review. Committed as 
[{{2e59ea8}}|https://github.com/apache/cassandra/commit/2e59ea8c7f21cb11b7ce71a5cdf303a8ed453bc0].

> Optimise BTree.build
> 
>
> Key: CASSANDRA-9989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9989
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 9989-trunk.txt
>
>
> BTree.Builder could reduce its copying, and exploit toArray more efficiently, 
> with some work. It's not very important right now because we don't make as 
> much use of its bulk-add methods as we otherwise might, however over time 
> this work will become more useful.






[jira] [Commented] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-09-26 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629010#comment-16629010
 ] 

Jay Zhuang commented on CASSANDRA-14791:


Hi [~mshuler], [~spo...@gmail.com], any idea if there's a permission setting we 
could set for the Jenkins Job/Slave?

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing
>Reporter: Jay Zhuang
>Priority: Minor
>
> Some tests fail from time to time because they cannot write to the directory 
> {{/tmp/}}:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/
> {noformat}
> java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
> /tmp/na-1-big-Data.db
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:152)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:141)
>   at 
> org.apache.cassandra.io.compress.CompressedSequentialWriter.(CompressedSequentialWriter.java:82)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:119)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testException(CompressedInputStreamTest.java:78)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100)
> {noformat}
>  I guess it's because some Jenkins slaves don't have the proper permissions set. 
> For slave {{cassandra16}}, the tests are fine:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/723/testReport/junit/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/history/






[jira] [Created] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-09-25 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14791:
--

 Summary: [utest] tests unable to write system tmp directory
 Key: CASSANDRA-14791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
 Project: Cassandra
  Issue Type: Task
  Components: Testing
Reporter: Jay Zhuang


Some tests fail from time to time because they cannot write to the directory 
{{/tmp/}}:
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/

{noformat}
java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
/tmp/na-1-big-Data.db
at 
org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119)
at 
org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:152)
at 
org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:141)
at 
org.apache.cassandra.io.compress.CompressedSequentialWriter.(CompressedSequentialWriter.java:82)
at 
org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:119)
at 
org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testException(CompressedInputStreamTest.java:78)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at 
sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
at java.nio.channels.FileChannel.open(FileChannel.java:287)
at java.nio.channels.FileChannel.open(FileChannel.java:335)
at 
org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100)
{noformat}

 I guess it's because some Jenkins slaves don't have the proper permissions set. For 
slave {{cassandra16}}, the tests are fine:
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/723/testReport/junit/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/history/
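
One way to check the permission theory on a given agent is a small probe like the one below. This is only a sketch: the helper name {{can_write_to}} is hypothetical and is not part of the Cassandra test suite. It creates a uniquely named file, so it would not detect a stale {{na-1-big-Data.db}} left behind by another user, which could also produce the AccessDeniedException above.

```python
import tempfile


def can_write_to(directory):
    """Return True if a file can be created, written, and removed in directory."""
    try:
        # NamedTemporaryFile picks a unique name and deletes the file on close.
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"probe")
            handle.flush()
        return True
    except OSError:
        # Covers permission errors as well as a missing directory.
        return False


if __name__ == "__main__":
    tmp_dir = tempfile.gettempdir()
    print(f"{tmp_dir} writable: {can_write_to(tmp_dir)}")
```

Running this as the Jenkins user on an affected slave and on {{cassandra16}} would show whether the two hosts really differ in their temp directory permissions.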






[jira] [Commented] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory

2018-09-27 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630699#comment-16630699
 ] 

Jay Zhuang commented on CASSANDRA-12704:


Do you think the change should go to trunk only, or to other branches too?
I would prefer all branches from 2.2 onward, as we might have snapshot artifacts 
for all active branches.

> snapshot build never be able to publish to mvn artifactory
> --
>
> Key: CASSANDRA-12704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12704
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Attachments: 12704-trunk.txt
>
>
> {code}
> $ ant publish
> {code}
> works fine when property "release" is set, which publishes the binaries to 
> release Artifactory.
> But for daily snapshot build, if "release" is set, it won't be snapshot build:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74
> if "release" is not set, it doesn't publish to snapshot Artifactory:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888
> I would suggest just removing the "if check" for target "publish".
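
The two conditions in the description can be written out as a small truth table. The sketch below models only what the description says, not the actual build.xml; the function names and the {{guarded}} flag (whether the "if check" is still on the "publish" target) are hypothetical.

```python
def is_snapshot_build(release_set):
    """Per the description: setting "release" means the build is not a snapshot."""
    return not release_set


def publish_runs(release_set, guarded):
    """With the "if check" in place, "publish" runs only when "release" is set."""
    return release_set or not guarded


def snapshot_gets_published(release_set, guarded):
    """A snapshot reaches the repository only if both conditions hold."""
    return is_snapshot_build(release_set) and publish_runs(release_set, guarded)


for guarded in (True, False):
    for release_set in (True, False):
        published = snapshot_gets_published(release_set, guarded)
        print(f"guarded={guarded} release={release_set} snapshot published={published}")
```

Under this model no combination publishes a snapshot while the check is present, and removing the check lets the build without "release" publish one.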






[jira] [Updated] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory

2018-09-27 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-12704:
---
Reviewer: mck

> snapshot build never be able to publish to mvn artifactory
> --
>
> Key: CASSANDRA-12704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12704
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Attachments: 12704-trunk.txt
>
>
> {code}
> $ ant publish
> {code}
> works fine when property "release" is set, which publishes the binaries to 
> release Artifactory.
> But for daily snapshot build, if "release" is set, it won't be snapshot build:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74
> if "release" is not set, it doesn't publish to snapshot Artifactory:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888
> I would suggest just removing the "if check" for target "publish".






[jira] [Commented] (CASSANDRA-14791) [utest] tests unable to write system tmp directory

2018-09-27 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630718#comment-16630718
 ] 

Jay Zhuang commented on CASSANDRA-14791:


[~mshuler] talked about the docker option in the last NGCC: 
https://github.com/ngcc/ngcc2017/blob/master/Help_Test_Apache_Cassandra-NGCC_2017.pdf
 . Any idea how we can move forward with this?

> [utest] tests unable to write system tmp directory
> --
>
> Key: CASSANDRA-14791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14791
> Project: Cassandra
>  Issue Type: Task
>  Components: Testing
>Reporter: Jay Zhuang
>Priority: Minor
>
> Some tests fail from time to time because they cannot write to the directory 
> {{/tmp/}}:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/
> {noformat}
> java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
> /tmp/na-1-big-Data.db
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:152)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.(SequentialWriter.java:141)
>   at 
> org.apache.cassandra.io.compress.CompressedSequentialWriter.(CompressedSequentialWriter.java:82)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:119)
>   at 
> org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testException(CompressedInputStreamTest.java:78)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100)
> {noformat}
>  I guess it's because some Jenkins slaves don't have the proper permissions set. 
> For slave {{cassandra16}}, the tests are fine:
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/723/testReport/junit/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testException/history/






[jira] [Updated] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory

2018-09-27 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-12704:
---
Status: Ready to Commit  (was: Patch Available)

> snapshot build never be able to publish to mvn artifactory
> --
>
> Key: CASSANDRA-12704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12704
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Attachments: 12704-trunk.txt
>
>
> {code}
> $ ant publish
> {code}
> works fine when property "release" is set, which publishes the binaries to 
> release Artifactory.
> But for daily snapshot build, if "release" is set, it won't be snapshot build:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74
> if "release" is not set, it doesn't publish to snapshot Artifactory:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888
> I would suggest just removing the "if check" for target "publish".






[jira] [Commented] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory

2018-09-27 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631220#comment-16631220
 ] 

Jay Zhuang commented on CASSANDRA-12704:


Thanks [~michaelsembwever]. Committed to trunk as 
[{{87a}}|https://github.com/apache/cassandra/commit/87abe7249f7ad8b11235d61e048735bd6d62].

> snapshot build never be able to publish to mvn artifactory
> --
>
> Key: CASSANDRA-12704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12704
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Attachments: 12704-trunk.txt
>
>
> {code}
> $ ant publish
> {code}
> works fine when property "release" is set, which publishes the binaries to 
> release Artifactory.
> But for daily snapshot build, if "release" is set, it won't be snapshot build:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74
> if "release" is not set, it doesn't publish to snapshot Artifactory:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888
> I would suggest just removing the "if check" for target "publish".






[jira] [Updated] (CASSANDRA-12704) snapshot build never be able to publish to mvn artifactory

2018-09-27 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-12704:
---
   Resolution: Fixed
Fix Version/s: 4.0
   Status: Resolved  (was: Ready to Commit)

> snapshot build never be able to publish to mvn artifactory
> --
>
> Key: CASSANDRA-12704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12704
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.0
>
> Attachments: 12704-trunk.txt
>
>
> {code}
> $ ant publish
> {code}
> works fine when property "release" is set, which publishes the binaries to 
> release Artifactory.
> But for daily snapshot build, if "release" is set, it won't be snapshot build:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L74
> if "release" is not set, it doesn't publish to snapshot Artifactory:
> https://github.com/apache/cassandra/blob/cassandra-3.7/build.xml#L1888
> I would suggest just removing the "if check" for target "publish".






[jira] [Comment Edited] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-01 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731753#comment-16731753
 ] 

Jay Zhuang edited comment on CASSANDRA-14526 at 1/2/19 4:08 AM:


Seems the new test is failing:
{noformat}
$ pytest --cassandra-dir=/Users/zjay/ws/cassandra bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled
...
>   self.assert_log_had_msg(node3, 'Not starting client transports as bootstrap has not completed', timeout=30)
...
E   ccmlib.node.TimeoutError: 02 Jan 2019 03:45:28 [node3] Missing: ['Not starting client transports as bootstrap has not completed']:
E   INFO  [main] 2019-01-01 19:44:46,816 Config.java:4.
E   See system.log for remainder
...
{noformat}


was (Author: jay.zhuang):
Seems the test is failing for branch 2.2:
{noformat}
$ pytest --cassandra-dir=/Users/zjay/ws/cassandra 
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled
...
E   ccmlib.node.TimeoutError: 02 Jan 2019 03:45:28 [node3] 
Missing: ['Not starting client transports as bootstrap has not completed']:
E   INFO  [main] 2019-01-01 19:44:46,816 Config.java:4.
E   See system.log for remainder
...
{noformat}

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Commented] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-01 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731753#comment-16731753
 ] 

Jay Zhuang commented on CASSANDRA-14526:


Seems the test is failing for branch 2.2:
{noformat}
$ pytest --cassandra-dir=/Users/zjay/ws/cassandra 
bootstrap_test.py::TestBootstrap::test_bootstrap_binary_disabled
...
E   ccmlib.node.TimeoutError: 02 Jan 2019 03:45:28 [node3] 
Missing: ['Not starting client transports as bootstrap has not completed']:
E   INFO  [main] 2019-01-01 19:44:46,816 Config.java:4.
E   See system.log for remainder
...
{noformat}

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Commented] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-02 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732692#comment-16732692
 ] 

Jay Zhuang commented on CASSANDRA-14526:


I still see the same failure after updating the branch:
{noformat}
node3.start(jvm_args=["-Dcassandra.write_survey=true", "-Dcassandra.ring_delay_ms=5000"], wait_other_notice=True)
self.assert_log_had_msg(node3, 'Some data streaming failed', timeout=30)
>   self.assert_log_had_msg(node3, 'Not starting client transports as bootstrap has not completed', timeout=30)

bootstrap_test.py:767:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _

self = , node = 
, msg = 'Not starting client transports 
as bootstrap has not completed', timeout = 30, kwargs = {}

    def assert_log_had_msg(self, node, msg, timeout=600, **kwargs):
        """
        Wrapper for ccmlib.node.Node#watch_log_for to cause an assertion failure when a log message isn't found
        within the timeout.
        :param node: Node which logs we should watch
        :param msg: String message we expect to see in the logs.
        :param timeout: Seconds to wait for msg to appear
        """
        try:
            node.watch_log_for(msg, timeout=timeout, **kwargs)
        except TimeoutError:
>           pytest.fail("Log message was not seen within timeout:\n{0}".format(msg))
E           Failed: Log message was not seen within timeout:
E           Not starting client transports as bootstrap has not completed

dtest.py:266: Failed
{noformat}
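
For readers unfamiliar with the helper in the trace above, the wait-with-timeout pattern can be sketched in plain Python. This is an illustration only, not the ccmlib or dtest implementation: {{wait_for_message}}, {{read_lines}} and {{poll_interval}} are hypothetical names, and the log is modelled as a callable returning the lines seen so far.

```python
import time


def wait_for_message(read_lines, msg, timeout=30.0, poll_interval=0.1):
    """Poll read_lines() until some line contains msg; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        if any(msg in line for line in read_lines()):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Log message was not seen within timeout:\n{0}".format(msg)
            )
        time.sleep(poll_interval)


# Usage with an in-memory stand-in for a node's system.log.
log_lines = ["INFO  [main] starting", "ERROR Some data streaming failed"]
wait_for_message(lambda: log_lines, "Some data streaming failed", timeout=1.0)
print("message found")
```

In this sketch a message that never appears surfaces as a TimeoutError after {{timeout}} seconds, which is the situation the failing assertion reports for the "Not starting client transports" line.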

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Updated] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-15 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14526:
---
Resolution: Fixed
  Reviewer: Jay Zhuang
Status: Resolved  (was: Patch Available)

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/dtest
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Commented] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-15 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743263#comment-16743263
 ] 

Jay Zhuang commented on CASSANDRA-14526:


The new tests passed locally (except: CASSANDRA-14984, not related to this 
change or CASSANDRA-14525). Committed as 
[{{e6f58cb}}|https://github.com/apache/cassandra-dtest/commit/e6f58cb33f7a09f273c5990d5d21c7b529ba80bf].
 Thanks [~chovatia.jayd...@gmail.com].

> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/dtest
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Created] (CASSANDRA-14984) [dtest] 2 TestBootstrap tests failed for branch 2.2

2019-01-15 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14984:
--

 Summary: [dtest] 2 TestBootstrap tests failed for branch 2.2
 Key: CASSANDRA-14984
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14984
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: Jay Zhuang


Failed tests:
{noformat}
test_decommissioned_wiped_node_can_join
test_decommissioned_wiped_node_can_gossip_to_single_seed
{noformat}

Error:
{noformat}
...
# Decommision the new node and kill it
logger.debug("Decommissioning & stopping node2")
>   node2.decommission()
...
    def handle_external_tool_process(process, cmd_args):
        out, err = process.communicate()
        if (out is not None) and isinstance(out, bytes):
            out = out.decode()
        if (err is not None) and isinstance(err, bytes):
            err = err.decode()
        rc = process.returncode

        if rc != 0:
>           raise ToolError(cmd_args, rc, out, err)
E           ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'decommission'] exited with non-zero status; exit status: 2;
E           stderr: error: Thread signal failed
E           -- StackTrace --
E           java.io.IOException: Thread signal failed
{noformat}
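
The error path in the trace above (run a tool, decode its output, raise on a non-zero exit status) can be reproduced in isolation with the standard library. The sketch below is a stand-in, not ccmlib code: {{ToolFailure}} and {{run_tool}} are hypothetical names, and the Python interpreter is used in place of nodetool so the example runs anywhere.

```python
import subprocess
import sys


class ToolFailure(Exception):
    """Hypothetical stand-in for ccmlib's ToolError."""

    def __init__(self, cmd_args, exit_status, stdout, stderr):
        super().__init__(
            "Subprocess {0} exited with non-zero status; exit status: {1}; "
            "stderr: {2}".format(cmd_args, exit_status, stderr)
        )
        self.exit_status = exit_status
        self.stdout = stdout
        self.stderr = stderr


def run_tool(cmd_args):
    """Run a command and return (stdout, stderr) as text, or raise ToolFailure."""
    process = subprocess.Popen(
        cmd_args, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = process.communicate()
    out, err = out.decode(), err.decode()
    if process.returncode != 0:
        raise ToolFailure(cmd_args, process.returncode, out, err)
    return out, err


# Usage: a child process that exits with status 2, like the decommission call.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"]
try:
    run_tool(failing)
except ToolFailure as exc:
    print(exc.exit_status, exc.stderr)
```

Keeping the exit status and stderr on the exception makes it easier to tell a tool-side failure, such as the "Thread signal failed" IOException here, apart from a problem in the test itself.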






[jira] [Commented] (CASSANDRA-14526) dtest to validate Cassandra state post failed/successful bootstrap

2019-01-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742385#comment-16742385
 ] 

Jay Zhuang commented on CASSANDRA-14526:


Thanks [~chovatia.jayd...@gmail.com], the change looks good to me. Kicked off a 
build:

|| Branch || dTest ||
| 2.2 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/672/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/672/]
 |
| 3.0 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/671/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/671/]
 |
| 3.11 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/670/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/670/]
 |
| trunk | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/669/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/669/]
 |



> dtest to validate Cassandra state post failed/successful bootstrap
> --
>
> Key: CASSANDRA-14526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14526
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/dtest
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>  Labels: dtest
>
> Please find dtest here:
> || dtest ||
> | [patch 
> |https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]|






[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-12-21 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727118#comment-16727118
 ] 

Jay Zhuang commented on CASSANDRA-14525:


Rebased the code and started the tests:
|| Branch || uTest || dTest ||
| [14525-2.2|https://github.com/cooldoger/cassandra/tree/14525-2.2] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14525-2.2.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14525-2.2] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/664/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/664/] |
| [14525-3.0|https://github.com/cooldoger/cassandra/tree/14525-3.0] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14525-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14525-3.0] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/665/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/665/] |
| [14525-3.11|https://github.com/cooldoger/cassandra/tree/14525-3.11] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14525-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14525-3.11] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/666/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/666/] |
| [14525-trunk|https://github.com/cooldoger/cassandra/tree/14525-trunk] | [!https://circleci.com/gh/cooldoger/cassandra/tree/14525-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14525-trunk] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/667/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/667/] |

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jaydeepkumar Chovatia
> Assignee: Jaydeepkumar Chovatia
> Priority: Major
> Fix For: 4.0, 2.2.x, 3.0.x
>
>
> If bootstrap fails for a newly joining node (most commonly because of a 
> streaming failure), Cassandra remains in the {{joining}} state, which is 
> fine, but it also enables native transport, which makes the overall state 
> inconsistent. This in turn causes a NullPointerException if auth is enabled 
> on the new node. Steps to reproduce:
> For example, if bootstrap fails due to streaming errors like
> {quote}java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
>  at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-18.0.jar:na]
>  at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-18.0.jar:na]
>  at 
> 

[jira] [Updated] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-12-06 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14616:
---
   Resolution: Fixed
Fix Version/s: 4.0
               3.11.4
               3.0.18
       Status: Resolved  (was: Ready to Commit)

Thank you [~Stefania] for the review. Committed as 
[{{bbf7dac}}|https://github.com/apache/cassandra/commit/bbf7dac87cdc41bf8e138a99f630e7a827ad0d98].
 The dTest is committed as 
[{{325ef3f}}|https://github.com/apache/cassandra-dtest/commit/325ef3fa063252e6dad88473613abbd829e8c24d].

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
> Issue Type: Bug
> Components: Stress
> Reporter: Chris Lohfink
> Assignee: Jay Zhuang
> Priority: Major
> Fix For: 3.0.18, 3.11.4, 4.0
>
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce: {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n, it's not as bad, which is why dtests etc. don't seem to be 
> impacted. It does not occur on the 3.0 branch but does on 3.11 and trunk.






[jira] [Assigned] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-12-06 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14616:
--

Assignee: Jay Zhuang  (was: Jeremy)

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
> Issue Type: Bug
> Components: Stress
> Reporter: Chris Lohfink
> Assignee: Jay Zhuang
> Priority: Major
>
> Cassandra stress sits there for an incredibly long time after connecting to JMX. 
> To reproduce: {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n, it's not as bad, which is why dtests etc. don't seem to be 
> impacted. It does not occur on the 3.0 branch but does on 3.11 and trunk.






[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2019-01-09 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739100#comment-16739100
 ] 

Jay Zhuang commented on CASSANDRA-14525:


I'm sorry for not committing the dtest change. Will do that ASAP (I'm still 
trying to confirm a few flaky tests are not introduced by the changes).

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Core
> Reporter: Jaydeepkumar Chovatia
> Assignee: Jaydeepkumar Chovatia
> Priority: Major
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
>
> If bootstrap fails for a newly joining node (most commonly because of a 
> streaming failure), Cassandra remains in the {{joining}} state, which is 
> fine, but it also enables native transport, which makes the overall state 
> inconsistent. This in turn causes a NullPointerException if auth is enabled 
> on the new node. Steps to reproduce:
> For example, if bootstrap fails due to streaming errors like
> {quote}java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
>  at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-18.0.jar:na]
>  at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) [apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-18.0.jar:na]
>  at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-18.0.jar:na]
>  at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then the variable 
> [StorageService.java::dataAvailable|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] 
> will be {{false}}. Because {{dataAvailable}} is {{false}}, Cassandra will not call 
> [StorageService.java::finishJoiningRing|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
>  and as a result 
> 
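
The control flow described in the quoted report (a failed bootstrap leaves the node in the joining state, yet native transport is still enabled) can be illustrated with a small toy model. This is only a sketch under stated assumptions: BootstrapSketch, Node, Mode, start and isConsistent are hypothetical names and are not Cassandra classes, and the guardNativeTransport flag shows one possible guard, not necessarily what the committed patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the startup sequence described in the report; NOT Cassandra code. */
public class BootstrapSketch {
    enum Mode { JOINING, NORMAL }

    /** Hypothetical stand-in for a node's startup state. */
    static final class Node {
        Mode mode = Mode.JOINING;
        boolean nativeTransportRunning = false;
        final List<String> log = new ArrayList<>();

        /**
         * @param streamingSucceeded   plays the role of dataAvailable in the report
         * @param guardNativeTransport false models the reported behaviour,
         *                             true models one possible guard (an assumption)
         */
        void start(boolean streamingSucceeded, boolean guardNativeTransport) {
            boolean dataAvailable = streamingSucceeded;
            if (dataAvailable) {
                // Stands in for finishJoiningRing: the node leaves JOINING.
                mode = Mode.NORMAL;
                log.add("finished joining ring");
            } else {
                log.add("bootstrap failed, staying in JOINING");
            }
            // Without the guard, native transport starts regardless of the mode.
            if (!guardNativeTransport || mode == Mode.NORMAL) {
                nativeTransportRunning = true;
                log.add("native transport started");
            }
        }

        /** The state the report calls inconsistent: JOINING while serving clients. */
        boolean isConsistent() {
            return !(mode == Mode.JOINING && nativeTransportRunning);
        }
    }

    public static void main(String[] args) {
        Node unguarded = new Node();
        unguarded.start(false, false);
        System.out.println("unguarded: mode=" + unguarded.mode
                + ", nativeTransport=" + unguarded.nativeTransportRunning);

        Node guarded = new Node();
        guarded.start(false, true);
        System.out.println("guarded: mode=" + guarded.mode
                + ", nativeTransport=" + guarded.nativeTransportRunning);
    }
}
```

In the unguarded case the model ends up in JOINING with native transport running, which is the inconsistent state the report describes; with the hypothetical guard the node stays in JOINING without accepting client connections.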
