[jira] [Assigned] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi reassigned CASSANDRA-14525:


Assignee: Jaydeepkumar Chovatia
Reviewer: Dinesh Joshi

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 4.0, 2.2.x, 3.0.x
>
>

[jira] [Commented] (CASSANDRA-14114) uTest failed: NettyFactoryTest.createServerChannel_UnbindableAddress()

2018-06-14 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513359#comment-16513359
 ] 

Dinesh Joshi commented on CASSANDRA-14114:
--

This is possibly because {{net.ipv4.ip_nonlocal_bind}} is set on those hosts, 
which allows sockets to bind to non-local addresses.
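A quick probe of this hypothesis (a sketch for diagnosis, not part of the Cassandra test suite): try binding a server socket to an address the host does not own, such as the {{1.1.1.1}} used by the failing test. If the bind succeeds, the kernel likely permits non-local binds, which would explain why the expected exception never fires.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class NonLocalBindProbe {
    // Returns true if this host allowed binding to the given address.
    // On most kernels, binding to a non-local address fails with
    // EADDRNOTAVAIL, surfacing in Java as a BindException (an IOException).
    static boolean probe(String ip) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(InetAddress.getByName(ip), 0));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Environment-dependent: true on hosts with non-local bind enabled.
        System.out.println("non-local bind allowed: " + probe("1.1.1.1"));
    }
}
```

On a host where this prints {{true}}, the testcase's assumption that the bind must fail does not hold.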

> uTest failed: NettyFactoryTest.createServerChannel_UnbindableAddress()
> --
>
> Key: CASSANDRA-14114
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14114
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jay Zhuang
>Priority: Minor
>  Labels: Testing
>
> {noformat}
> [junit] Testcase: 
> createServerChannel_UnbindableAddress(org.apache.cassandra.net.async.NettyFactoryTest):
>FAILED
> [junit] Expected exception: 
> org.apache.cassandra.exceptions.ConfigurationException
> [junit] junit.framework.AssertionFailedError: Expected exception: 
> org.apache.cassandra.exceptions.ConfigurationException
> [junit]
> [junit]
> [junit] Test org.apache.cassandra.net.async.NettyFactoryTest FAILED
> {noformat}
> I'm unable to reproduce the problem on a Mac or CircleCI, but on some hosts 
> (Linux 4.4.38), it's able to bind IP {{1.1.1.1}}, or any other valid IP 
> (which breaks the test case):
> {noformat}
> ...
> [junit] INFO  [main] 2017-12-13 21:20:48,470 NettyFactory.java:190 - 
> Starting Messaging Service on /1.1.1.1:9876 , encryption: disabled
> ...
> {noformat}
> Is this because of a network/kernel configuration?
> +[~jasobrown]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Jaydeepkumar Chovatia (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-14525:
--
Fix Version/s: 3.0.x
   2.2.x
   4.0

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 4.0, 2.2.x, 3.0.x
>
>

[jira] [Updated] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Jaydeepkumar Chovatia (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-14525:
--
Description: 
If bootstrap fails for a newly joining node (most commonly because of a 
streaming failure), the node remains in the {{joining}} state, which is fine, 
but Cassandra also enables the native transport, which makes the overall state 
inconsistent. This further causes a NullPointerException if auth is enabled on 
the new node; reproducible steps follow:

For example, if bootstrap fails due to a streaming error like
{quote}java.util.concurrent.ExecutionException: 
org.apache.cassandra.streaming.StreamException: Stream failed
 at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
 ~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
 ~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
~[guava-18.0.jar:na]
 at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) 
[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
 [apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) 
[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) 
[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) 
[apache-cassandra-3.0.16.jar:3.0.16]
 at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
[apache-cassandra-3.0.16.jar:3.0.16]
 Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
 at 
org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
 ~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
 ~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) 
~[guava-18.0.jar:na]
 at 
com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
 ~[guava-18.0.jar:na]
 at 
org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
~[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{quote}
then the variable [StorageService.java::dataAvailable 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
 will be {{false}}. Since {{dataAvailable}} is {{false}}, 
[StorageService.java::finishJoiningRing 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
 will not be called, and as a result 
[StorageService.java::doAuthSetup|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L999]
 will not be invoked.

The API [StorageService.java::joinTokenRing 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L763]
 returns without any problem. After this, 
[CassandraDaemon.java::start|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/CassandraDaemon.java#L584]
 is invoked, which starts the native transport at 
 [CassandraDaemon.java::startNativeTransport 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/CassandraDaemon.java#L478]

At this point the daemon's bootstrap has still not finished, yet the transport 
is enabled. So a client will connect to the node and encounter a 
{{java.lang.NullPointerException}} as follows:
{qu
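The control flow described above can be condensed into a small sketch. The field and method names mirror the linked {{StorageService}}/{{CassandraDaemon}} members but this is an illustration of the reported sequence, not the actual Cassandra source:

```java
// Sketch of the reported bug: joinTokenRing() survives a streaming failure
// without throwing, so the daemon proceeds to start the native transport
// even though doAuthSetup() was never reached.
public class BootstrapSketch {
    static boolean dataAvailable = false;        // streaming failed
    static boolean authSetupDone = false;
    static boolean nativeTransportRunning = false;

    static void joinTokenRing() {
        if (dataAvailable) {
            finishJoiningRing();                 // skipped on stream failure
        }
        // returns normally either way -- this is the gap
    }

    static void finishJoiningRing() {
        authSetupDone = true;                    // doAuthSetup() lives here
    }

    static void start() {
        nativeTransportRunning = true;           // startNativeTransport()
    }

    public static void main(String[] args) {
        joinTokenRing();                         // bootstrap fails silently
        start();                                 // transport comes up anyway
        // prints: transport=true authSetup=false
        System.out.println("transport=" + nativeTransportRunning
                + " authSetup=" + authSetupDone);
    }
}
```

The ending state (transport up, auth never initialized) is exactly the combination that makes authenticated client connections hit a NullPointerException.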

[jira] [Updated] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Jaydeepkumar Chovatia (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-14525:
--
Assignee: (was: Jaydeepkumar Chovatia)
  Status: Patch Available  (was: Open)

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Priority: Major
>

[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Jaydeepkumar Chovatia (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513350#comment-16513350
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-14525:
---

In my opinion, the daemon should enable the native transport only after a 
successful bootstrap, to avoid this inconsistent state.
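The proposed behavior can be sketched as a guard in the daemon's startup path. This is a hypothetical illustration of the idea (the method names mirror {{CassandraDaemon}}, but it is not the actual patch linked below):

```java
// Sketch of the proposed fix: gate the native transport on bootstrap
// having completed successfully, so a half-joined node never accepts
// client connections.
public class GuardedStartSketch {
    static boolean nativeTransportRunning = false;

    static void startNativeTransport() {
        nativeTransportRunning = true;
    }

    static void start(boolean bootstrapSucceeded) {
        if (!bootstrapSucceeded) {
            // Leave the transport down: clients see a connection refusal
            // instead of a NullPointerException from missing auth state.
            return;
        }
        startNativeTransport();
    }

    public static void main(String[] args) {
        start(false);  // failed bootstrap -> transport stays down
        start(true);   // successful bootstrap -> transport comes up
        System.out.println("transport=" + nativeTransportRunning);
    }
}
```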

Please find patch with the fix here:
||trunk||3.0||2.x||
|[!https://circleci.com/gh/jaydeepkumar1984/cassandra/tree/14525-trunk.svg?style=svg!|https://circleci.com/gh/jaydeepkumar1984/cassandra/76]|[!https://circleci.com/gh/jaydeepkumar1984/cassandra/tree/14525-3.0.svg?style=svg!|https://circleci.com/gh/jaydeepkumar1984/cassandra/73]|[!https://circleci.com/gh/jaydeepkumar1984/cassandra/tree/14525-2.2.svg?style=svg!|https://circleci.com/gh/jaydeepkumar1984/cassandra/74]|
|[patch|https://github.com/apache/cassandra/compare/trunk...jaydeepkumar1984:14525-trunk]|[patch|https://github.com/apache/cassandra/compare/trunk...jaydeepkumar1984:14525-3.0]|[patch|https://github.com/apache/cassandra/compare/trunk...jaydeepkumar1984:14525-2.2]|

Please review it and let me know your opinion.

 

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
>

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513345#comment-16513345
 ] 

Kurt Greaves commented on CASSANDRA-14423:
--

Patches for all branches:

|[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...kgreav:14423-2.2]|
|[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...kgreav:14423-3.0]|
|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...kgreav:14423-3.11]|

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> We're seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions or as candidates for compaction. It gets 
> progressively worse until only 1-2 SSTables remain in the view, which happen to 
> be the most recent SSTables, and thus compactions completely stop for that 
> table.
> The SSTables still seem to be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User-defined/major compactions 
> still work; it's unclear whether they add the result back into the view, but 
> either way this is not a good workaround.
> This also results in a discrepancy between the SSTable count and the SSTables 
> in levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also, for STCS we've confirmed that the SSTable count will differ from the 
> number of SSTables reported in the compaction buckets. In the example below 
> there are only 3 SSTables in a single bucket; no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstone

[jira] [Commented] (CASSANDRA-14470) Repair validation failed/unable to create merkle tree

2018-06-14 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513335#comment-16513335
 ] 

Kurt Greaves commented on CASSANDRA-14470:
--

Sounds likely. If you're happy with that, do you mind if we close the ticket?

> Repair validation failed/unable to create merkle tree
> -
>
> Key: CASSANDRA-14470
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14470
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Harry Hough
>Priority: Major
>
> I had trouble with a full repair across all nodes and keyspaces, so I switched 
> to repairing table by table. This table will not repair even after a 
> scrub/restart of all nodes. I am using the command:
> {code:java}
> nodetool repair -full -seq keyspace table
> {code}
> {code:java}
> [2018-05-25 19:26:36,525] Repair session 0198ee50-6050-11e8-a3b7-9d0793eab507 
> for range [(165598500763544933,166800441975877433], 
> (-5455068259072262254,-5445777107512274819], 
> (-4614366950466274594,-4609359222424798148], 
> (3417371506258365094,3421921915575816226], 
> (5221788898381458942,5222846663270250559], 
> (3421921915575816226,3429175540277204991], 
> (3276484330153091115,3282213186258578546], 
> (-3306169730424140596,-3303439264231406101], 
> (5228704360821395206,5242415853745535023], 
> (5808045095951939338,5808562658315740708], 
> (-3303439264231406101,-3302592736123212969]] finished (progress: 1%)
> [2018-05-25 19:27:23,848] Repair session 0180f980-6050-11e8-a3b7-9d0793eab507 
> for range [(-8495158945319933291,-8482949618583319581], 
> (1803296697741516342,1805330812863783941], 
> (8633191319643427141,8637771071728131257], 
> (2214097236323810344,2218253238829661319], 
> (8637771071728131257,8639627594735133685], 
> (2195525904029414718,2214097236323810344], 
> (-8500127431270773970,-8495158945319933291], 
> (7151693083782264341,7152162989417914407], 
> (-8482949618583319581,-8481973749935314249]] finished (progress: 1%)
> [2018-05-25 19:30:32,590] Repair session 01ac9d62-6050-11e8-a3b7-9d0793eab507 
> for range [(7887346492105510731,7893062759268864220], 
> (-153277717939330979,-151986584968539220], 
> (-6351665356961460262,-6336288442758847669], 
> (7881942012672602731,7887346492105510731], 
> (-5884528383037906783,-5878097817437987368], 
> (6054625594262089428,6060773114960761336], 
> (-6354401100436622515,-6351665356961460262], 
> (3358411934943460772,336336663817876], 
> (6255644242745576360,6278718135193665575], 
> (-6321106762570843270,-6316788220143151823], 
> (1754319239259058661,1759314644652031521], 
> (7893062759268864220,7894890594190784729], 
> (-8012293411840276426,-8011781808288431224]] failed with error [repair 
> #01ac9d62-6050-11e8-a3b7-9d0793eab507 on keyspace/table, 
> [(7887346492105510731,7893062759268864220], 
> (-153277717939330979,-151986584968539220], 
> (-6351665356961460262,-6336288442758847669], 
> (7881942012672602731,7887346492105510731],
> (-5884528383037906783,-5878097817437987368], 
> (6054625594262089428,6060773114960761336], 
> (-6354401100436622515,-6351665356961460262], 
> (3358411934943460772,336336663817876], 
> (6255644242745576360,6278718135193665575], 
> (-6321106762570843270,-6316788220143151823], 
> (1754319239259058661,1759314644652031521], 
> (7893062759268864220,7894890594190784729], 
> (-8012293411840276426,-8011781808288431224]]] Validation failed in 
> /192.168.8.64 (progress: 1%)
> [2018-05-25 19:30:38,744] Repair session 01ab16c1-6050-11e8-a3b7-9d0793eab507 
> for range [(4474598255414218354,4477186372547790770], 
> (-8368931070988054567,-8367389908801757978], 
> (4445104759712094068,4445123832517144036], 
> (6749641233379918040,6749879473217708908], 
> (717627050679001698,729408043324000761], 
> (8984622403893999385,8990662643404904110], 
> (4457612694557846994,4474598255414218354], 
> (5589049422573545528,5593079877787783784], 
> (3609693317839644945,3613727999875360405], 
> (8499016262183246473,8504603366117127178], 
> (-5421277973540712245,-5417725796037372830], 
> (5586405751301680690,5589049422573545528], 
> (-2611069890590917549,-2603911539353128123], 
> (2424772330724108233,2427564448454334730], 
> (3172651438220766183,3175226710613527829], 
> (4445123832517144036,4457612694557846994], 
> (-6827531712183440570,-6800863837312326365], 
> (5593079877787783784,5596020904874304252], 
> (716705770783505310,717627050679001698], 
> (115377252345874298,119626359210683992], 
> (239394377432130766,240250561347730054]] failed with error [repair 
> #01ab16c1-6050-11e8-a3b7-9d0793eab507 on keyspace/table, 
> [(4474598255414218354,4477186372547790770], 
> (-8368931070988054567,-8367389908801757978], 
> (4445104759712094068,4445123832517144036], 
> (6749641233379918040,6749879473217708908], 
> (717627050679001698,729408043324000761], 
> (8984622403893999385,899

[jira] [Created] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2018-06-14 Thread Jaydeepkumar Chovatia (JIRA)
Jaydeepkumar Chovatia created CASSANDRA-14525:
-

 Summary: streaming failure during bootstrap makes new node into 
inconsistent state
 Key: CASSANDRA-14525
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jaydeepkumar Chovatia
Assignee: Jaydeepkumar Chovatia


If bootstrap fails for a newly joining node (the most common reason being a 
streaming failure), the node's state remains {{joining}}, which is fine, but 
Cassandra also enables the native transport, which makes the overall state 
inconsistent. This further causes a NullPointerException if auth is enabled on 
the new node; please find reproducible steps here:

For example, if bootstrap fails due to streaming errors like
{quote}
java.util.concurrent.ExecutionException: 
org.apache.cassandra.streaming.StreamException: Stream failed
at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
~[guava-18.0.jar:na]
at 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) 
[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
 [apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) 
[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) 
[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) 
[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
[apache-cassandra-3.0.16.jar:3.0.16]
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
at 
org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) 
~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
 ~[guava-18.0.jar:na]
at 
org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
~[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 ~[apache-cassandra-3.0.16.jar:3.0.16]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
{quote}


then the variable [StorageService.java::dataAvailable 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
 will be {{false}}. Since {{dataAvailable}} is {{false}}, it will not 
call [StorageService.java::finishJoiningRing 
|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
 and as a result 
[StorageService.java::doAuthSetup|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L999]
 will not be invoked. 
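The flow described above can be sketched as a toy model. This is NOT Cassandra's actual code; the class and method names ({{BootstrapFlowSketch}}, {{endsInconsistent}}, the boolean flags) are invented for illustration, and only the control-flow shape is taken from the report: a streaming failure leaves {{dataAvailable}} false, so {{finishJoiningRing()}} (and with it {{doAuthSetup()}}) is skipped, yet no exception escapes and the daemon still goes on to start the native transport.

```java
// Minimal sketch, assuming invented names; not Cassandra's actual code.
public class BootstrapFlowSketch {
    boolean authSetupDone = false;        // stands in for doAuthSetup()
    boolean nativeTransportStarted = false;

    void joinTokenRing(boolean dataAvailable) {
        if (dataAvailable) {
            finishJoiningRing();          // only runs when streaming succeeded
        }
        // On failure the node simply stays "joining"; no exception escapes,
        // so the daemon keeps going.
    }

    void finishJoiningRing() {
        authSetupDone = true;             // doAuthSetup() happens in here
    }

    /** True when the node serves client connections without auth being set up. */
    public static boolean endsInconsistent(boolean streamingSucceeded) {
        BootstrapFlowSketch node = new BootstrapFlowSketch();
        node.joinTokenRing(streamingSucceeded);  // dataAvailable tracks streaming
        node.nativeTransportStarted = true;      // CassandraDaemon.start() runs anyway
        return node.nativeTransportStarted && !node.authSetupDone;
    }

    public static void main(String[] args) {
        System.out.println(endsInconsistent(false)); // failed bootstrap: true
        System.out.println(endsInconsistent(true));  // clean bootstrap:  false
    }
}
```

The sketch only makes the inconsistency explicit: with a failed bootstrap the node ends up accepting client connections while auth setup never ran.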

The API [StorageService.java::joinTokenRing | 
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L763]
 returns without any problem. After this 
[CassandraDaemon.java::start|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/CassandraDaemon.java#L5

[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Kurt Greaves (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-14423:
-
Reproduced In: 3.11.2, 3.11.0  (was: 3.11.0)

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So, seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions or as candidates for compaction. It seems to 
> get progressively worse until there are only 1-2 SSTables in the view, which 
> happen to be the most recent SSTables, and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User-defined/major compactions 
> still work; it's not clear whether they include the result back in the view, 
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also, for STCS we've confirmed that the SSTable count will be different from 
> the number of SSTables reported in the compaction buckets. In the below example 
> there are only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy 
> Compaction buckets are 
> [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-6716

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513294#comment-16513294
 ] 

Kurt Greaves commented on CASSANDRA-14423:
--

In 
{{org.apache.cassandra.db.compaction.CompactionManager#submitAntiCompaction}} 
we create a transaction over all SSTables included in the repair (including 
repaired SSTables when doing full repair) and pass that through to 
{{performAntiCompaction}} in which two things can happen:

1. The SSTable is fully contained within the repairing ranges, in which case 
we mutate repairedAt to the current time of repair and add it to 
{{mutatedRepairStatuses}}.
2. The SSTable isn't fully contained within the repairing ranges (highly likely 
with vnodes, or with single tokens when there are more than RF nodes). In this 
case we don't add the _already repaired_ SSTable to {{mutatedRepairStatuses}}.

We then remove all SSTables from the transaction in {{mutatedRepairStatuses}} 
[here|https://github.com/apache/cassandra/blob/191ad7b87a4ded26be4ab0bd192ef676f059276c/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L704].

If *2* occurred, the already repaired SSTables were not in 
{{mutatedRepairStatuses}} and thus didn't get removed from the transaction; 
when {{txn.finish()}} is called they get removed from the CompactionStrategy's 
view of sstables via 
{{org.apache.cassandra.db.lifecycle.LifecycleTransaction#doCommit}} calling 
{{Tracker#notifySSTablesChanged}}, which will not include the already repaired 
SSTables.

The reason CASSANDRA-13153 brought this bug to light is that up until that 
point we _were_ anti-compacting already repaired SSTables, and thus upon 
anti-compaction (rewrite) they would be added back into the transaction; the 
old SSTable would be removed as usual and the new SSTable would take its place.

Seeing as the existing consensus seems to be that there's no real value at the 
moment in mutating repaired times on already repaired SSTables, I think the best 
solution is to not include the repaired SSTables in the transaction in the 
first place. This corresponds with how trunk currently works, is a lot cleaner, 
and is how my patch mentioned above works. The alternative would be to remove 
them from the transaction regardless of whether they were mutated, but this 
seems pointless considering we don't do anything with them. If we ever decide 
there is value in updating repairedAt on already repaired SSTables, we can add 
it back and handle it then. 
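The bookkeeping above can be sketched as a toy model. This is NOT Cassandra's API; the names ({{viewAfterTxnFinish}}, the string "SSTables") are invented for illustration, and the rewrite of genuinely anti-compacted SSTables is deliberately omitted to focus on the leak in case 2: an already repaired, partially contained SSTable never enters {{mutatedRepairStatuses}}, stays in the transaction, and is dropped from the strategy's view with no replacement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model with assumed names; not Cassandra's actual classes or API.
public class AntiCompactionSketch {
    /**
     * Returns the SSTables remaining in the strategy's view after the
     * transaction finishes. 'txn' holds the SSTables included in the repair;
     * 'fullyContained' are those fully inside the repairing ranges (case 1).
     */
    public static Set<String> viewAfterTxnFinish(Set<String> view,
                                                 Set<String> txn,
                                                 Set<String> fullyContained) {
        Set<String> mutatedRepairStatuses = new HashSet<>();
        for (String sstable : txn) {
            if (fullyContained.contains(sstable)) {
                mutatedRepairStatuses.add(sstable); // case 1: repairedAt mutated
            }
            // case 2: already repaired, partially contained -> skipped
        }
        Set<String> remaining = new HashSet<>(txn);
        remaining.removeAll(mutatedRepairStatuses); // the removal linked above
        // txn.finish(): whatever is still in the transaction is dropped from
        // the strategy's view (notifySSTablesChanged), with no replacement
        // here because the case-2 SSTables were never rewritten.
        Set<String> result = new HashSet<>(view);
        result.removeAll(remaining);
        return result;
    }

    public static void main(String[] args) {
        Set<String> view = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> txn = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> contained = new HashSet<>(Arrays.asList("a"));
        Set<String> left = viewAfterTxnFinish(view, txn, contained);
        System.out.println(left.contains("a")); // true: case 1, stays visible
        System.out.println(left.contains("b")); // false: case 2, lost from the view
    }
}
```

In the model, excluding "b" from the transaction up front (the proposed fix) would leave it in the view untouched.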

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>

[jira] [Updated] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Kurt Greaves (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-14423:
-
Fix Version/s: 3.0.17
   2.2.13

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>

[jira] [Commented] (CASSANDRA-10751) "Pool is shutdown" error when running Hadoop jobs on Yarn

2018-06-14 Thread Cyril Scetbon (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513111#comment-16513111
 ] 

Cyril Scetbon commented on CASSANDRA-10751:
---

[~michaelsembwever], It's a problem on 2.1.14. I haven't check on another 
version

> "Pool is shutdown" error when running Hadoop jobs on Yarn
> -
>
> Key: CASSANDRA-10751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10751
> Project: Cassandra
>  Issue Type: Bug
> Environment: Hadoop 2.7.1 (HDP 2.3.2)
> Cassandra 2.1.11
>Reporter: Cyril Scetbon
>Assignee: Cyril Scetbon
>Priority: Major
> Fix For: 4.0, 2.2.13, 3.0.17, 3.11.3
>
> Attachments: CASSANDRA-10751-2.2.patch, CASSANDRA-10751-3.0.patch, 
> output.log
>
>
> Trying to execute a Hadoop job on Yarn, I get errors from Cassandra's 
> internal code. It seems that connections are shut down, but we can't 
> understand why.
> Here is an extract of the errors. I have also attached a file with the 
> complete debug logs.
> {code}
> 15/11/22 20:05:54 [main]: DEBUG core.RequestHandler: Error querying 
> node006.internal.net/192.168.12.22:9042, trying next host (error is: 
> com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown)
> Failed with exception java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> 15/11/22 20:05:54 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> java.io.IOException: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1672)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:132)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:674)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:324)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446)
>   ... 15 more
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 
> host(s) tried for query failed (tried: 
> node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>   at 
> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:

[jira] [Comment Edited] (CASSANDRA-10751) "Pool is shutdown" error when running Hadoop jobs on Yarn

2018-06-14 Thread Cyril Scetbon (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513111#comment-16513111
 ] 

Cyril Scetbon edited comment on CASSANDRA-10751 at 6/14/18 11:07 PM:
-

[~michaelsembwever], It's a problem on 2.1.14. I haven't checked on another 
version


was (Author: cscetbon):
[~michaelsembwever], It's a problem on 2.1.14. I haven't check on another 
version

> "Pool is shutdown" error when running Hadoop jobs on Yarn
> -
>
> Key: CASSANDRA-10751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10751
> Project: Cassandra
>  Issue Type: Bug
> Environment: Hadoop 2.7.1 (HDP 2.3.2)
> Cassandra 2.1.11
>Reporter: Cyril Scetbon
>Assignee: Cyril Scetbon
>Priority: Major
> Fix For: 4.0, 2.2.13, 3.0.17, 3.11.3
>
> Attachments: CASSANDRA-10751-2.2.patch, CASSANDRA-10751-3.0.patch, 
> output.log
>
>

[jira] [Commented] (CASSANDRA-10751) "Pool is shutdown" error when running Hadoop jobs on Yarn

2018-06-14 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513079#comment-16513079
 ] 

mck commented on CASSANDRA-10751:
-

Andy, you were quite right, I did a terrible job of reviewing that; no idea
how I connected the methods when I looked at it.

[~cscetbon], what version of Cassandra are you running? Can you confirm it's a
problem on 2.2+?

> "Pool is shutdown" error when running Hadoop jobs on Yarn
> -
>
> Key: CASSANDRA-10751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10751
> Project: Cassandra
>  Issue Type: Bug
> Environment: Hadoop 2.7.1 (HDP 2.3.2)
> Cassandra 2.1.11
>Reporter: Cyril Scetbon
>Assignee: Cyril Scetbon
>Priority: Major
> Fix For: 4.0, 2.2.13, 3.0.17, 3.11.3
>
> Attachments: CASSANDRA-10751-2.2.patch, CASSANDRA-10751-3.0.patch, 
> output.log
>
>
> Trying to execute a Hadoop job on Yarn, I get errors from Cassandra's 
> internal code. It seems that connections are shut down, but we can't 
> understand why.
> Here is an extract of the errors. I have also attached a file with the 
> complete debug logs.
> {code}
> 15/11/22 20:05:54 [main]: DEBUG core.RequestHandler: Error querying 
> node006.internal.net/192.168.12.22:9042, trying next host (error is: 
> com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown)
> Failed with exception java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> 15/11/22 20:05:54 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> java.io.IOException: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1672)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:132)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:674)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:324)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446)
>   ... 15 more
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 
> host(s) tried for query failed (tried: 
> node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(No

[jira] [Commented] (CASSANDRA-10751) "Pool is shutdown" error when running Hadoop jobs on Yarn

2018-06-14 Thread Cyril Scetbon (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512948#comment-16512948
 ] 

Cyril Scetbon commented on CASSANDRA-10751:
---

Hey [~jjordan] [~michaelsembwever], are you saying that it wasn't needed even 
before? I can guarantee that it was; it has been running for almost 2 years 
now in production. If it's not needed anymore, then great!

> "Pool is shutdown" error when running Hadoop jobs on Yarn
> -
>
> Key: CASSANDRA-10751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10751
> Project: Cassandra
>  Issue Type: Bug
> Environment: Hadoop 2.7.1 (HDP 2.3.2)
> Cassandra 2.1.11
>Reporter: Cyril Scetbon
>Assignee: Cyril Scetbon
>Priority: Major
> Fix For: 4.0, 2.2.13, 3.0.17, 3.11.3
>
> Attachments: CASSANDRA-10751-2.2.patch, CASSANDRA-10751-3.0.patch, 
> output.log
>
>
> Trying to execute a Hadoop job on Yarn, I get errors from Cassandra's 
> internal code. It seems that connections are shut down, but we can't 
> understand why.
> Here is an extract of the errors. I have also attached a file with the 
> complete debug logs.
> {code}
> 15/11/22 20:05:54 [main]: DEBUG core.RequestHandler: Error querying 
> node006.internal.net/192.168.12.22:9042, trying next host (error is: 
> com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown)
> Failed with exception java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> 15/11/22 20:05:54 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> java.io.IOException: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1672)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:132)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:674)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:324)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446)
>   ... 15 more
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 
> host(s) tried for query failed (tried: 
> node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> com.datastax.driver.core.exceptions.NoHostAvailableException.

[jira] [Updated] (CASSANDRA-14524) Client metrics refactor

2018-06-14 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14524:
--
Status: Patch Available  (was: Open)

> Client metrics refactor
> ---
>
> Key: CASSANDRA-14524
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14524
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> Moving refactoring out of CASSANDRA-14458 to be reviewed separately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14524) Client metrics refactor

2018-06-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512902#comment-16512902
 ] 

ASF GitHub Bot commented on CASSANDRA-14524:


GitHub user clohfink opened a pull request:

https://github.com/apache/cassandra/pull/235

Refactor client metrics for CASSANDRA-14524



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/clohfink/cassandra nativestatsrefactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra/pull/235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #235


commit 51f1ed997e71283451b02de018b0d934127582ba
Author: Chris Lohfink 
Date:   2018-06-14T19:32:45Z

Refactor client metrics for CASSANDRA-14524




> Client metrics refactor
> ---
>
> Key: CASSANDRA-14524
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14524
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> Moving refactoring out of CASSANDRA-14458 to be reviewed separately.






[jira] [Created] (CASSANDRA-14524) Client metrics refactor

2018-06-14 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-14524:
-

 Summary: Client metrics refactor
 Key: CASSANDRA-14524
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14524
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Lohfink
Assignee: Chris Lohfink


Moving refactoring out of CASSANDRA-14458 to be reviewed separately.






[jira] [Updated] (CASSANDRA-14522) sstableloader options assume the rpc/native interface is the same as the internode interface

2018-06-14 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14522:
-
Labels: lhf  (was: )

> sstableloader options assume the rpc/native interface is the same as the 
> internode interface
> 
>
> Key: CASSANDRA-14522
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14522
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jeremy Hanna
>Priority: Major
>  Labels: lhf
>
> Currently, in the LoaderOptions for the BulkLoader, the user can give a list 
> of initial host addresses. That list is used both for the initial connection 
> to the cluster and for streaming the sstables. If you have two physical 
> interfaces, one for rpc and the other for internode traffic, then the bulk 
> loader won't currently work. It will throw an error such as:
> {quote}
> > sstableloader -v -u cassadmin -pw xxx -d 
> > 10.133.210.101,10.133.210.102,10.133.210.103,10.133.210.104 
> > /var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl
> Established connection to initial hosts
> Opening sstables and calculating sections to stream
> Streaming relevant part of 
> /var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl/mc-1-big-Data.db 
> /var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl/mc-2-big-Data.db  
> to [/10.133.210.101, /10.133.210.103, /10.133.210.102, /10.133.210.104]
> progress: total: 100% 0  MB/s(avg: 0 MB/s)ERROR 10:16:05,311 [Stream 
> #9ed00130-6ff6-11e8-965c-93a78bf96e60] Streaming error occurred
> java.net.ConnectException: Connection refused
> at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_101]
> at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_101]
> at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_101]
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) 
> ~[na:1.8.0_101]
> at java.nio.channels.SocketChannel.open(SocketChannel.java:189) 
> ~[na:1.8.0_101]
> at 
> org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:60)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:266)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:86)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:253) 
> ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:212)
>  [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_101]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  ~[netty-all-4.0.54.Final.jar:4.0.54.Final]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
> ERROR 10:16:05,312 [Stream #9ed00130-6ff6-11e8-965c-93a78bf96e60] Streaming 
> error occurred
> java.net.ConnectException: Connection refused
> at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_101]
> at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_101]
> at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_101]
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) 
> ~[na:1.8.0_101]
> at java.nio.channels.SocketChannel.open(SocketChannel.java:189) 
> ~[na:1.8.0_101]
> at 
> org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:60)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:266)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:86)
>  ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:253) 
> ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:212)
>  [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:11

[jira] [Commented] (CASSANDRA-14513) Reverse order queries in presence of range tombstones may cause permanent data loss

2018-06-14 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512878#comment-16512878
 ] 

Aleksey Yeschenko commented on CASSANDRA-14513:
---

Committed the dtests as 
[51c8352020b8df3fe04344ae88c29d2a73a228bd|https://github.com/apache/cassandra-dtest/commit/51c8352020b8df3fe04344ae88c29d2a73a228bd].

> Reverse order queries in presence of range tombstones may cause permanent 
> data loss
> ---
>
> Key: CASSANDRA-14513
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14513
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Blocker
> Fix For: 4.0, 3.0.17, 3.11.3
>
>
> Slice queries in descending sort order can create oversized artificial range 
> tombstones. At CL > ONE, read repair can propagate these tombstones to all 
> replicas, wiping out vast data ranges that they mistakenly cover.
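The failure pattern described above can be sketched in CQL. This is an illustrative reconstruction based on the regression dtest committed in this thread (the schema and query shapes are taken from that test); it is not part of the original report:

{code}
-- time-series style table from the regression dtest
CREATE TABLE journals.logs (
    user text, year int, month int, day int, title text, body text,
    PRIMARY KEY ((user), year, month, day, title)
);

-- deleting whole days creates range tombstones in the latest rows of the partition
DELETE FROM journals.logs WHERE user = 'beobal' AND year = 2018 AND month = 1 AND day = 1;

-- a reverse-order slice over the same partition; before the fix, the read
-- path could materialize an oversized artificial range tombstone here, and
-- read repair at CL > ONE could then propagate it to all replicas,
-- permanently deleting rows from earlier years
SELECT * FROM journals.logs WHERE user = 'beobal' AND year < 2018 ORDER BY year DESC;
{code}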






cassandra-dtest git commit: Add reproduction/regression tests for CASSANDRA-14513

2018-06-14 Thread aleksey
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master cf1e5a66a -> 51c835202


Add reproduction/regression tests for CASSANDRA-14513

patch by Aleksey Yeschenko; reviewed by Sam Tunnicliffe for
CASSANDRA-14513


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/51c83520
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/51c83520
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/51c83520

Branch: refs/heads/master
Commit: 51c8352020b8df3fe04344ae88c29d2a73a228bd
Parents: cf1e5a6
Author: Aleksey Yeschenko 
Authored: Mon Jun 11 15:32:31 2018 +0100
Committer: Aleksey Yeschenko 
Committed: Tue Jun 12 16:28:14 2018 +0100

--
 consistency_test.py | 135 ++-
 1 file changed, 134 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/51c83520/consistency_test.py
--
diff --git a/consistency_test.py b/consistency_test.py
index d6b957a..8a8d8cf 100644
--- a/consistency_test.py
+++ b/consistency_test.py
@@ -8,7 +8,7 @@ from collections import OrderedDict, namedtuple
 from copy import deepcopy
 
 from cassandra import ConsistencyLevel, consistency_value_to_name
-from cassandra.query import SimpleStatement
+from cassandra.query import BatchStatement, BatchType, SimpleStatement
 
 from tools.assertions import (assert_all, assert_length_equal, assert_none,
   assert_unavailable)
@@ -768,6 +768,139 @@ class TestAccuracy(TestHelper):
 class TestConsistency(Tester):
 
 @since('3.0')
+    def test_14513_transient(self):
+        """
+        @jira_ticket CASSANDRA-14513
+
+        A reproduction / regression test to illustrate CASSANDRA-14513:
+        transient data loss when doing reverse-order queries with range
+        tombstones in place.
+
+        This test shows how the bug can cause queries to return invalid
+        results by just a single node.
+        """
+        cluster = self.cluster
+
+        # set column_index_size_in_kb to 1 for a slightly easier reproduction sequence
+        cluster.set_configuration_options(values={'column_index_size_in_kb': 1})
+
+        cluster.populate(1).start(wait_other_notice=True)
+        node1 = cluster.nodelist()[0]
+
+        session = self.patient_cql_connection(node1)
+
+        query = "CREATE KEYSPACE journals WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1};"
+        session.execute(query)
+
+        query = 'CREATE TABLE journals.logs (user text, year int, month int, day int, title text, body text, PRIMARY KEY ((user), year, month, day, title));'
+        session.execute(query)
+
+        # populate the table
+        stmt = session.prepare('INSERT INTO journals.logs (user, year, month, day, title, body) VALUES (?, ?, ?, ?, ?, ?);')
+        for year in range(2011, 2018):
+            for month in range(1, 13):
+                for day in range(1, 31):
+                    session.execute(stmt, ['beobal', year, month, day, 'title', 'Lorem ipsum dolor sit amet'], ConsistencyLevel.ONE)
+        node1.flush()
+
+        # make sure the data is there
+        assert_all(session,
+                   "SELECT COUNT(*) FROM journals.logs WHERE user = 'beobal' AND year < 2018 ORDER BY year DESC;",
+                   [[7 * 12 * 30]],
+                   cl=ConsistencyLevel.ONE)
+
+        # generate an sstable with an RT that opens in the penultimate block and closes in the last one
+        stmt = session.prepare('DELETE FROM journals.logs WHERE user = ? AND year = ? AND month = ? AND day = ?;')
+        batch = BatchStatement(batch_type=BatchType.UNLOGGED)
+        for day in range(1, 31):
+            batch.add(stmt, ['beobal', 2018, 1, day])
+        session.execute(batch)
+        node1.flush()
+
+        # the data should still be there for years 2011-2017, but prior to CASSANDRA-14513 it would've been gone
+        assert_all(session,
+                   "SELECT COUNT(*) FROM journals.logs WHERE user = 'beobal' AND year < 2018 ORDER BY year DESC;",
+                   [[7 * 12 * 30]],
+                   cl=ConsistencyLevel.ONE)
+
+    @since('3.0')
+    def test_14513_permanent(self):
+        """
+        @jira_ticket CASSANDRA-14513
+
+        A reproduction / regression test to illustrate CASSANDRA-14513:
+        permanent data loss when doing reverse-order queries with range
+        tombstones in place.
+
+        This test shows how the invalid RT can propagate to other replicas
+        and delete data permanently.
+        """
+        cluster = self.cluster
+
+        # disable hinted handoff and set batch commit log so this doesn't interfere with the test
+        # set column

[jira] [Updated] (CASSANDRA-14523) Thread pool stats virtual table

2018-06-14 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14523:
--
Description: 
Expose the thread pools like in status logger/tpstats. Additionally, it would 
be nice to include the scheduled executor pools that are currently unmonitored.
{code:java}
cqlsh> select * from system_views.thread_pools;

 thread_pool                      | active | active_max | completed | pending | tasks_blocked | total_blocked
----------------------------------+--------+------------+-----------+---------+---------------+---------------
               anti_entropy_stage |      0 |          1 |         0 |       0 |             0 |             0
           cache_cleanup_executor |      0 |          1 |         0 |       0 |             0 |             0
              compaction_executor |      0 |          4 |        41 |       0 |             0 |             0
           counter_mutation_stage |      0 |         32 |         0 |       0 |             0 |             0
                     gossip_stage |      0 |          1 |         0 |       0 |             0 |             0
                 hints_dispatcher |      0 |          2 |         0 |       0 |             0 |             0
          internal_response_stage |      0 |          8 |         0 |       0 |             0 |             0
            memtable_flush_writer |      0 |          2 |         5 |       0 |             0 |             0
              memtable_post_flush |      0 |          1 |        20 |       0 |             0 |             0
          memtable_reclaim_memory |      0 |          1 |         5 |       0 |             0 |             0
                  migration_stage |      0 |          1 |         0 |       0 |             0 |             0
                       misc_stage |      0 |          1 |         0 |       0 |             0 |             0
                   mutation_stage |      0 |         32 |       247 |       0 |             0 |             0
        native_transport_requests |      1 |        128 |        28 |       0 |             0 |             0
         pending_range_calculator |      0 |          1 |         2 |       0 |             0 |             0
 per_disk_memtable_flush_writer_0 |      0 |          2 |         5 |       0 |             0 |             0
                read_repair_stage |      0 |          8 |         0 |       0 |             0 |             0
                       read_stage |      0 |         32 |        13 |       0 |             0 |             0
                      repair_task |      0 | 2147483647 |         0 |       0 |             0 |             0
           request_response_stage |      0 |          8 |         0 |       0 |             0 |             0
                          sampler |      0 |          1 |         0 |       0 |             0 |             0
             scheduled_fast_tasks |      0 | 2147483647 |      1398 |       1 |             0 |             0
              scheduled_heartbeat |      0 | 2147483647 |        14 |       1 |             0 |             0
        scheduled_hotness_tracker |      0 | 2147483647 |         0 |       1 |             0 |             0
     scheduled_non_periodic_tasks |      0 | 2147483647 |        10 |       0 |             0 |             0
         scheduled_optional_tasks |      0 | 2147483647 |         5 |       8 |             0 |             0
        scheduled_summary_builder |      0 | 2147483647 |         0 |       1 |             0 |             0
                  scheduled_tasks |      0 | 2147483647 |       194 |      74 |             0 |             0
       secondary_index_management |      0 |          1 |         0 |       0 |             0 |             0
              validation_executor |      0 | 2147483647 |         0 |       0 |             0 |             0
              view_build_executor |      0 |          1 |         0 |       0 |             0 |             0
              view_mutation_stage |      0 |         32 |         0 |       0 |             0 |             0
{code}

  was:
Expose the thread pools like in status logger/tpstats. Additionally, it would 
be nice to include the scheduled executor pools that are currently unmonitored.

{code}
cqlsh> select * from system_views.thread_pools;

 stage                            | active | active_max | completed | pending | tasks_blocked | total_blocked
----------------------------------+--------+------------+-----------+---------+---------------+---------------
               anti_entropy_stage |      0 |          1 |         0 |       0 |             0 |             0
           cache_cleanup_executor |      0 |          1 |         0 |       0 |             0 |             0
              compaction_executor |      0 |          4 |        41 |       0 |             0 |             0
 

[jira] [Created] (CASSANDRA-14523) Thread pool stats virtual table

2018-06-14 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-14523:
-

 Summary: Thread pool stats virtual table
 Key: CASSANDRA-14523
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14523
 Project: Cassandra
  Issue Type: New Feature
Reporter: Chris Lohfink
Assignee: Chris Lohfink


Expose the thread pools like in status logger/tpstats. Additionally, it would 
be nice to include the scheduled executor pools that are currently unmonitored.

{code}
cqlsh> select * from system_views.thread_pools;

 stage                            | active | active_max | completed | pending | tasks_blocked | total_blocked
----------------------------------+--------+------------+-----------+---------+---------------+---------------
               anti_entropy_stage |      0 |          1 |         0 |       0 |             0 |             0
           cache_cleanup_executor |      0 |          1 |         0 |       0 |             0 |             0
              compaction_executor |      0 |          4 |        41 |       0 |             0 |             0
           counter_mutation_stage |      0 |         32 |         0 |       0 |             0 |             0
                     gossip_stage |      0 |          1 |         0 |       0 |             0 |             0
                 hints_dispatcher |      0 |          2 |         0 |       0 |             0 |             0
          internal_response_stage |      0 |          8 |         0 |       0 |             0 |             0
            memtable_flush_writer |      0 |          2 |         5 |       0 |             0 |             0
              memtable_post_flush |      0 |          1 |        20 |       0 |             0 |             0
          memtable_reclaim_memory |      0 |          1 |         5 |       0 |             0 |             0
                  migration_stage |      0 |          1 |         0 |       0 |             0 |             0
                       misc_stage |      0 |          1 |         0 |       0 |             0 |             0
                   mutation_stage |      0 |         32 |       247 |       0 |             0 |             0
        native_transport_requests |      1 |        128 |        28 |       0 |             0 |             0
         pending_range_calculator |      0 |          1 |         2 |       0 |             0 |             0
 per_disk_memtable_flush_writer_0 |      0 |          2 |         5 |       0 |             0 |             0
                read_repair_stage |      0 |          8 |         0 |       0 |             0 |             0
                       read_stage |      0 |         32 |        13 |       0 |             0 |             0
                      repair_task |      0 | 2147483647 |         0 |       0 |             0 |             0
           request_response_stage |      0 |          8 |         0 |       0 |             0 |             0
                          sampler |      0 |          1 |         0 |       0 |             0 |             0
             scheduled_fast_tasks |      0 | 2147483647 |      1398 |       1 |             0 |             0
              scheduled_heartbeat |      0 | 2147483647 |        14 |       1 |             0 |             0
        scheduled_hotness_tracker |      0 | 2147483647 |         0 |       1 |             0 |             0
     scheduled_non_periodic_tasks |      0 | 2147483647 |        10 |       0 |             0 |             0
         scheduled_optional_tasks |      0 | 2147483647 |         5 |       8 |             0 |             0
        scheduled_summary_builder |      0 | 2147483647 |         0 |       1 |             0 |             0
                  scheduled_tasks |      0 | 2147483647 |       194 |      74 |             0 |             0
       secondary_index_management |      0 |          1 |         0 |       0 |             0 |             0
              validation_executor |      0 | 2147483647 |         0 |       0 |             0 |             0
              view_build_executor |      0 |          1 |         0 |       0 |             0 |             0
              view_mutation_stage |      0 |         32 |         0 |       0 |             0 |             0
{code}
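A virtual table like this could be queried from cqlsh like any regular table. A hypothetical example (the column names and the idea of restricting by the pool-name key are taken from the proposed layout above, not from any released schema; the shown values mirror the sample output):

{code}
cqlsh> select stage, active, active_max, completed from system_views.thread_pools where stage = 'mutation_stage';

 stage          | active | active_max | completed
----------------+--------+------------+-----------
 mutation_stage |      0 |         32 |       247
{code}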






[jira] [Created] (CASSANDRA-14522) sstableloader options assume the rpc/native interface is the same as the internode interface

2018-06-14 Thread Jeremy Hanna (JIRA)
Jeremy Hanna created CASSANDRA-14522:


 Summary: sstableloader options assume the rpc/native interface is 
the same as the internode interface
 Key: CASSANDRA-14522
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14522
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jeremy Hanna


Currently, in the LoaderOptions for the BulkLoader, the user can give a list of 
initial host addresses. That list is used both for the initial connection to 
the cluster and for streaming the sstables. If you have two physical 
interfaces, one for rpc and the other for internode traffic, then the bulk 
loader won't currently work. It will throw an error such as:

{quote}
> sstableloader -v -u cassadmin -pw xxx -d 
> 10.133.210.101,10.133.210.102,10.133.210.103,10.133.210.104 
> /var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of 
/var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl/mc-1-big-Data.db 
/var/lib/cassandra/commitlog/backup_tmp/test_bkup/bkup_tbl/mc-2-big-Data.db  to 
[/10.133.210.101, /10.133.210.103, /10.133.210.102, /10.133.210.104]
progress: total: 100% 0  MB/s(avg: 0 MB/s)ERROR 10:16:05,311 [Stream 
#9ed00130-6ff6-11e8-965c-93a78bf96e60] Streaming error occurred
java.net.ConnectException: Connection refused
at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_101]
at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_101]
at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_101]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) 
~[na:1.8.0_101]
at java.nio.channels.SocketChannel.open(SocketChannel.java:189) 
~[na:1.8.0_101]
at 
org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:60)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:266)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:86)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:253) 
~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:212)
 [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_101]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 ~[netty-all-4.0.54.Final.jar:4.0.54.Final]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
ERROR 10:16:05,312 [Stream #9ed00130-6ff6-11e8-965c-93a78bf96e60] Streaming 
error occurred
java.net.ConnectException: Connection refused
at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_101]
at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_101]
at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_101]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) 
~[na:1.8.0_101]
at java.nio.channels.SocketChannel.open(SocketChannel.java:189) 
~[na:1.8.0_101]
at 
org.apache.cassandra.tools.BulkLoadConnectionFactory.createConnection(BulkLoadConnectionFactory.java:60)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamSession.createConnection(StreamSession.java:266)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:86)
 ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:253) 
~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:212)
 [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_101]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 [cassandra-all-3.0.15.2128.jar:3.0.15.2128]
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 ~[netty-all-4.0.54.Final.jar:4.0.54.Final]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]

[jira] [Updated] (CASSANDRA-14513) Reverse order queries in presence of range tombstones may cause permanent data loss

2018-06-14 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14513:

   Resolution: Fixed
Fix Version/s: (was: 4.0.x)
   (was: 3.11.x)
   (was: 3.0.x)
   3.11.3
   3.0.17
   4.0
   Status: Resolved  (was: Ready to Commit)

Thanks. Committed to 3.0 as {{eb91942f64972bef04c4e965dcdf788ae1f21a60}} and 
merged to 3.11 & trunk.

> Reverse order queries in presence of range tombstones may cause permanent 
> data loss
> ---
>
> Key: CASSANDRA-14513
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14513
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Blocker
> Fix For: 4.0, 3.0.17, 3.11.3
>
>
> Slice queries in descending sort order can create oversized artificial range 
> tombstones. At CL > ONE, read repair can propagate these tombstones to all 
> replicas, wiping out vast data ranges that they mistakenly cover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[2/6] cassandra git commit: Update current index block pointer when slice precedes partition start

2018-06-14 Thread samt
Update current index block pointer when slice precedes partition start

When reverse iterating an indexed sstable partition, if the end bound of
the query slice is < the first unfiltered in the partition, the current
index block pointer is not updated. This causes the reader to incorrectly
jump to the end of the partition and start reading from there once the
initial empty iterator has been consumed.

Patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko and Blake Eggleston
for CASSANDRA-14513
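
The invariant fixed by this patch can be illustrated with a toy model (hypothetical class and field names, not Cassandra's real API): even when the requested slice precedes all data in the partition and yields an empty iterator, the current index block pointer must still be updated, otherwise the reader later resumes from a stale position at the end of the partition.

```python
# Toy sketch of the CASSANDRA-14513 invariant. IndexState, set_to_block and
# the field names here are illustrative stand-ins, not Cassandra's classes.

class IndexState:
    def __init__(self, block_offsets):
        self.block_offsets = block_offsets      # file offset of each index block
        self.current_block = len(block_offsets) # reversed reads start past the end
        self.mark = None                        # remembered file position

    def set_to_block(self, block_idx):
        # Mirrors the shape of the patched setToBlock(): mark immediately
        # after seeking, and record the block index unconditionally, even
        # for an out-of-range index such as -1.
        if 0 <= block_idx < len(self.block_offsets):
            self.mark = self.block_offsets[block_idx]  # seek + mark together
        self.current_block = block_idx

state = IndexState([0, 4096, 8192])
# The slice's end bound precedes the first unfiltered, so the computed start
# index is -1. The fix calls set_to_block(-1) before returning the empty
# iterator instead of leaving the pointer past the end of the partition.
state.set_to_block(-1)
print(state.current_block)  # -1
```

Without the unconditional update, `current_block` would still point past the last index block, which is the stale state that caused the reader to jump to the end of the partition.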


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb91942f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb91942f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb91942f

Branch: refs/heads/cassandra-3.11
Commit: eb91942f64972bef04c4e965dcdf788ae1f21a60
Parents: 897b55a
Author: Sam Tunnicliffe 
Authored: Fri Jun 8 12:57:54 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:15:57 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fbcd2..ebf8764 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
  * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
  * Add Missing dependencies in pom-all (CASSANDRA-14422)
  * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index f9e6545..386b2c8 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@ -420,12 +420,12 @@ abstract class AbstractSSTableIterator implements 
SliceableUnfilteredRowIterator
 if (blockIdx >= 0 && blockIdx < indexes.size())
 {
 reader.seekToPosition(columnOffset(blockIdx));
+mark = reader.file.mark();
 reader.deserializer.clearState();
 }
 
 currentIndexIdx = blockIdx;
 reader.openMarker = blockIdx > 0 ? indexes.get(blockIdx - 
1).endOpenMarker : null;
-mark = reader.file.mark();
 
 // If we're reading an old format file and we move to the first 
block in the index (i.e. the
 // head of the partition), we skip the static row as it's already 
been read when we first opened

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
index 76d8c4d..d5b46a4 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
@@ -281,6 +281,7 @@ public class SSTableReversedIterator extends 
AbstractSSTableIterator
 if (startIdx < 0)
 {
 iterator = Collections.emptyIterator();
+indexState.setToBlock(startIdx);
 return;
 }
 





[3/6] cassandra git commit: Update current index block pointer when slice precedes partition start

2018-06-14 Thread samt
Update current index block pointer when slice precedes partition start

When reverse iterating an indexed sstable partition, if the end bound of
the query slice is < the first unfiltered in the partition, the current
index block pointer is not updated. This causes the reader to incorrectly
jump to the end of the partition and start reading from there once the
initial empty iterator has been consumed.

Patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko and Blake Eggleston
for CASSANDRA-14513


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb91942f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb91942f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb91942f

Branch: refs/heads/trunk
Commit: eb91942f64972bef04c4e965dcdf788ae1f21a60
Parents: 897b55a
Author: Sam Tunnicliffe 
Authored: Fri Jun 8 12:57:54 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:15:57 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fbcd2..ebf8764 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
  * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
  * Add Missing dependencies in pom-all (CASSANDRA-14422)
  * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index f9e6545..386b2c8 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@ -420,12 +420,12 @@ abstract class AbstractSSTableIterator implements 
SliceableUnfilteredRowIterator
 if (blockIdx >= 0 && blockIdx < indexes.size())
 {
 reader.seekToPosition(columnOffset(blockIdx));
+mark = reader.file.mark();
 reader.deserializer.clearState();
 }
 
 currentIndexIdx = blockIdx;
 reader.openMarker = blockIdx > 0 ? indexes.get(blockIdx - 
1).endOpenMarker : null;
-mark = reader.file.mark();
 
 // If we're reading an old format file and we move to the first 
block in the index (i.e. the
 // head of the partition), we skip the static row as it's already 
been read when we first opened

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
index 76d8c4d..d5b46a4 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
@@ -281,6 +281,7 @@ public class SSTableReversedIterator extends 
AbstractSSTableIterator
 if (startIdx < 0)
 {
 iterator = Collections.emptyIterator();
+indexState.setToBlock(startIdx);
 return;
 }
 





[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-06-14 Thread samt
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/191ad7b8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/191ad7b8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/191ad7b8

Branch: refs/heads/trunk
Commit: 191ad7b87a4ded26be4ab0bd192ef676f059276c
Parents: f0403b4 eb91942
Author: Sam Tunnicliffe 
Authored: Thu Jun 14 18:16:23 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:18:47 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/CHANGES.txt
--
diff --cc CHANGES.txt
index 083f480,ebf8764..e807340
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -3.0.17
 +3.11.3
 + * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
 + * Reduce nodetool GC thread count (CASSANDRA-14475)
 + * Fix New SASI view creation during Index Redistribution (CASSANDRA-14055)
 + * Remove string formatting lines from BufferPool hot path (CASSANDRA-14416)
 + * Update metrics to 3.1.5 (CASSANDRA-12924)
 + * Detect OpenJDK jvm type and architecture (CASSANDRA-12793)
 + * Don't use guava collections in the non-system keyspace jmx attributes 
(CASSANDRA-12271)
 + * Allow existing nodes to use all peers in shadow round (CASSANDRA-13851)
 + * Fix cqlsh to read connection.ssl cqlshrc option again (CASSANDRA-14299)
 + * Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
 + * CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
 + * Serialize empty buffer as empty string for json output format 
(CASSANDRA-14245)
 + * Allow logging implementation to be interchanged for embedded testing 
(CASSANDRA-13396)
 + * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
 + * Fix Loss of digits when doing CAST from varint/bigint to decimal 
(CASSANDRA-14170)
 + * RateBasedBackPressure unnecessarily invokes a lock on the Guava 
RateLimiter (CASSANDRA-14163)
 + * Fix wildcard GROUP BY queries (CASSANDRA-14209)
 +Merged from 3.0:
+  * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
   * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
   * Add Missing dependencies in pom-all (CASSANDRA-14422)
   * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --cc 
src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index c15416f,386b2c8..4eaf8f6
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@@ -483,15 -417,15 +483,15 @@@ public abstract class AbstractSSTableIt
  // Sets the reader to the beginning of blockIdx.
  public void setToBlock(int blockIdx) throws IOException
  {
 -if (blockIdx >= 0 && blockIdx < indexes.size())
 +if (blockIdx >= 0 && blockIdx < indexEntry.columnsIndexCount())
  {
  reader.seekToPosition(columnOffset(blockIdx));
+ mark = reader.file.mark();
  reader.deserializer.clearState();
  }
  
  currentIndexIdx = blockIdx;
 -reader.openMarker = blockIdx > 0 ? indexes.get(blockIdx - 
1).endOpenMarker : null;
 +reader.openMarker = blockIdx > 0 ? index(blockIdx - 
1).endOpenMarker : null;
- mark = reader.file.mark();
  
  // If we're reading an old format file and we move to the first 
block in the index (i.e. the
  // head of the partition), we skip the static row as it's already 
been read when we first opened

http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--





[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-06-14 Thread samt
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/191ad7b8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/191ad7b8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/191ad7b8

Branch: refs/heads/cassandra-3.11
Commit: 191ad7b87a4ded26be4ab0bd192ef676f059276c
Parents: f0403b4 eb91942
Author: Sam Tunnicliffe 
Authored: Thu Jun 14 18:16:23 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:18:47 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/CHANGES.txt
--
diff --cc CHANGES.txt
index 083f480,ebf8764..e807340
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,22 -1,5 +1,23 @@@
 -3.0.17
 +3.11.3
 + * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
 + * Reduce nodetool GC thread count (CASSANDRA-14475)
 + * Fix New SASI view creation during Index Redistribution (CASSANDRA-14055)
 + * Remove string formatting lines from BufferPool hot path (CASSANDRA-14416)
 + * Update metrics to 3.1.5 (CASSANDRA-12924)
 + * Detect OpenJDK jvm type and architecture (CASSANDRA-12793)
 + * Don't use guava collections in the non-system keyspace jmx attributes 
(CASSANDRA-12271)
 + * Allow existing nodes to use all peers in shadow round (CASSANDRA-13851)
 + * Fix cqlsh to read connection.ssl cqlshrc option again (CASSANDRA-14299)
 + * Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
 + * CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
 + * Serialize empty buffer as empty string for json output format 
(CASSANDRA-14245)
 + * Allow logging implementation to be interchanged for embedded testing 
(CASSANDRA-13396)
 + * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
 + * Fix Loss of digits when doing CAST from varint/bigint to decimal 
(CASSANDRA-14170)
 + * RateBasedBackPressure unnecessarily invokes a lock on the Guava 
RateLimiter (CASSANDRA-14163)
 + * Fix wildcard GROUP BY queries (CASSANDRA-14209)
 +Merged from 3.0:
+  * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
   * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
   * Add Missing dependencies in pom-all (CASSANDRA-14422)
   * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --cc 
src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index c15416f,386b2c8..4eaf8f6
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@@ -483,15 -417,15 +483,15 @@@ public abstract class AbstractSSTableIt
  // Sets the reader to the beginning of blockIdx.
  public void setToBlock(int blockIdx) throws IOException
  {
 -if (blockIdx >= 0 && blockIdx < indexes.size())
 +if (blockIdx >= 0 && blockIdx < indexEntry.columnsIndexCount())
  {
  reader.seekToPosition(columnOffset(blockIdx));
+ mark = reader.file.mark();
  reader.deserializer.clearState();
  }
  
  currentIndexIdx = blockIdx;
 -reader.openMarker = blockIdx > 0 ? indexes.get(blockIdx - 
1).endOpenMarker : null;
 +reader.openMarker = blockIdx > 0 ? index(blockIdx - 
1).endOpenMarker : null;
- mark = reader.file.mark();
  
  // If we're reading an old format file and we move to the first 
block in the index (i.e. the
  // head of the partition), we skip the static row as it's already 
been read when we first opened

http://git-wip-us.apache.org/repos/asf/cassandra/blob/191ad7b8/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--





[1/6] cassandra git commit: Update current index block pointer when slice precedes partition start

2018-06-14 Thread samt
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 897b55a6b -> eb91942f6
  refs/heads/cassandra-3.11 f0403b4e9 -> 191ad7b87
  refs/heads/trunk 3b56d4df4 -> 255242237


Update current index block pointer when slice precedes partition start

When reverse iterating an indexed sstable partition, if the end bound of
the query slice is < the first unfiltered in the partition, the current
index block pointer is not updated. This causes the reader to incorrectly
jump to the end of the partition and start reading from there once the
initial empty iterator has been consumed.

Patch by Sam Tunnicliffe; reviewed by Aleksey Yeschenko and Blake Eggleston
for CASSANDRA-14513


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb91942f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb91942f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb91942f

Branch: refs/heads/cassandra-3.0
Commit: eb91942f64972bef04c4e965dcdf788ae1f21a60
Parents: 897b55a
Author: Sam Tunnicliffe 
Authored: Fri Jun 8 12:57:54 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:15:57 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94fbcd2..ebf8764 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Reverse order queries with range tombstones can cause data loss 
(CASSANDRA-14513)
  * Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
  * Add Missing dependencies in pom-all (CASSANDRA-14422)
  * Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index f9e6545..386b2c8 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@ -420,12 +420,12 @@ abstract class AbstractSSTableIterator implements 
SliceableUnfilteredRowIterator
 if (blockIdx >= 0 && blockIdx < indexes.size())
 {
 reader.seekToPosition(columnOffset(blockIdx));
+mark = reader.file.mark();
 reader.deserializer.clearState();
 }
 
 currentIndexIdx = blockIdx;
 reader.openMarker = blockIdx > 0 ? indexes.get(blockIdx - 
1).endOpenMarker : null;
-mark = reader.file.mark();
 
 // If we're reading an old format file and we move to the first 
block in the index (i.e. the
 // head of the partition), we skip the static row as it's already 
been read when we first opened

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb91942f/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
index 76d8c4d..d5b46a4 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
@@ -281,6 +281,7 @@ public class SSTableReversedIterator extends 
AbstractSSTableIterator
 if (startIdx < 0)
 {
 iterator = Collections.emptyIterator();
+indexState.setToBlock(startIdx);
 return;
 }
 





[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2018-06-14 Thread samt
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/25524223
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/25524223
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/25524223

Branch: refs/heads/trunk
Commit: 255242237a49fb27740ab9da187eaa41b0611947
Parents: 3b56d4d 191ad7b
Author: Sam Tunnicliffe 
Authored: Thu Jun 14 18:19:36 2018 +0100
Committer: Sam Tunnicliffe 
Committed: Thu Jun 14 18:22:28 2018 +0100

--
 CHANGES.txt| 1 +
 .../cassandra/db/columniterator/AbstractSSTableIterator.java   | 2 +-
 .../cassandra/db/columniterator/SSTableReversedIterator.java   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/25524223/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/25524223/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
--
diff --cc 
src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
index 443fe49,4eaf8f6..cfc7da2
--- 
a/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java
@@@ -455,7 -492,19 +456,6 @@@ public abstract class AbstractSSTableIt
  
  currentIndexIdx = blockIdx;
  reader.openMarker = blockIdx > 0 ? index(blockIdx - 
1).endOpenMarker : null;
- mark = reader.file.mark();
 -
 -// If we're reading an old format file and we move to the first 
block in the index (i.e. the
 -// head of the partition), we skip the static row as it's already 
been read when we first opened
 -// the iterator. If we don't do this and a static row is present, 
we'll re-read it but treat it
 -// as a regular row, causing deserialization to blow up later as 
that row's flags will be invalid
 -// see CASSANDRA-12088 & CASSANDRA-13236
 -if (!reader.version.storeRows()
 -&& blockIdx == 0
 -&& reader.deserializer.hasNext()
 -&& reader.deserializer.nextIsStatic())
 -{
 -reader.deserializer.skipNext();
 -}
  }
  
  private long columnOffset(int i) throws IOException

http://git-wip-us.apache.org/repos/asf/cassandra/blob/25524223/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--





[jira] [Commented] (CASSANDRA-14471) Manage audit whitelists with CQL

2018-06-14 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512772#comment-16512772
 ] 

Aleksey Yeschenko commented on CASSANDRA-14471:
---

Please no. I don't have a suggestion of my own, but I can't say I'm a fan of 
what's being suggested here, at all.

> Manage audit whitelists with CQL
> 
>
> Key: CASSANDRA-14471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14471
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Per Otterström
>Priority: Major
>  Labels: audit, security
> Fix For: 4.0
>
>
> Since CASSANDRA-12151 is merged we have support for audit logs in Cassandra. 
> With this ticket I want to explore the idea of managing audit whitelists 
> using CQL.
>  I can think of a few different benefits compared to current yaml-based 
> whitelist/blacklist approach.
>  * Nodes would always be aligned - no risk that node configuration goes out of 
> sync as tables are added and whitelists updated.
>  * Easier to manage whitelists in large clusters - change in one place and 
> apply cluster wide.
>  * Changes to the whitelists would be in the audit log itself.
>   







[jira] [Commented] (CASSANDRA-14521) With server-generated timestamps, INSERT after DELETE may not be applied

2018-06-14 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512751#comment-16512751
 ] 

Jeff Jirsa commented on CASSANDRA-14521:


What you're trying to do is guarantee serializability, so I suspect the real 
workaround is to use SERIAL consistency on the DELETE so that it consults the 
paxos history and increments the timestamp as appropriate (in case of timestamp 
collision). 

> With server-generated timestamps, INSERT after DELETE may not be applied
> 
>
> Key: CASSANDRA-14521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14521
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Julien
>Priority: Minor
>
> We use server-generated timestamps for all requests because of 
> CASSANDRA-14304.
> The scenario is basically the following:
> {code}
> INSERT INTO mytable(id) VALUES ('1');
> DELETE FROM mytable  WHERE id='1';
> INSERT INTO mytable(id) VALUES ('1');
> SELECT * FROM mytable WHERE id='1';
> {code}
> SELECT _sometimes_ does not return anything when the java driver has 
> {{CassandraClientConnector.with(ServerSideTimestampGenerator.INSTANCE);}} and 
> the Cassandra cluster has 3 nodes and replication-factor:3.
> This scenario actually works as expected with CQL because I don't know how to 
> force the usage of server-generated timestamps with CQL. Is it possible?
> It also works correctly with a single Cassandra node.
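
A plausible mechanism for the reported behavior, sketched as a toy model (the `reconcile` function and its cell tuples are assumptions for illustration, not Cassandra code): with server-generated timestamps, the DELETE and the second INSERT can be assigned the same millisecond timestamp, and Cassandra's conflict resolution lets a tombstone shadow a live cell on a timestamp tie, so the re-inserted row stays invisible.

```python
# Toy model of last-write-wins reconciliation with a timestamp tie.
# A cell is (timestamp_micros, is_tombstone); this mirrors the commonly
# documented rule that on equal timestamps a delete wins over a write.

def reconcile(cell_a, cell_b):
    # Higher timestamp wins outright.
    if cell_a[0] != cell_b[0]:
        return cell_a if cell_a[0] > cell_b[0] else cell_b
    # Timestamp tie: the tombstone shadows the live cell.
    return cell_a if cell_a[1] else cell_b

delete = (1528972565000, True)   # tombstone from the DELETE
insert = (1528972565000, False)  # live cell from the re-INSERT, same timestamp
winner = reconcile(delete, insert)
print("row visible:", not winner[1])  # row visible: False
```

This is why the problem only surfaces sometimes: it requires the two server-assigned timestamps to collide, which is more likely across multiple coordinators than on a single node.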



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512497#comment-16512497
 ] 

Stefan Podkowinski commented on CASSANDRA-14423:


{quote} and thus they get "removed" from the compaction strategies SSTables 
along with the unrepaired SSTables that got anti-compacted.{quote}

Where exactly does this happen?

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User defined/major compactions 
> still work - it is not clear whether they include the result back in the view,
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that the SSTable count will differ from the 
> number of SSTables reported in the compaction buckets. In the below example 
> there are only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cass

[jira] [Commented] (CASSANDRA-14521) With server-generated timestamps, INSERT after DELETE may not be applied

2018-06-14 Thread Julien (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512434#comment-16512434
 ] 

Julien commented on CASSANDRA-14521:


Priority decreased as I have a workaround.

My application can detect that the DELETE statement will be followed by an 
INSERT, making the DELETE unnecessary as long as the INSERT sets all the 
necessary columns.

> With server-generated timestamps, INSERT after DELETE may not be applied
> 
>
> Key: CASSANDRA-14521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14521
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Julien
>Priority: Minor
>
> We use server-generated timestamps for all requests because of 
> CASSANDRA-14304.
> The scenario is basically the following:
> {code}
> INSERT INTO mytable(id) VALUES ('1');
> DELETE FROM mytable  WHERE id='1';
> INSERT INTO mytable(id) VALUES ('1');
> SELECT * FROM mytable WHERE id='1';
> {code}
> SELECT _sometimes_ does not return anything when the java driver has 
> {{CassandraClientConnector.with(ServerSideTimestampGenerator.INSTANCE);}} and 
> the Cassandra cluster has 3 nodes and replication-factor:3.
> This scenario actually works as expected with plain CQL, though that may be 
> because I don't know how to force the use of server-generated timestamps 
> with CQL. Is it possible?
> It also works correctly with a single Cassandra node.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14521) With server-generated timestamps, INSERT after DELETE may not be applied

2018-06-14 Thread Julien (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien updated CASSANDRA-14521:
---
Priority: Minor  (was: Major)

> With server-generated timestamps, INSERT after DELETE may not be applied
> 
>
> Key: CASSANDRA-14521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14521
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Julien
>Priority: Minor
>
> We use server-generated timestamps for all requests because of 
> CASSANDRA-14304.
> The scenario is basically the following:
> {code}
> INSERT INTO mytable(id) VALUES ('1');
> DELETE FROM mytable  WHERE id='1';
> INSERT INTO mytable(id) VALUES ('1');
> SELECT * FROM mytable WHERE id='1';
> {code}
> SELECT _sometimes_ does not return anything when the java driver has 
> {{CassandraClientConnector.with(ServerSideTimestampGenerator.INSTANCE);}} and 
> the Cassandra cluster has 3 nodes and replication-factor:3.
> This scenario actually works as expected with plain CQL, though that may be 
> because I don't know how to force the use of server-generated timestamps 
> with CQL. Is it possible?
> It also works correctly with a single Cassandra node.






[jira] [Updated] (CASSANDRA-14497) Add Role login cache

2018-06-14 Thread Sam Tunnicliffe (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14497:

Fix Version/s: 4.0
   Status: Patch Available  (was: Open)

Sorry it's a bit late, but I found some time to get my patch tidied up. It goes 
a bit beyond the scope of the original description to ensure that all Role info 
can be served from the cache: login privilege, superuser status, custom role 
options as well as the member-of list.
\\
\\
||Branch||CI||
|[trunk|https://github.com/beobal/cassandra/tree/14497-trunk]|[CircleCI|https://circleci.com/workflow-run/c4ed5b53-a454-4a57-8373-5517562dd553]|

> Add Role login cache
> 
>
> Key: CASSANDRA-14497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Jay Zhuang
>Assignee: Sam Tunnicliffe
>Priority: Major
>  Labels: security
> Fix For: 4.0
>
>
> The 
> [{{ClientState.login()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ClientState.java#L313]
>  function is used for all auth message: 
> [{{AuthResponse.java:82}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/AuthResponse.java#L82].
>  But the 
> [{{role.canLogin}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L521]
>  information is not cached. So it hits the database every time: 
> [{{CassandraRoleManager.java:407}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L407].
> For a cluster with lots of new connections, this causes performance issues. 
> The mitigation for us is to increase the {{system_auth}} replication factor 
> to match the number of nodes, so 
> [{{local_one}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L488]
>  would be very cheap. The P99 dropped immediately, but I don't think it is a 
> good solution.
> I would propose adding {{Role.canLogin}} to the RolesCache to improve auth 
> performance.
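The proposed fix amounts to memoizing the {{canLogin}} lookup. A minimal
Python sketch of the idea (illustrative only, assuming a simple TTL policy;
this is not Cassandra's actual RolesCache code): a time-bounded cache in
front of a loader function that stands in for the {{system_auth}} query.

```python
import time

class TTLCache:
    """Serve role attributes such as canLogin from memory instead of
    hitting the backing store on every connection (hypothetical sketch)."""

    def __init__(self, ttl_seconds, loader):
        self.ttl = ttl_seconds
        self.loader = loader        # stands in for the system_auth query
        self.entries = {}           # role -> (value, expiry)

    def get(self, role):
        entry = self.entries.get(role)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit: no database read
        value = self.loader(role)                # cache miss: one database read
        self.entries[role] = (value, now + self.ttl)
        return value

db_reads = []
def load_can_login(role):
    db_reads.append(role)   # stands in for a SELECT on system_auth roles
    return role != "nologin"

cache = TTLCache(ttl_seconds=2.0, loader=load_can_login)
assert cache.get("app_user") is True
assert cache.get("app_user") is True   # second call served from cache
assert len(db_reads) == 1              # only one backing-store read
```

In the real patch the cache sits alongside the existing roles/permissions
caches, so validity and refresh settings would be shared with them.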






[jira] [Updated] (CASSANDRA-14515) Short read protection in presence of almost-purgeable range tombstones may cause permanent data loss

2018-06-14 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14515:
--
Reviewer: Blake Eggleston

> Short read protection in presence of almost-purgeable range tombstones may 
> cause permanent data loss
> 
>
> Key: CASSANDRA-14515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14515
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> Because read responses don't necessarily close their open RT bounds, it's 
> possible to lose data during short read protection, if a closing bound is 
> compacted away between two adjacent reads from a node.






[jira] [Commented] (CASSANDRA-14513) Reverse order queries in presence of range tombstones may cause permanent data loss

2018-06-14 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512325#comment-16512325
 ] 

Aleksey Yeschenko commented on CASSANDRA-14513:
---

+1

> Reverse order queries in presence of range tombstones may cause permanent 
> data loss
> ---
>
> Key: CASSANDRA-14513
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14513
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> Slice queries in descending sort order can create oversized artificial range 
> tombstones. At CL > ONE, read repair can propagate these tombstones to all 
> replicas, wiping out vast data ranges that they mistakenly cover.






[jira] [Updated] (CASSANDRA-14513) Reverse order queries in presence of range tombstones may cause permanent data loss

2018-06-14 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14513:
--
Status: Ready to Commit  (was: Patch Available)

> Reverse order queries in presence of range tombstones may cause permanent 
> data loss
> ---
>
> Key: CASSANDRA-14513
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14513
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> Slice queries in descending sort order can create oversized artificial range 
> tombstones. At CL > ONE, read repair can propagate these tombstones to all 
> replicas, wiping out vast data ranges that they mistakenly cover.






[jira] [Commented] (CASSANDRA-14471) Manage audit whitelists with CQL

2018-06-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512321#comment-16512321
 ] 

Per Otterström commented on CASSANDRA-14471:


I've been playing around with a few ideas on how this could be done. My 
thoughts so far:

*CQL*
 We need to create new CQL commands to manage what goes into the audit log and 
not. We also need a new permission which a role would need in order to use 
these new CQL commands. Finally, we need a way to list what is whitelisted and 
not.

After some experimenting, I propose to create two new CQL commands called MUTE 
and UNMUTE. Example:
 MUTE SELECT ON ks.tbl TO user

By default all roles would be UNMUTEd on all permissions/resources. The MUTE 
command would add the permission/resource/role combination to the whitelist. 
The UNMUTE command would remove the same entry from the whitelist. The 
signature of MUTE/UNMUTE would be identical to that of GRANT/REVOKE. By default 
roles with super_user would have permission to use MUTE/UNMUTE. Other users 
would need the WHITELIST permission and the "operation" permission on the 
relevant permission/resource combination being whitelisted. Example:
 cassandra> GRANT WHITELIST ON ks.tbl to user;
 cassandra> GRANT SELECT ON ks.tbl to user;
 cassandra> LOGIN user;
 user> MUTE SELECT ON ks.tbl to other_user;

Given the nature and similarities of MUTE/UNMUTE and GRANT/REVOKE, I think it 
makes sense to implement the new statement classes based on the existing 
PermissionsManagementStatement, which should probably be renamed in the 
process to reveal its wider use.

Listing what is muted could be a bit tricky. It would make sense to just extend 
on the existing LIST command while still staying fully backwards compatible. 
Currently these are all valid commands:
 LIST ALL;
 LIST ALL PERMISSIONS;
 LIST MODIFY;
 LIST MODIFY PERMISSION;

By that logic the following commands should be used to list who has permission 
to manage whitelists:
 LIST WHITELIST;
 LIST WHITELIST PERMISSION;

So to list what is actually whitelisted you'd have to type things like:
 LIST MODIFY WHITELIST;
 LIST ALL WHITELISTS;

And consequently we'd get the slightly odd:
 LIST WHITELIST WHITELIST;

*Persisting whitelists*
 I propose that whitelists are stored in a new table called 
system_auth.role_whitelists. The role_whitelists table would be similar to the 
role_permissions table. Role name is partition key, resource is cluster key. 
Whitelisted operations are stored as a set of whitelisted permissions. 
Whitelists should be cached, preferably using the existing framework used by 
permissions and roles.
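The proposed table maps naturally to a nested structure: role as partition
key, resource as clustering key, and a set of muted permissions per row. A
toy Python model of the MUTE/UNMUTE/lookup semantics described above (the
function names and in-memory layout are illustrative assumptions, not an
implementation):

```python
from collections import defaultdict

# role -> resource -> set of muted permissions, mirroring the proposed
# system_auth.role_whitelists layout (role = partition key,
# resource = clustering key, permissions stored as a set).
whitelists = defaultdict(lambda: defaultdict(set))

def mute(permission, resource, role):
    # MUTE <permission> ON <resource> TO <role>
    whitelists[role][resource].add(permission)

def unmute(permission, resource, role):
    # UNMUTE <permission> ON <resource> TO <role>
    whitelists[role][resource].discard(permission)

def is_muted(permission, resource, role):
    # True if this operation should be left out of the audit log.
    return permission in whitelists[role][resource]

mute("SELECT", "ks.tbl", "other_user")   # MUTE SELECT ON ks.tbl TO other_user
assert is_muted("SELECT", "ks.tbl", "other_user")
unmute("SELECT", "ks.tbl", "other_user")
assert not is_muted("SELECT", "ks.tbl", "other_user")
```

The audit-log filter would consult {{is_muted}} (via the cache) on each
logged operation, which is why caching the table matters.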

*Connection resource*
 Currently it is possible to whitelist the fact that a user is connecting to 
the cluster. I believe this an important feature for some use cases of audit 
logs. In order to represent this we must create a new ConnectionResource type, 
call it "connections". I can think of two different sub-types 
"connections/native", "connections/jmx". But I think we should leave the jmx 
part out of the picture for now. The Permission used to make connections could 
be EXECUTE. Another option would be to create a new Permission, say CONNECT. 
For completeness, we could consider to make the ConnectionResource available 
for GRANT/REVOKE as well. That would allow us to give a user permission to 
connect to JMX only, but not the native interface. But we should probably leave 
also this out of the picture for now.

*Existing yaml config*
 One option is to make the filter mechanism pluggable and keep both yaml-based 
and CQL-based filters in the code base. The downside is that the yaml config 
and existing nodetool commands don't fit well with the CQL approach, and the 
dynamics of the proposed CQL commands don't play well with the existing 
filter. IMO it is best to drop the yaml/nodetool config options.

I'd be interested in working on this, but first I'd like to understand whether 
there is interest in it and whether someone is willing to review. I'm planning 
to create unit tests and some performance figures along the way. Would we need 
dtests as well? And of course I could use some feedback on my thoughts above.

> Manage audit whitelists with CQL
> 
>
> Key: CASSANDRA-14471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14471
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Per Otterström
>Priority: Major
>  Labels: audit, security
> Fix For: 4.0
>
>
> Since CASSANDRA-12151 is merged we have support for audit logs in Cassandra. 
> With this ticket I want to explore the idea of managing audit whitelists 
> using CQL.
>  I can think of a few different benefits compared to current yaml-based 
> whitelist/blacklist approach.
>  * Nodes would always be aligned - no risk that node configuration goes out of

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-06-14 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512316#comment-16512316
 ] 

Kurt Greaves commented on CASSANDRA-14423:
--

Got a patch & test for 3.11 
[here|https://github.com/apache/cassandra/compare/cassandra-3.11...kgreav:14423-3.11].
 I started on the wrong branch, so I will backport to 2.2 and 3.0 tomorrow...

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User defined/major compactions 
> still work, though it's not clear whether they add the resulting SSTable 
> back into the view, and either way they are not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that the SSTable count will differ from the 
> number of SSTables reported in the compaction buckets. In the below example 
> there are only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cass

[jira] [Created] (CASSANDRA-14521) With server-generated timestamps, INSERT after DELETE may not be applied

2018-06-14 Thread Julien (JIRA)
Julien created CASSANDRA-14521:
--

 Summary: With server-generated timestamps, INSERT after DELETE may 
not be applied
 Key: CASSANDRA-14521
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14521
 Project: Cassandra
  Issue Type: Bug
Reporter: Julien


We use server-generated timestamps for all requests because of CASSANDRA-14304.

The scenario is basically the following:
{code}
INSERT INTO mytable(id) VALUES ('1');
DELETE FROM mytable  WHERE id='1';
INSERT INTO mytable(id) VALUES ('1');
SELECT * FROM mytable WHERE id='1';
{code}

SELECT _sometimes_ does not return anything when the java driver has 
{{CassandraClientConnector.with(ServerSideTimestampGenerator.INSTANCE);}} and 
the Cassandra cluster has 3 nodes and replication-factor:3.

This scenario actually works as expected with plain CQL, though that may be 
because I don't know how to force the use of server-generated timestamps with 
CQL. Is it possible?

It also works correctly with a single Cassandra node.
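One plausible mechanism (an assumption on my part, not confirmed in this 
ticket) is a timestamp tie: Cassandra reconciles a live cell against a 
tombstone by write timestamp, and on a tie the tombstone wins, so if the 
coordinator assigns the later INSERT a timestamp equal to (or earlier than) 
the DELETE's, the row stays deleted. A minimal Python sketch of that 
reconciliation rule:

```python
# Illustrative sketch (not Cassandra code). A cell is modelled as
# (timestamp, is_tombstone); the higher timestamp wins, and on an
# equal timestamp the deletion shadows the write.

def reconcile(cell_a, cell_b):
    """Return the winning cell; tombstones win timestamp ties."""
    if cell_a[0] != cell_b[0]:
        return max(cell_a, cell_b, key=lambda c: c[0])
    return cell_a if cell_a[1] else cell_b

delete = (1000, True)    # tombstone from the DELETE
insert = (1000, False)   # live cell from the later INSERT, same timestamp

assert reconcile(delete, insert) == (1000, True)   # row still reads deleted
assert reconcile((999, True), (1000, False)) == (1000, False)  # later INSERT wins
```

With client-generated monotonic timestamps the second case always holds, 
which would explain why the problem only appears with the server-side 
timestamp generator.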



