[jira] [Updated] (CASSANDRA-14444) Got NPE when querying Cassandra 3.11.2

2018-05-21 Thread Kurt Greaves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-14444:
-
Description: 
We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2.

After upgrading, we immediately got exceptions in Cassandra like this one:
{code}
ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 
QueryMessage.java:129 - Unexpected error during query
java.lang.NullPointerException: null
at 
org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) 
~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
 ~[apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.11.2.jar:3.11.2]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.11.2.jar:3.11.2]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_171]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.11.2.jar:3.11.2]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.11.2.jar:3.11.2]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
{code}
 

The table schema is like:
{code}
CREATE TABLE example.example_table (
 id bigint,
 hash text,
 json text,
 PRIMARY KEY (id, hash)
) WITH COMPACT STORAGE
{code}
 

The query is something like:
{code}
"select * from example.example_table;" // (We do know this is bad practise, and 
we are trying to fix that right now)
{code}
with a fetch size of 200, using the DataStax Java driver.
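
For context, this is roughly how such a paged read is issued with the DataStax
Java driver 3.x; a hedged sketch, assuming driver 3.x and an illustrative
contact point, not our actual client code:
{code:java}
import com.datastax.driver.core.*;

// Hypothetical reproduction of the client side; only the statement and the
// fetch size mirror the report above, the rest is illustrative boilerplate.
public class PagedScan {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            Statement stmt = new SimpleStatement("SELECT * FROM example.example_table")
                                 .setFetchSize(200); // server returns 200 rows per page
            for (Row row : session.execute(stmt)) {
                // Requesting the second page sends the PagingState back to the
                // server, which is where the NPE above fires on 3.11.2.
                System.out.println(row.getLong("id"));
            }
        }
    }
}
{code}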

This table contains about 20k rows. 

 

The fix is quite simple:

 
{code}
--- a/src/java/org/apache/cassandra/service/pager/PagingState.java
+++ b/src/java/org/apache/cassandra/service/pager/PagingState.java
@@ -46,7 +46,7 @@ public class PagingState

 public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, int remainingInPartition)
 {
- this.partitionKey = partitionKey;
+ this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : partitionKey;
 this.rowMark = rowMark;
 this.remaining = remaining;
 this.remainingInPartition = remainingInPartition;
{code}
 

"partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : partitionKey;" is in 
2.2.6 and 2.2.8. But it was removed for some reason. 

The interesting part is that we have:
{code}
public final ByteBuffer partitionKey; // Can be null for single partition queries.
{code}
It seems "partitionKey" can indeed be null.
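
To spell out the failure mode (a minimal, self-contained sketch with stand-in
names, not the real Cassandra classes): the first page of a range query carries
no partition key, so the paging state can round-trip a null that later reaches
the partitioner. The 2.2.x guard normalizes it:
{code:java}
import java.nio.ByteBuffer;

// Simplified stand-in for PagingState; class and field names are illustrative.
final class PagingStateSketch {
    static final ByteBuffer EMPTY = ByteBuffer.allocate(0);
    final ByteBuffer partitionKey; // null for the first page of a range query

    PagingStateSketch(ByteBuffer partitionKey) {
        // 2.2.x-style guard: never let a null key survive into the state,
        // where it would later reach RandomPartitioner.decorateKey() and NPE.
        this.partitionKey = partitionKey == null ? EMPTY : partitionKey;
    }
}
{code}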

Thanks a lot. 

 

 

 

[jira] [Updated] (CASSANDRA-14460) ERROR : java.lang.AssertionError: null

2018-05-21 Thread Kurt Greaves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-14460:
-
Description: 
When I tried to ADD a column to an existing table, I got the error below.

{code}
WARN [MutationStage-48] 2018-02-15 09:42:27,696 
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
Thread[MutationStage-48,5,main]: {}
 java.lang.AssertionError: null
 at io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at io.netty.util.Recycler.recycle(Recycler.java:141) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:839) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1092) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:779)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
 ~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:249) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:585) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:462) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:232) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1416) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2640)
 ~[apache-cassandra-3.10.jar:3.10]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
 [apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
 at java.lang.Thread.run(Thread.java:745) 
 [na:1.8.0_121]
{code}
How do I fix this issue? Why did it pop up? Any pointers or workarounds are appreciated!


[jira] [Created] (CASSANDRA-14460) ERROR : java.lang.AssertionError: null

2018-05-21 Thread Mutharasan Anbarasan (JIRA)
Mutharasan Anbarasan created CASSANDRA-14460:


 Summary: ERROR : java.lang.AssertionError: null
 Key: CASSANDRA-14460
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14460
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
Reporter: Mutharasan Anbarasan
 Fix For: 3.10


When I tried to ADD a column to an existing table, I got the error below.

{code}
WARN [MutationStage-48] 2018-02-15 09:42:27,696 
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
Thread[MutationStage-48,5,main]: {}
java.lang.AssertionError: null
 at io.netty.util.Recycler$WeakOrderQueue.<init>(Recycler.java:225) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:180) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at io.netty.util.Recycler.recycle(Recycler.java:141) 
~[netty-all-4.0.39.Final.jar:4.0.39.Final]
 at org.apache.cassandra.utils.btree.BTree$Builder.recycle(BTree.java:839) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.utils.btree.BTree$Builder.build(BTree.java:1092) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.build(PartitionUpdate.java:587)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.maybeBuild(PartitionUpdate.java:577)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate.holder(PartitionUpdate.java:388)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:177)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.AbstractBTreePartition.unfilteredIterator(AbstractBTreePartition.java:172)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:779)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:393)
 ~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:249) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:585) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:462) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:232) 
~[apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1416) 
~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2640)
 ~[apache-cassandra-3.10.jar:3.10]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 ~[apache-cassandra-3.10.jar:3.10]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
 [apache-cassandra-3.10.jar:3.10]
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
 at java.lang.Thread.run(Thread.java:745) 
[na:1.8.0_121]
{code}

How do I fix this issue? Why did it pop up? Any pointers or workarounds are appreciated!




[jira] [Commented] (CASSANDRA-14298) cqlshlib tests broken on b.a.o

2018-05-21 Thread Patrick Bannister (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483150#comment-16483150
 ] 

Patrick Bannister commented on CASSANDRA-14298:
---

I think I need to retract my recommendation to use LC_CTYPE=C.UTF-8. I learned 
this weekend that the C.UTF-8 locale is somewhat specific to Debian. (It's also 
available on more recent versions of Fedora, as an optional add-on.)

I recommended it initially because it's more internationalization friendly than 
picking a single language such as en_US.UTF-8. Unfortunately, since it's 
specific to the Debian family, I think that makes it a poor choice for testing.

For lack of a better solution, I recommend we use LC_CTYPE=en_US.UTF-8.

Also, I'm working on standing up a RHEL 7.5 instance on AWS to test my work in
a different environment, to make sure there aren't more hidden environmental
dependencies like this.

Separately, as an update on the cqlshlib porting work: my forks of cassandra
and cassandra-dtest have cqlshlib3 branches with cqlshlib ported to straight
Python 3. All cqlshlib unittests and all dtest cqlsh_tests pass, except for
test_describe (in test_cqlsh_output.py in the cqlshlib unit tests) and
test_unusual_dates (in cqlsh_tests.py in the dtests). I still want to try to
measure coverage (not sure how that's going to work with the dtests, but it
should be doable with the unittests), and I definitely want to test these on
RHEL or some other non-Debian environment; I'll continue with that work this
week.

 

> cqlshlib tests broken on b.a.o
> --
>
> Key: CASSANDRA-14298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build, Testing
>Reporter: Stefan Podkowinski
>Assignee: Patrick Bannister
>Priority: Major
>  Labels: cqlsh, dtest
> Attachments: CASSANDRA-14298-old.txt, CASSANDRA-14298.txt, 
> cqlsh_tests_notes.md
>
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 






[jira] [Assigned] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-05-21 Thread Vinay Chella (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella reassigned CASSANDRA-14459:


Assignee: Joseph Lynch

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
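
A rough, self-contained sketch of the proposed three-state hint (all names
below are illustrative, not the actual snitch API): {{MAYBE}} samples only seed
a host's score while we hold almost no real measurements for it.
{code:java}
import java.net.InetAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only; not the real DynamicEndpointSnitch code.
public class SnitchHintSketch {
    enum LatencyUsage { YES, NO, MAYBE }

    private final Map<InetAddress, Integer> sampleCounts = new ConcurrentHashMap<>();

    void receiveTiming(InetAddress host, long latencyMicros, LatencyUsage usage) {
        int samples = sampleCounts.getOrDefault(host, 0);
        // A MAYBE sample (e.g. an EchoMessage/PingMessage round trip) counts only
        // while we hold one or fewer real samples, i.e. right after restart/reset.
        if (usage == LatencyUsage.YES || (usage == LatencyUsage.MAYBE && samples <= 1))
            record(host, latencyMicros);
    }

    private void record(InetAddress host, long latencyMicros) {
        sampleCounts.merge(host, 1, Integer::sum);
        // ...feed the host's decaying latency reservoir here...
    }
}
{code}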






[jira] [Comment Edited] (CASSANDRA-14358) OutboundTcpConnection can hang for many minutes when nodes restart

2018-05-21 Thread Joseph Lynch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482919#comment-16482919
 ] 

Joseph Lynch edited comment on CASSANDRA-14358 at 5/21/18 7:27 PM:
---

[~alienth] that is interesting and thank you for digging so deeply! If I 
understand correctly during a {{drain}} the other servers are responsible for 
noticing the change and closing their connections within the 
{{[shutdown_announce_in_ms|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/Gossiper.java#L1497]}}
 period in 
[response|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/GossipShutdownVerbHandler.java#L37]
 to the {{GOSSIP_SHUTDOWN}} gossip state, and then the 
{{[markAsShutdown|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/Gossiper.java#L363-L373]}}
 method marks it down and forcibly convicts it. I believe that the TCP 
connections get closed via the {{StorageService}}'s {{onDead}} method which 
calls 
{{[onDead|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L2514]}}
 which calls 
{{[MessagingService::reset|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/MessagingService.java#L505]}}
 which calls 
{{[OutboundTcpConnection::closeSocket|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java#L80],
 which [enqueues a 
sentinel|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L210]}}
 into the backlog and then the 
{{[OutboundTcpConnection::run|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L253]}}
 method is actually supposed to close it. The {{drainedMessages}} queue is a 
local reference though so backlog could get something that was enqueued before 
the {{CLOSE_SENTINEL}} and after it as well. This seems very racy to me, in 
particular the reconnection logic might race with the closing logic from what I 
can tell as we have a 2 second window between when the clients start closing 
and when the server will actually stop accepting new connections (because it 
closes the listeners).

Non-stateful networks would surface the RST in the {{writeConnected}} method, 
but AWS is like "yea that machine isn't allowed to talk to that one" and just 
blackholes the RSTs... I wonder if I can reproduce this by increasing that 
window significantly and just sending lots of traffic.
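
To illustrate the suspected race (a toy model under stated assumptions, not the
actual OutboundTcpConnection code): once the consumer drains the backlog into a
local collection, anything enqueued concurrently, including around the
sentinel, is invisible to the current iteration.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the drain-vs-close race; names mirror the discussion above.
public class DrainRaceSketch {
    static final Object CLOSE_SENTINEL = new Object();
    final BlockingQueue<Object> backlog = new LinkedBlockingQueue<>();

    void runOnce() {
        List<Object> drainedMessages = new ArrayList<>();
        backlog.drainTo(drainedMessages); // local snapshot of the queue
        for (Object m : drainedMessages) {
            if (m == CLOSE_SENTINEL) { closeSocket(); continue; }
            send(m); // a message enqueued after the drain, even if logically
                     // "before" the close, waits for a later iteration
        }
    }

    void closeSocket() { /* ... */ }
    void send(Object m) { /* ... */ }
}
{code}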


[jira] [Commented] (CASSANDRA-14358) OutboundTcpConnection can hang for many minutes when nodes restart

2018-05-21 Thread Joseph Lynch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482919#comment-16482919
 ] 

Joseph Lynch commented on CASSANDRA-14358:
--

[~alienth] that is interesting and thank you for digging so deeply! If I 
understand correctly during a {{drain}} the other servers are responsible for 
noticing the change and closing their connections within the 
{{[shutdown_announce_in_ms|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/Gossiper.java#L1497]}}
 period in 
[response|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/GossipShutdownVerbHandler.java#L37]
 to the {{GOSSIP_SHUTDOWN}} gossip state, and then the 
{{[markAsShutdown|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/gms/Gossiper.java#L363-L373]}}
 method marks it down and forcibly convicts it. I believe that the TCP 
connections get closed via the {{StorageService}}'s {{onDead}} method which 
calls 
{{[onDead|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L2514]}}
 which calls 
{{[MessagingService::reset|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/MessagingService.java#L505]}}
 which calls 
{{[OutboundTcpConnection::closeSocket|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java#L80],
 which [enqueues a 
sentinel|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L210]}}
 into the backlog and then the 
{{[OutboundTcpConnection::run|https://github.com/apache/cassandra/blob/06b3521acdb21dd3d85902d59146b9d08ad7d752/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L253]}}
 method is actually supposed to close it. The {{drainedMessages}} queue is a 
local reference though so backlog could get something that was enqueued before 
the {{CLOSE_SENTINEL}} and after it as well. This seems very racy to me, in 
particular the reconnection logic might race with the closing logic from what I 
can tell as we have a 2 second window between when the clients start closing 
and when the server will actually stop accepting new connections (because it 
closes the listeners).

Non-stateful networks would surface the RST in the {{writeConnected}} method, 
but AWS is like "yea that machine isn't allowed to talk to that one" and just 
blackholes the RSTs...

> OutboundTcpConnection can hang for many minutes when nodes restart
> --
>
> Key: CASSANDRA-14358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14358
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Cassandra 2.1.19 (also reproduced on 3.0.15), running 
> with {{internode_encryption: all}} and the EC2 multi region snitch on Linux 
> 4.13 within the same AWS region. The smallest cluster I've seen the problem on
> is 12 nodes; it reproduces more reliably on 40+ nodes, and 300-node clusters
> consistently reproduce it on at least one node.
> So all the connections are SSL and we're connecting on the internal ip 
> addresses (not the public endpoint ones).
> Potentially relevant sysctls:
> {noformat}
> /proc/sys/net/ipv4/tcp_syn_retries = 2
> /proc/sys/net/ipv4/tcp_synack_retries = 5
> /proc/sys/net/ipv4/tcp_keepalive_time = 7200
> /proc/sys/net/ipv4/tcp_keepalive_probes = 9
> /proc/sys/net/ipv4/tcp_keepalive_intvl = 75
> /proc/sys/net/ipv4/tcp_retries2 = 15
> {noformat}
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
> Attachments: 10 Minute Partition.pdf
>
>
> edit summary: This primarily impacts networks with stateful firewalls such as 
> AWS. I'm working on a proper patch for trunk but unfortunately it relies on 
> the Netty refactor in 4.0 so it will be hard to backport to previous 
> versions. A workaround for earlier versions is to set the 
> {{net.ipv4.tcp_retries2}} sysctl to ~5. This can be done with the following:
> {code:java}
> $ cat /etc/sysctl.d/20-cassandra-tuning.conf
> net.ipv4.tcp_retries2=5
> $ # Reload all sysctls
> $ sysctl --system{code}
> Original Bug Report:
> I've been trying to debug nodes not being able to see each other during 
> longer (~5 minute+) Cassandra restarts in 3.0.x and 2.1.x which can 
> contribute to {{UnavailableExceptions}} during rolling restarts of 3.0.x and 
> 2.1.x clusters for us. I think I finally have a lead. It appears that prior 
> to trunk (with the awesome Netty refactor) we do not set socket 

[jira] [Updated] (CASSANDRA-14358) OutboundTcpConnection can hang for many minutes when nodes restart

2018-05-21 Thread Joseph Lynch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14358:
-
Description: 
edit summary: This primarily impacts networks with stateful firewalls such as 
AWS. I'm working on a proper patch for trunk but unfortunately it relies on the 
Netty refactor in 4.0 so it will be hard to backport to previous versions. A 
workaround for earlier versions is to set the {{net.ipv4.tcp_retries2}} sysctl 
to ~5. This can be done with the following:
{code:java}
$ cat /etc/sysctl.d/20-cassandra-tuning.conf
net.ipv4.tcp_retries2=5
$ # Reload all sysctls
$ sysctl --system{code}
Original Bug Report:

I've been trying to debug nodes not being able to see each other during longer 
(~5 minute+) Cassandra restarts in 3.0.x and 2.1.x which can contribute to 
{{UnavailableExceptions}} during rolling restarts of 3.0.x and 2.1.x clusters 
for us. I think I finally have a lead. It appears that prior to trunk (with the 
awesome Netty refactor) we do not set socket connect timeouts on SSL 
connections (in 2.1.x, 3.0.x, or 3.11.x) nor do we set {{SO_TIMEOUT}} as far as 
I can tell on outbound connections either. I believe that this means that we 
could potentially block forever on {{connect}} or {{recv}} syscalls, and we 
could block forever on the SSL Handshake as well. I think that the OS will 
protect us somewhat (and that may be what's causing the eventual timeout) but I 
think that given the right network conditions our {{OutboundTCPConnection}} 
threads can just be stuck never making any progress until the OS intervenes.

I have attached some logs of such a network partition during a rolling restart 
where an old node in the cluster has a completely foobarred 
{{OutboundTcpConnection}} for ~10 minutes before finally getting a 
{{java.net.SocketException: Connection timed out (Write failed)}} and 
immediately successfully reconnecting. I conclude that the old node is the 
problem because the new node (the one that restarted) is sending ECHOs to the 
old node, and the old node is sending ECHOs and REQUEST_RESPONSES to the new 
node's ECHOs, but the new node never receives them. This appears, to 
me, to indicate that the old node's {{OutboundTcpConnection}} thread is just 
stuck and can't make any forward progress. By the time we could notice this and 
slap TRACE logging on, the only thing we see is ~10 minutes later a 
{{SocketException}} inside {{writeConnected}}'s flush and an immediate 
recovery. It is interesting to me that the exception happens in 
{{writeConnected}} and it's a _connection timeout_ (and since we see {{Write 
failed}} I believe that this can't be a connection reset), because my 
understanding is that we should have a fully handshaked SSL connection at that 
point in the code.

Current theory:
 # "New" node restarts,  "Old" node calls 
[newSocket|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L433]
 # Old node starts [creating a 
new|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java#L141]
 SSL socket 
 # SSLSocket calls 
[createSocket|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/security/SSLFactory.java#L98],
 which conveniently calls connect with a default timeout of "forever". We could 
hang here forever until the OS kills us.
 # If we continue, we get to 
[writeConnected|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L263]
 which eventually calls 
[flush|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L341]
 on the output stream and also can hang forever. I think the probability is 
especially high when a node is restarting and is overwhelmed with SSL 
handshakes and such.

I don't fully understand the attached traceback as it appears we are getting a 
{{Connection Timeout}} from a {{send}} failure (my understanding is you can 
only get a connection timeout prior to a send), but I think it's reasonable 
that we have a timeout configuration issue. I'd like to try to make Cassandra 
robust to networking issues like this via maybe:
 # Change the {{SSLSocket}} {{getSocket}} methods to provide connection 
timeouts of 2s (equivalent to trunk's 
[timeout|https://github.com/apache/cassandra/blob/11496039fb18bb45407246602e31740c56d28157/src/java/org/apache/cassandra/net/async/NettyFactory.java#L329])
 # Appropriately set recv timeouts via {{SO_TIMEOUT}}, maybe something like 2 
minutes (in old versions via 
[setSoTimeout|https://docs.oracle.com/javase/8/docs/api/java/net/Socket.html#setSoTimeout-int-],
 in trunk via 
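
For reference, the first two proposals map onto standard {{java.net.Socket}}
knobs; a minimal sketch, with the address and the exact values being
illustrative:
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: bound both the connect() syscall and blocking reads.
public class TimeoutSketch {
    static Socket openWithTimeouts(String host, int port) throws IOException {
        Socket socket = new Socket();
        socket.connect(new InetSocketAddress(host, port), 2_000); // 2s connect timeout
        socket.setSoTimeout(120_000); // SO_TIMEOUT: cap blocking reads at 2 minutes
        return socket;
    }
}
{code}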

[jira] [Commented] (CASSANDRA-14457) Add a virtual table with current compactions

2018-05-21 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482546#comment-16482546
 ] 

Aleksey Yeschenko commented on CASSANDRA-14457:
---

As for 3/5, I’m thinking (keyspace, table, id) - so that you can do a SELECT by 
keyspace, without the table or ALLOW FILTERING. They’d still be equally close 
together in cqlsh.
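
A hedged sketch of that layout (schema and column names are purely
illustrative, not a committed design): with keyspace as the partition key,
keyspace alone is a legal restriction.
{code}
-- Illustrative only; not the actual virtual table schema.
CREATE TABLE system_views.compactions (
    keyspace_name text,
    table_name    text,
    id            uuid,
    progress      bigint,
    PRIMARY KEY (keyspace_name, table_name, id)
);

-- SELECT by keyspace alone, with no table restriction and no ALLOW FILTERING:
SELECT * FROM system_views.compactions WHERE keyspace_name = 'my_ks';
{code}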

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>







[jira] [Commented] (CASSANDRA-14457) Add a virtual table with current compactions

2018-05-21 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482541#comment-16482541
 ] 

Aleksey Yeschenko commented on CASSANDRA-14457:
---

bq. Can we just use "undefined" for summary redistribution with changing it to 
be part of key?

We could use a sentinel like that, so long as it's something that isn't a legal 
keyspace/table name. Think 'all keyspaces' and 'all tables', with a space 
in-between. But I'm not sure we should even list it there, or that it should 
have ever been a compaction type in the first place.

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>







[jira] [Commented] (CASSANDRA-14457) Add a virtual table with current compactions

2018-05-21 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482532#comment-16482532
 ] 

Chris Lohfink commented on CASSANDRA-14457:
---

3/5: ((keyspace, table), id) would solve the issue. I had concatenated the
keyspace/table together because columns are listed alphabetically in cqlsh, so
having them on opposite sides of the row was hard to read. So I will definitely
go with that.

Can we just use "undefined" for summary redistribution when changing
keyspace/table to be part of the key?

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>










[jira] [Commented] (CASSANDRA-13981) Enable Cassandra for Persistent Memory

2018-05-21 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482488#comment-16482488
 ] 

Jason Brown commented on CASSANDRA-13981:
-

Thanks, [~pree] and [~shylaja.koko...@intel.com], for the patches. I've been 
reading them, understanding the scope of the technology, and see the direction 
you are going. However, I'd like to propose a slightly different direction.

Stepping back, the pcj library is divided into two parts: the higher-level pcj 
components (as used in the previously posted versions of this patch), and the 
lower-level API, called LLPL in the library. LLPL is much smaller than the pcj 
parts, and offers a direct and simple way to just write bytes into a backing 
array from the persistent memory. In my opinion, this will be far more natural 
for the cassandra community and developers, and provides more direct access 
to the storage bytes. We already have lots of serialization code, and we 
understand that quite well; thus I'd like to keep leveraging that lower-level 
thinking. We will need to write custom, non-generic data structures (like we 
already have for our LSM-based engine), but I only see this as a complete win. We 
need to optimize, in every way we reasonably can, our data structures as we are 
a database, after all. LLPL has some rough edges wrt code optimization and we 
will want to modify the transaction model a bit, but I suspect the pcj authors 
will work with us toward that end.

With this as background, I've started sketching out a direction I think we 
should pursue. This sketch primarily shows the direction for thinking about 
serialization and memory allocation using LLPL. DISCLAIMER: this code doesn't 
compile, is not syntactically correct, and is wholly incomplete. It should be 
thought of as a loose blueprint (sketch!) for discussion.

The sketch comprises the following concepts:
 - thread per sub-range (to reduce lock contention in the data structures). 
This is kinda inspired by the thread-per-core notion, but on a smaller scale. 
({{TreeManager}} in this patch is a rudimentary dispatch class.)
 - how partitions should be stored: allocate a {{MemoryRegion}} from the LLPL
allocator, wrap it with a {{DataOutputPlus}}, and write as we normally would (a
rough illustration follows this list).
 - rough implementations of the data structures for the primary index and 
storing rows. A longer treatment of this topic will be in the design doc (see 
below), but using a tree for the primary index (for partition look up) and then 
a map for the cql rows is the basic idea. I mostly want to show the ideas 
around serialization so I didn't actually implement the index nor the map - 
except for the leaf/entry nodes which show how the serailization/data layout 
fits into the data structure.
 - explicitly pass the transaction around on writes (instead of looking for it 
in a {{ThreadLocal}}, as the pcj transactions do).
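
To make the storage and transaction bullets above concrete, here is a compact 
illustration of the shape I mean (separate from the linked branch below). Loud 
assumptions: the {{Heap}}/{{MemoryBlock}}/{{Transaction}} names follow the 
open-sourced LLPL API and may not match the pcj-internal LLPL version discussed 
here; {{serialize()}} is a plain java.io stand-in for {{DataOutputPlus}}; and 
sub-range dispatch is boiled down to a modulo over single-threaded executors:

{code:java}
// Hedged sketch only: LLPL names (Heap, MemoryBlock, Transaction) are assumed
// from the open-sourced library and may differ from the version discussed here.
import com.intel.pmem.llpl.Heap;
import com.intel.pmem.llpl.MemoryBlock;
import com.intel.pmem.llpl.Transaction;

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PmemPartitionStoreSketch
{
    private final Heap heap;
    private final ExecutorService[] subRangeExecutors; // one thread per token sub-range

    public PmemPartitionStoreSketch(String pmemPath, long heapSize, int subRanges)
    {
        this.heap = Heap.createHeap(pmemPath, heapSize);
        this.subRangeExecutors = new ExecutorService[subRanges];
        for (int i = 0; i < subRanges; i++)
            subRangeExecutors[i] = Executors.newSingleThreadExecutor();
    }

    // Dispatch the write to the single thread owning this token's sub-range,
    // so the per-sub-range index structures never need locks.
    public void write(long token, byte[] partitionKey, byte[] rowBytes)
    {
        int subRange = (int) Math.floorMod(token, (long) subRangeExecutors.length);
        subRangeExecutors[subRange].execute(() -> writeInternal(partitionKey, rowBytes));
    }

    private void writeInternal(byte[] partitionKey, byte[] rowBytes)
    {
        // Serialize as we normally would (DataOutputPlus in real code).
        byte[] serialized = serialize(partitionKey, rowBytes);

        // The transaction is an explicit lexical scope, not a ThreadLocal lookup.
        Transaction.create(heap, () -> {
            MemoryBlock block = heap.allocateMemoryBlock(serialized.length, true);
            block.copyFromArray(serialized, 0, 0, serialized.length);
            long handle = block.handle(); // the primary-index tree would store this
        });
    }

    // Plain java.io stand-in for writing through DataOutputPlus.
    private static byte[] serialize(byte[] partitionKey, byte[] rowBytes)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            out.writeShort(partitionKey.length);
            out.write(partitionKey);
            out.writeInt(rowBytes.length);
            out.write(rowBytes);
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
        return bytes.toByteArray();
    }
}
{code}

The point is the shape, not the details: serialization stays byte-oriented, the 
transaction is an explicit lexical scope, and each sub-range's data structures 
are confined to a single thread.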

||13981-sketch-1||
|[branch|https://github.com/jasobrown/cassandra/tree/13981-sketch-1]|

I am proposing this sketch as a starting point for discussion, along with a 
forthcoming design doc to help us work out more high-level details of how 
cassandra as a main memory database should look. I'm working on the design doc now. 
It will explore how we can have a pluggable storage engine implementation that 
allows cassandra to run as a main memory database using persistent memory, 
while supporting the existing behaviors of cassandra in that kind of system.

> Enable Cassandra for Persistent Memory 
> ---
>
> Key: CASSANDRA-13981
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13981
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Preetika Tyagi
>Assignee: Preetika Tyagi
>Priority: Major
> Fix For: 4.0
>
> Attachments: in-mem-cassandra-1.0.patch, in-mem-cassandra-2.0.patch, 
> readme.txt, readme2_0.txt
>
>
> Currently, Cassandra relies on disks for data storage and hence it needs data 
> serialization, compaction, bloom filters and partition summary/index for 
> speedy access to the data. However, with persistent memory, data can be 
> stored directly in the form of Java objects and collections, which can 
> greatly simplify the retrieval mechanism of the data. What we are proposing 
> is to make use of faster and scalable B+ tree-based data collections built 
> for persistent memory in Java (PCJ: https://github.com/pmem/pcj) and enable a 
> complete in-memory version of Cassandra, while still keeping the data 
> persistent.






[jira] [Commented] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)

2018-05-21 Thread Per Otterström (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482467#comment-16482467
 ] 

Per Otterström commented on CASSANDRA-14223:


[~jasobrown], I'm trying to understand your concern with blocking I/O. The only 
scenario I can think of is that several clients connect simultaneously and 
thereby allocate (and block) so many threads that already-active connections 
don't get enough threads to execute requests. Not sure if that's the issue, 
though. Can you elaborate a bit?

In your patch there is a comment on {{SSLSessionValidator.validate()}} that got 
me confused: "This function should not block!" I thought the point of having 
this separation was to allow the validator to block?

If I wanted to implement hostname validation using a custom 
{{SSLSessionValidator}}, I think we would need to change the signature of the 
{{validate()}} method to {{boolean validate(SocketChannel)}}. This change would 
obviously cascade to other places as well. I don't think it is possible to pull 
out remote peer IP/port from a {{Channel}} object. Also, I would need to find 
some way to get information from the certificate to compare. Is there some 
clever way to do that?

bq. Perhaps another solution, a sort of middle ground, is to still make use of 
a custom TrustManager, but hand that to the netty SslContext, and then execute 
the TLS handshake in the netty pipeline on a different event loop group from 
the rest of the pipeline. 

That seems like a more attractive approach, IMO.
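
For illustration, here is a rough netty sketch of that middle ground (assumed 
shapes only - the validating {{TrustManagerFactory}} passed in is a placeholder 
for whatever performs the hostname/revocation checks):

{code:java}
// Rough sketch of the middle-ground approach; the custom TrustManagerFactory
// is a placeholder for whatever performs hostname/revocation validation.
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.ssl.ClientAuth;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;
import io.netty.util.concurrent.DefaultEventExecutorGroup;
import io.netty.util.concurrent.EventExecutorGroup;

import javax.net.ssl.TrustManagerFactory;
import java.io.File;

public class ValidatingChannelInitializer extends ChannelInitializer<SocketChannel>
{
    // Dedicated threads for TLS handshakes; the pool size is arbitrary here.
    private static final EventExecutorGroup TLS_GROUP = new DefaultEventExecutorGroup(4);

    private final SslContext sslContext;

    public ValidatingChannelInitializer(File certChain, File privateKey,
                                        TrustManagerFactory validatingTmf) throws Exception
    {
        // The (possibly blocking) validation lives inside the TrustManager,
        // which netty consults during the handshake.
        this.sslContext = SslContextBuilder.forServer(certChain, privateKey)
                                           .trustManager(validatingTmf)
                                           .clientAuth(ClientAuth.REQUIRE)
                                           .build();
    }

    @Override
    protected void initChannel(SocketChannel ch)
    {
        // Binding the SslHandler to its own executor group keeps any blocking
        // in the TrustManager off the I/O event loop threads.
        ch.pipeline().addLast(TLS_GROUP, "ssl", sslContext.newHandler(ch.alloc()));
        // ... the rest of the pipeline stays on the regular event loop.
    }
}
{code}

Since the {{SslHandler}} runs on its own {{EventExecutorGroup}}, a TrustManager 
that blocks during the handshake would only tie up those threads, not the event 
loop.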

> Provide ability to do custom certificate validations (e.g. hostname 
> validation, certificate revocation checks)
> --
>
> Key: CASSANDRA-14223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14223
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ron Blechman
>Priority: Major
>  Labels: security
> Fix For: 4.x
>
> Attachments: dsp.tar.gz
>
>
> Cassandra server should be able to do additional certificate validations, 
> such as hostname validation and certificate revocation checking against 
> CRLs and/or using OCSP. 
> One approach could be to have SSLFactory use SSLContext.getDefault() instead 
> of forcing the creation of a new SSLContext using SSLContext.getInstance(). 
> Using the default SSLContext would allow a user to plug in their own custom 
> SSLSocketFactory via the java.security properties file. The custom 
> SSLSocketFactory could create a default SSLContext that was customized to do 
> any extra validation such as certificate revocation, host name validation, 
> etc.






[jira] [Commented] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)

2018-05-21 Thread Per Otterström (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482416#comment-16482416
 ] 

Per Otterström commented on CASSANDRA-14223:


Attached dsp.tar.gz: a minimal security provider containing only a single 
service - a TrustManager with enforced hostname validation. There is a readme 
with some instructions on how to use it. [~ronblechman], based on what you 
described around your tests, I believe you should be able to install your own 
TrustManager in a similar way.

Bouncy Castle seems to support a similar setup: 
[http://www.bouncycastle.org/wiki/display/JA1/Provider+Installation]

What I like about this approach is that I can install and manage my security 
providers in the same way for all my Java-based applications.
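
For readers without the attachment, the general mechanism looks roughly like 
this. To be clear, this is an illustrative sketch, not the contents of 
dsp.tar.gz, and the class names are hypothetical: a {{java.security.Provider}} 
registers its own TrustManagerFactory SPI for the PKIX algorithm and wraps the 
stock SunJSSE trust manager with extra checks.

{code:java}
// Illustrative only - not the attached dsp.tar.gz. A provider that fronts the
// PKIX TrustManagerFactory and wraps the stock trust manager with extra checks.
import javax.net.ssl.*;
import java.security.*;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;

public final class ValidatingSecurityProvider extends Provider
{
    public ValidatingSecurityProvider()
    {
        super("ValidatingSecurityProvider", 1.0, "PKIX TrustManagerFactory with extra validation");
        // Claim the PKIX algorithm; when this provider is ranked first,
        // TrustManagerFactory.getInstance("PKIX") resolves to our SPI.
        put("TrustManagerFactory.PKIX", ValidatingTmfSpi.class.getName());
    }

    public static final class ValidatingTmfSpi extends TrustManagerFactorySpi
    {
        private TrustManagerFactory delegate;

        @Override
        protected void engineInit(KeyStore ks) throws KeyStoreException
        {
            try
            {
                // Ask SunJSSE by name so we don't resolve back to ourselves.
                delegate = TrustManagerFactory.getInstance("PKIX", "SunJSSE");
                delegate.init(ks);
            }
            catch (NoSuchAlgorithmException | NoSuchProviderException e)
            {
                throw new KeyStoreException(e);
            }
        }

        @Override
        protected void engineInit(ManagerFactoryParameters spec) throws InvalidAlgorithmParameterException
        {
            try
            {
                delegate = TrustManagerFactory.getInstance("PKIX", "SunJSSE");
                delegate.init(spec);
            }
            catch (NoSuchAlgorithmException | NoSuchProviderException e)
            {
                throw new InvalidAlgorithmParameterException(e);
            }
        }

        @Override
        protected TrustManager[] engineGetTrustManagers()
        {
            X509TrustManager stock = (X509TrustManager) delegate.getTrustManagers()[0];
            return new TrustManager[] { new X509TrustManager()
            {
                public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException
                {
                    stock.checkClientTrusted(chain, authType);
                    // extra validation (hostname, CRL/OCSP) would go here
                }

                public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException
                {
                    stock.checkServerTrusted(chain, authType);
                }

                public X509Certificate[] getAcceptedIssuers()
                {
                    return stock.getAcceptedIssuers();
                }
            }};
        }
    }
}
{code}

Activation is then either programmatic 
({{Security.insertProviderAt(new ValidatingSecurityProvider(), 1)}}) or static, 
by ranking the provider first among the {{security.provider.N}} entries in the 
java.security file.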

 

> Provide ability to do custom certificate validations (e.g. hostname 
> validation, certificate revocation checks)
> --
>
> Key: CASSANDRA-14223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14223
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ron Blechman
>Priority: Major
>  Labels: security
> Fix For: 4.x
>
> Attachments: dsp.tar.gz
>
>
> Cassandra server should be able to do additional certificate validations, 
> such as hostname validation and certificate revocation checking against 
> CRLs and/or using OCSP. 
> One approach could be to have SSLFactory use SSLContext.getDefault() instead 
> of forcing the creation of a new SSLContext using SSLContext.getInstance(). 
> Using the default SSLContext would allow a user to plug in their own custom 
> SSLSocketFactory via the java.security properties file. The custom 
> SSLSocketFactory could create a default SSLContext that was customized to do 
> any extra validation such as certificate revocation, host name validation, 
> etc.






[jira] [Updated] (CASSANDRA-14458) Add virtual table to list active connections

2018-05-21 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14458:
--
 Reviewer: Aleksey Yeschenko
Fix Version/s: 4.x

> Add virtual table to list active connections
> 
>
> Key: CASSANDRA-14458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14458
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>
> List all active connections in a virtual table, like:
> {code:sql}
> cqlsh:system> select * from system_views.clients ;
> 
>  client_address   | cipher    | driver_name | driver_version | keyspace | protocol  | requests | ssl   | user      | version
> ------------------+-----------+-------------+----------------+----------+-----------+----------+-------+-----------+---------
>  /127.0.0.1:63903 | undefined |   undefined |      undefined |          | undefined |       13 | False | anonymous |       4
>  /127.0.0.1:63904 | undefined |   undefined |      undefined |   system | undefined |       16 | False | anonymous |       4
>  {code}






[jira] [Commented] (CASSANDRA-14457) Add a virtual table with current compactions

2018-05-21 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482413#comment-16482413
 ] 

Aleksey Yeschenko commented on CASSANDRA-14457:
---

This looks good, so I only have some bikeshedding to contribute:

1. The table doesn't really represent compaction statistics, so we should 
probably not name it {{compaction_stats}}? I know that the nodetool cmd is named 
compactionstats, but that was a bad name too, imo. So perhaps 
{{compaction_state}} and {{CompactionStateTable}}, or {{compaction_status}} and 
{{CompactionStatusTable}}, or even {{active_compactions}} and 
{{ActiveCompactionsTable}}; whichever sounds nicer to your American ear.
2. Should we perhaps use {{int}} as the type for current and total columns, 
instead of {{text}}?
3. Maybe don't concat the names of the keyspace and the table? Why not have 
them in separate columns, for easier querying?
4. We tend to lowercase enums in tables, usually. Can you slap a 
{{toLowerCase()}} on {{task_type}} please?
5. I would prefer to have ((keyspace, table), id) or ((keyspace), table, id) as 
PRIMARY KEY here, personally (a rough CQL illustration follows below).
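
To illustrate suggestion 5, the rough shape I have in mind, written as CQL for 
readability (virtual tables are of course defined in code rather than via 
CREATE TABLE, and the column names here are placeholders, not the patch's 
actual schema):

{code:sql}
-- Illustrative shape only; column names are placeholders, not the patch's schema.
CREATE TABLE system_views.active_compactions (
    keyspace_name text,
    table_name text,
    id uuid,
    task_type text,   -- lowercased enum name (suggestion 4)
    current bigint,   -- numeric rather than text (suggestion 2)
    total bigint,
    PRIMARY KEY ((keyspace_name, table_name), id)
);
{code}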

Upon a quick look, it seems like the only case where we don't have a 
keyspace/table attached to a compaction is summary redistribution, which is 
performed on all sstables. But it's not really a compaction, so perhaps we 
should exclude it from the dataset?

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>







[jira] [Updated] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)

2018-05-21 Thread Per Otterström (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Per Otterström updated CASSANDRA-14223:
---
Attachment: dsp.tar.gz

> Provide ability to do custom certificate validations (e.g. hostname 
> validation, certificate revocation checks)
> --
>
> Key: CASSANDRA-14223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14223
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ron Blechman
>Priority: Major
>  Labels: security
> Fix For: 4.x
>
> Attachments: dsp.tar.gz
>
>
> Cassandra server should be able to do additional certificate validations, 
> such as hostname validation and certificate revocation checking against 
> CRLs and/or using OCSP. 
> One approach could be to have SSLFactory use SSLContext.getDefault() instead 
> of forcing the creation of a new SSLContext using SSLContext.getInstance(). 
> Using the default SSLContext would allow a user to plug in their own custom 
> SSLSocketFactory via the java.security properties file. The custom 
> SSLSocketFactory could create a default SSLContext that was customized to do 
> any extra validation such as certificate revocation, host name validation, 
> etc.






[jira] [Updated] (CASSANDRA-14457) Add a virtual table with current compactions

2018-05-21 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14457:
--
 Reviewer: Aleksey Yeschenko
Fix Version/s: 4.x

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.x
>
>



