date:20170926


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Kjellman updated CASSANDRA-13741:
-
Reviewer: Jason Brown

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to maven central . 
> Please replace in mainline the current version ( 1.3.0) with the latest one ( 
> 1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Kjellman updated CASSANDRA-13741:
-
Status: Patch Available  (was: Open)

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to maven central . 
> Please replace in mainline the current version ( 1.3.0) with the latest one ( 
> 1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182008#comment-16182008
 ] 

Michael Kjellman commented on CASSANDRA-13741:
--

I was able to get some time to do proper extended regression and performance 
testing of lz4-java 1.4.0. Things are looking good so far. I'd like to do some 
extended perf profiling to really validate this as it's such an important 
dependency but I've convinced myself that we should be comfortable with this to 
go into trunk at this point!

https://github.com/mkjellman/cassandra/tree/CASSANDRA-13741
https://github.com/mkjellman/cassandra/commit/bc7980b1c6f5306b4df491f71a4cfe6af89911f2

[~jasobrown] or [[~jjirsa] would you like to review this?

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to maven central . 
> Please replace in mainline the current version ( 1.3.0) with the latest one ( 
> 1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182003#comment-16182003
 ] 

Michael Kjellman commented on CASSANDRA-13304:
--

[~beobal] I've pushed up a trunk rebased squashed commit with your changes and 
more to address the additional comments on this ticket.

https://github.com/mkjellman/cassandra/tree/CASSANDRA-13304
https://github.com/mkjellman/cassandra/commit/8f66abcf93edc51173332c815babdc457472f62a

1) I added YAML option enable_checksumming_in_native_transport (defaults to 
true) which allows an operator to fully disable checksum support in the native 
protocol
2) Added a new option "CHECKSUMS", which expects the client to send a list of 
checksums to use. If this is empty or not provided by the client, we won't use 
checksums on the protocol

[~adutra] how does this look to you?

> Add checksumming to the native protocol
> ---
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>  Labels: client-impacting
> Attachments: 13304_v1.diff, boxplot-read-throughput.png, 
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This 
> makes it highly susceptible to silently inserting corrupted data either due 
> to hardware issues causing bit flips on the sender/client side, C*/receiver 
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both 
> client and server know about a protocol version that supports checksums) -- 
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *  1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  | Compressed Length (e1)/
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|Compressed Bytes (e1)+//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (e1) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |Compressed Length (e2) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |   Uncompressed Length (e2)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |CRC32 Checksum of Lengths (e2) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Compressed Bytes (e2)   +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (e2) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |Compressed Length (en) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |   Uncompressed Length (en)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |CRC32 Checksum of Lengths (en) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Compressed Bytes (en)  +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (en) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame 
> body itself (and doesn't actually checksum lengths and headers). While it 
> would be great to fully add checksuming across the entire protocol, the 
> proposed implementation will ensure we at least catch corrupted data and 
> likely protect ourselves pretty well anyways.
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor 
> implementation as it's been deprecated for a while -- is really slow and 
> crappy compared to LZ4 -- and we should do everything in our power to make 
> sure no one in the community is still using

[jira] [Commented] (CASSANDRA-13789) Reduce memory copies and object creations when acting on ByteBufs

2017-09-26 Thread Stefania (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181853#comment-16181853
 ] 

Stefania commented on CASSANDRA-13789:
--

This 
[change|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/CBUtil.java#L449]
 in {{CBUtil.writeValue()}} is a bit dangerous: {{writeValue()}} does not 
modify the source buffer, except it does modify it temporarily and in a thread 
unsafe way, with no documentation. If two threads ever wrote the same source 
buffer in parallel, maybe a buffer coming from memtables, this code would 
introduce a bug that would be very tedious to debug.

> Reduce memory copies and object creations when acting on  ByteBufs
> --
>
> Key: CASSANDRA-13789
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13789
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Norman Maurer
>Assignee: Norman Maurer
> Fix For: 4.0
>
> Attachments: 
> 0001-CBUtil.sizeOfLongString-encodes-String-to-byte-to-ca.patch, 
> 0001-Reduce-memory-copies-and-object-creations-when-actin.patch
>
>
> There are multiple "low-hanging-fruits" when it comes to reduce memory copies 
> and object allocations when acting on ByteBufs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13905) Correctly close netty channels when a stream session ends


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13905:

Status: Patch Available  (was: Open)

> Correctly close netty channels when a stream session ends
> -
>
> Key: CASSANDRA-13905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13905
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.x
>
>
> Netty channels in stream sessions were not being closed correctly. TL;DR I 
> was using a lambda that was not executing as it is lazily evaluated. This was 
> causing a {{RejectedExecutionException}} at the end of some streaming-related 
> dtests



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13899) Streaming of very large partition fails


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181837#comment-16181837
 ] 

Jason Brown commented on CASSANDRA-13899:
-

tbh, I don't think it's large partitions that necessarily triggered this bug; 
streaming just transfers the bytes of sstable Data file naively. I think it was 
something in the dataset and the way it compresses that exposed the bug. Thus, 
I'm not sure how to proceed with a dtest, especially one based on 
[~pauloricardomg]'s submitted patch (as highly useful as it is!).

> Streaming of very large partition fails 
> 
>
> Key: CASSANDRA-13899
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13899
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Jason Brown
> Fix For: 4.0
>
> Attachments: largepartition.yaml
>
>
> Streaming a single partition with ~100K rows fails with the following 
> exception:
> {noformat}
> ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 
> StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] 
> Streaming error occurred on session with peer 127.0.0.1
> org.apache.cassandra.streaming.StreamReceiveException: 
> java.lang.RuntimeException: Last written key 
> DecoratedKey(-1000328290821038380) >= current key 
> DecoratedKey(-1055007227842125139)  writing into 
> /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c
> 8e037/na-1-big-Data.db
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178)
>  ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {noformat}
> Reproduction steps:
>  * Create CCM cluster with 2 nodes
> * Start only first node, disable hinted handoff
>  * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user 
> profile=largepartition.yaml n=10K ops(insert=1) no-warmup -node whitelist 
> 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 -insert 
> visits=FIXED(100K) revisit=FIXED(100K)"}}
> * Start second node, run repair on {{stresscql}} table - the exception above 
> will be thrown.
> I investigated briefly and haven't found anything suspicious. This seems to 
> be related to CASSANDRA-12229 as I tested the steps above in a branch without 
> that and the repair completed successfully. I haven't tested with a smaller 
> number of rows per partition to see at which point it starts to be a problem.
> We should probably add a regression dtest to stream large partitions to catch 
> similar problems in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13899) Streaming of very large partition fails


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13899:

Status: Patch Available  (was: Open)

> Streaming of very large partition fails 
> 
>
> Key: CASSANDRA-13899
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13899
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Jason Brown
> Fix For: 4.0
>
> Attachments: largepartition.yaml
>
>
> Streaming a single partition with ~100K rows fails with the following 
> exception:
> {noformat}
> ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 
> StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] 
> Streaming error occurred on session with peer 127.0.0.1
> org.apache.cassandra.streaming.StreamReceiveException: 
> java.lang.RuntimeException: Last written key 
> DecoratedKey(-1000328290821038380) >= current key 
> DecoratedKey(-1055007227842125139)  writing into 
> /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c
> 8e037/na-1-big-Data.db
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178)
>  ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {noformat}
> Reproduction steps:
>  * Create CCM cluster with 2 nodes
> * Start only first node, disable hinted handoff
>  * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user 
> profile=largepartition.yaml n=10K ops(insert=1) no-warmup -node whitelist 
> 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 -insert 
> visits=FIXED(100K) revisit=FIXED(100K)"}}
> * Start second node, run repair on {{stresscql}} table - the exception above 
> will be thrown.
> I investigated briefly and haven't found anything suspicious. This seems to 
> be related to CASSANDRA-12229 as I tested the steps above in a branch without 
> that and the repair completed successfully. I haven't tested with a smaller 
> number of rows per partition to see at which point it starts to be a problem.
> We should probably add a regression dtest to stream large partitions to catch 
> similar problems in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13906:

Status: Patch Available  (was: Open)

> Properly close StreamCompressionInputStream to release any ByteBuf
> --
>
> Key: CASSANDRA-13906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13906
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jason Brown
>Assignee: Jason Brown
>
> When running dtests for trunk (4.x) that perform some streaming, sometimes a 
> {{ByteBuf}} is not released properly, and we get this error in the logs 
> (causing the dtest to fail):
> {code}
> ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 
> Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's 
> garbage-collected. Enable advanced leak reporting to find out where the leak 
> occurred. To enable advanced leak reporting, specify the JVM option 
> '-Dio.netty.leakDetection.level=advanced' or call 
> ResourceLeakDetector.setLevel() See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13907) Versioned documentation on cassandra.apache.org

2017-09-26 Thread Murukesh Mohanan (JIRA)

Murukesh Mohanan created CASSANDRA-13907:


 Summary: Versioned documentation on cassandra.apache.org
 Key: CASSANDRA-13907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13907
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation and Website
Reporter: Murukesh Mohanan
Priority: Minor


Services like https://readthedocs.org and http://www.javadoc.io/ make it easy 
to browse the documentation for a particular version or commit of various open 
source projects. It would be nice to be able to browse the docs for a 
particular release on http://cassandra.apache.org/doc/.

Currently it seems only CQL has this at http://cassandra.apache.org/doc/old/.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181752#comment-16181752
 ] 

Jason Brown commented on CASSANDRA-13906:
-

The cause of the leaked {{ByteBuf}} was due to a wrapping class not being 
closed, and thus never had the chance to call {{release()}} on the {{ByteBuf}}.

||trunk||
|[branch|https://github.com/jasobrown/cassandra/tree/netty-leak]|
|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/338/]|
|[utests|https://circleci.com/gh/jasobrown/cassandra/tree/netty-leak]|

I ran one of the failing dtests 
(repair_tests/incremental_repair_test.py:TestIncRepair.multiple_repair_test) on 
my laptop, and without the patch could reproduce within 2-3 runs. With the 
patch, I ran it fifty time and could not reproduce.

[~aweisberg] Would you mind reviewing?

> Properly close StreamCompressionInputStream to release any ByteBuf
> --
>
> Key: CASSANDRA-13906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13906
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jason Brown
>Assignee: Jason Brown
>
> When running dtests for trunk (4.x) that perform some streaming, sometimes a 
> {{ByteBuf}} is not released properly, and we get this error in the logs 
> (causing the dtest to fail):
> {code}
> ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 
> Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's 
> garbage-collected. Enable advanced leak reporting to find out where the leak 
> occurred. To enable advanced leak reporting, specify the JVM option 
> '-Dio.netty.leakDetection.level=advanced' or call 
> ResourceLeakDetector.setLevel() See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf

Jason Brown created CASSANDRA-13906:
---

 Summary: Properly close StreamCompressionInputStream to release 
any ByteBuf
 Key: CASSANDRA-13906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13906
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Brown
Assignee: Jason Brown


When running dtests for trunk (4.x) that perform some streaming, sometimes a 
{{ByteBuf}} is not released properly, and we get this error in the logs 
(causing the dtest to fail):

{code}
ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 
Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's 
garbage-collected. Enable advanced leak reporting to find out where the leak 
occurred. To enable advanced leak reporting, specify the JVM option 
'-Dio.netty.leakDetection.level=advanced' or call 
ResourceLeakDetector.setLevel() See 
http://netty.io/wiki/reference-counted-objects.html for more information.
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-26 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181708#comment-16181708
 ] 

Jeremy Hanna commented on CASSANDRA-13863:
--

Linking to CASSANDRA-10726 as one intent is to remove blocking read repairs.

> Speculative retry causes read repair even if read_repair_chance is 0.0.
> ---
>
> Key: CASSANDRA-13863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Hiro Wakabayashi
>
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> cause no read repair, but read repair happens with speculative retry. I think 
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> stop read repair completely because the user wants to stop read repair in 
> some cases.
> {panel:title=Case 1: TWCS users}
> The 
> [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance]
>  states how to disable read repair.
> {quote}While TWCS tries to minimize the impact of comingled data, users 
> should attempt to avoid this behavior. Specifically, users should avoid 
> queries that explicitly set the timestamp via CQL USING TIMESTAMP. 
> Additionally, users should run frequent repairs (which streams data in such a 
> way that it does not become comingled), and disable background read repair by 
> setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
> {quote}
> {panel}
> {panel:title=Case 2. Strict SLA for read latency}
> In a peak time, read latency is a key for us but, read repair causes latency 
> higher than no read repair. We can use anti entropy repair in off peak time 
> for consistency.
> {panel}
>  
> Here is my procedure to reproduce the problem.
> h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
> {noformat}
> $ ccm create -v 3.0.14 -n 3 cluster_3.0.14
> $ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: 
> true/hinted_handoff_enabled: false/' 
> ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> $ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" 
> ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> $ ccm start{noformat}
> h3. 2. Create a keyspace and a table.
> {noformat}
> $ ccm node1 cqlsh
> DROP KEYSPACE IF EXISTS ks1;
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE ks1.t1 (
> key text PRIMARY KEY,
> value blob
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = 'ALWAYS';
> QUIT;
> {noformat}
> h3. 3. Stop node2 and node3. Insert a row.
> {noformat}
> $ ccm node3 stop && ccm node2 stop && ccm status
> Cluster: 'cluster_3.0.14'
> --
> node1: UP
> node3: DOWN
> node2: DOWN
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 
> (key, value) values ('mmullass', bigintAsBlob(1));"
> Current consistency level is ONE.
> Now Tracing is enabled
> Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501
>  activity 
>| timestamp  | source| 
> source_elapsed
> -++---+
>   
> Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 | 
>  0
>  Parsing insert into ks1.t1 (key, value) values ('mmullass', 
> bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 
> 127.0.0.1 |   4323
>Preparing 
> statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.32 | 127.0.0.1 |
>5250
>  Determining replicas

[jira] [Commented] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181674#comment-16181674
 ] 

Michael Kjellman commented on CASSANDRA-13291:
--

[~snazy] let me know if you have a chance to look at this, or if you're busy or 
it's too paged out [~jasobrown] said he could finish up the remaining review 
here. thanks!

> Replace usages of MessageDigest with Guava's Hasher
> ---
>
> Key: CASSANDRA-13291
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
> Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across 
> threads being spent inside the MD5 MessageDigest implementation. Given that 
> there are tons of modern alternative hashing functions better than MD5 
> available -- both in terms of providing better collision resistance and 
> actual computational speed -- I wanted to switch out our usage of MD5 for 
> alternatives (like adler128 or murmur3_128) and test for performance 
> improvements.
> Unfortunately, I found given the fact we use MessageDigest everywhere --  
> switching out the hashing function to something like adler128 or murmur3_128 
> (for example) -- which don't ship with the JDK --  wasn't straight forward.
> The goal of this ticket is to propose switching out usages of MessageDigest 
> directly in favor of Hasher from Guava. This means going forward we can 
> change a single line of code to switch the hashing algorithm being used 
> (assuming there is an implementation in Guava).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181672#comment-16181672
 ] 

Michael Kjellman commented on CASSANDRA-13291:
--

Finally got some time to work on this again. Requested changes committed (along 
with a rebase of the patch to current trunk) available at:

https://github.com/mkjellman/cassandra/tree/CASSANDRA-13291
https://github.com/mkjellman/cassandra/commit/05e7b65258100e10b07bd556ab5f72133be9b74c

> Replace usages of MessageDigest with Guava's Hasher
> ---
>
> Key: CASSANDRA-13291
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13291
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
> Attachments: CASSANDRA-13291-trunk.diff
>
>
> During my profiling of C* I frequently see lots of aggregate time across 
> threads being spent inside the MD5 MessageDigest implementation. Given that 
> there are tons of modern alternative hashing functions better than MD5 
> available -- both in terms of providing better collision resistance and 
> actual computational speed -- I wanted to switch out our usage of MD5 for 
> alternatives (like adler128 or murmur3_128) and test for performance 
> improvements.
> Unfortunately, I found given the fact we use MessageDigest everywhere --  
> switching out the hashing function to something like adler128 or murmur3_128 
> (for example) -- which don't ship with the JDK --  wasn't straight forward.
> The goal of this ticket is to propose switching out usages of MessageDigest 
> directly in favor of Hasher from Guava. This means going forward we can 
> change a single line of code to switch the hashing algorithm being used 
> (assuming there is an implementation in Guava).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-12961) LCS needlessly checks for L0 STCS candidates multiple times

2017-09-26 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181671#comment-16181671
 ] 

Jeff Jirsa commented on CASSANDRA-12961:


I'm PRETTY SURE the failures are unrelated. I've asked [~jasobrown] to confirm 
just to be good citizens. Then I'll commit.


> LCS needlessly checks for L0 STCS candidates multiple times
> ---
>
> Key: CASSANDRA-12961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12961
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Vusal Ahmadoglu
>Priority: Trivial
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 
> 0001-CASSANDRA-12961-Moving-getSTCSInL0CompactionCandidat.patch
>
>
> It's very likely that the check for L0 STCS candidates (if L0 is falling 
> behind) can be moved outside of the loop, or at very least made so that it's 
> not called on each loop iteration:
> {code}
> for (int i = generations.length - 1; i > 0; i--)
> {
> List sstables = getLevel(i);
> if (sstables.isEmpty())
> continue; // mostly this just avoids polluting the debug log 
> with zero scores
> // we want to calculate score excluding compacting ones
> Set sstablesInLevel = Sets.newHashSet(sstables);
> Set remaining = Sets.difference(sstablesInLevel, 
> cfs.getTracker().getCompacting());
> double score = (double) SSTableReader.getTotalBytes(remaining) / 
> (double)maxBytesForLevel(i, maxSSTableSizeInBytes);
> logger.trace("Compaction score for level {} is {}", i, score);
> if (score > 1.001)
> {
> // before proceeding with a higher level, let's see if L0 is 
> far enough behind to warrant STCS
> CompactionCandidate l0Compaction = 
> getSTCSInL0CompactionCandidate();
> if (l0Compaction != null)
> return l0Compaction;
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored

2017-09-26 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181655#comment-16181655
 ] 

Jeff Jirsa edited comment on CASSANDRA-12106 at 9/26/17 10:04 PM:
--

Came up at NGCC that a number of different users are implementing something 
along these lines in an ad hoc sense. May be time to revisit this ticket. I 
don't know that we need to block this on CASSANDRA-12345 - we could easily do 
this with a {{system_distributed}} table instead? 


was (Author: jjirsa):
Came up at NGCC that a number of different users are implementing something 
along these lines in an ad hoc sense. May be time to revisit this ticket.


> Add ability to blacklist a CQL partition so all requests are ignored
> 
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Geoffrey Yu
>Assignee: Geoffrey Yu
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blacklist such 
> partitions so all read and write requests to them are rejected.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-12961) LCS needlessly checks for L0 STCS candidates multiple times

2017-09-26 Thread Vusal Ahmadoglu (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181654#comment-16181654
 ] 

Vusal Ahmadoglu commented on CASSANDRA-12961:
-

Hi Jeff. I checked your last circlleci tests. I got the same failure 
([circleci|https://circleci.com/gh/vusal-ahmadoglu/cassandra/10]): 
_org.apache.cassandra.streaming.StreamTransferTaskTest Tests run: 2, Failures: 
1_

> LCS needlessly checks for L0 STCS candidates multiple times
> ---
>
> Key: CASSANDRA-12961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12961
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Vusal Ahmadoglu
>Priority: Trivial
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 
> 0001-CASSANDRA-12961-Moving-getSTCSInL0CompactionCandidat.patch
>
>
> It's very likely that the check for L0 STCS candidates (if L0 is falling 
> behind) can be moved outside of the loop, or at very least made so that it's 
> not called on each loop iteration:
> {code}
> for (int i = generations.length - 1; i > 0; i--)
> {
> List sstables = getLevel(i);
> if (sstables.isEmpty())
> continue; // mostly this just avoids polluting the debug log 
> with zero scores
> // we want to calculate score excluding compacting ones
> Set sstablesInLevel = Sets.newHashSet(sstables);
> Set remaining = Sets.difference(sstablesInLevel, 
> cfs.getTracker().getCompacting());
> double score = (double) SSTableReader.getTotalBytes(remaining) / 
> (double)maxBytesForLevel(i, maxSSTableSizeInBytes);
> logger.trace("Compaction score for level {} is {}", i, score);
> if (score > 1.001)
> {
> // before proceeding with a higher level, let's see if L0 is 
> far enough behind to warrant STCS
> CompactionCandidate l0Compaction = 
> getSTCSInL0CompactionCandidate();
> if (l0Compaction != null)
> return l0Compaction;
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored

2017-09-26 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181655#comment-16181655
 ] 

Jeff Jirsa commented on CASSANDRA-12106:


Came up at NGCC that a number of different users are implementing something 
along these lines in an ad hoc sense. May be time to revisit this ticket.


> Add ability to blacklist a CQL partition so all requests are ignored
> 
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Geoffrey Yu
>Assignee: Geoffrey Yu
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blacklist such 
> partitions so all read and write requests to them are rejected.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13899) Streaming of very large partition fails


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181621#comment-16181621
 ] 

Jason Brown commented on CASSANDRA-13899:
-

This turned out to be a simple, stupid bug - i used the wrong length to compare 
against when translating from CASSANDRA-10520.

Patch here:

||13899||
|[branch|https://github.com/jasobrown/cassandra/tree/13899]|
|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/337/]|
|[utests|https://circleci.com/gh/jasobrown/cassandra/tree/13899]|

[~pauloricardomg] Do you mind reviewing?


> Streaming of very large partition fails 
> 
>
> Key: CASSANDRA-13899
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13899
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Jason Brown
> Fix For: 4.0
>
> Attachments: largepartition.yaml
>
>
> Streaming a single partition with ~100K rows fails with the following 
> exception:
> {noformat}
> ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 
> StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] 
> Streaming error occurred on session with peer 127.0.0.1
> org.apache.cassandra.streaming.StreamReceiveException: 
> java.lang.RuntimeException: Last written key 
> DecoratedKey(-1000328290821038380) >= current key 
> DecoratedKey(-1055007227842125139)  writing into 
> /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c
> 8e037/na-1-big-Data.db
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178)
>  ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {noformat}
> Reproduction steps:
>  * Create CCM cluster with 2 nodes
> * Start only first node, disable hinted handoff
>  * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user 
> profile=largepartition.yaml n=10K ops(insert=1) no-warmup -node whitelist 
> 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 -insert 
> visits=FIXED(100K) revisit=FIXED(100K)"}}
> * Start second node, run repair on {{stresscql}} table - the exception above 
> will be thrown.
> I investigated briefly and haven't found anything suspicious. This seems to 
> be related to CASSANDRA-12229 as I tested the steps above in a branch without 
> that and the repair completed successfully. I haven't tested with a smaller 
> number of rows per partition to see at which point it starts to be a problem.
> We should probably add a regression dtest to stream large partitions to catch 
> similar problems in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13904) Performance improvement of Cassandra UDF/UDA

2017-09-26 Thread xin jin (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181515#comment-16181515
 ] 

xin jin commented on CASSANDRA-13904:
-

Simple experiments:

{code}
//Test function:
createTable("CREATE TABLE %s (a int primary key, b int)");
List queryList = new ArrayList<>();
for (int i = 1, m = 1; i < m; i++) {
String queryString = "INSERT INTO %s (a, b) " + 
String.format("VALUES (%d, %d)", i, i);
execute(queryString);
}
String fState = createFunction(KEYSPACE,
   "int, int",
   "CREATE FUNCTION %s(a int, b int) " +
   "CALLED ON NULL INPUT " +
   "RETURNS int " +
   "LANGUAGE java " +
   "AS 'return 
Integer.valueOf((a!=null?a.intValue():0) + b.intValue());'");
String a = createAggregate(KEYSPACE,
   "int, int",
   "CREATE AGGREGATE %s(int) " +
   "SFUNC " + shortFunctionName(fState) + " " +
   "STYPE int");
// 1 + 2 + 3 = 6
assertRows(execute("SELECT " + a + "(b) FROM %s"), row(49995000));
{code}

results:

1. enable_user_defined_functions_threads: false

TRACE: UDAggregate.java:198 - Executed UDA cql_test_keyspace.aggregate_2:  
call(s) to state function cql_test_keyspace.function_1 in 37259μs, 17297μs, 
26131μs

2. enable_user_defined_functions_threads: true

UDAggregate.java:198 - Executed UDA cql_test_keyspace.aggregate_2:  call(s) 
to state function cql_test_keyspace.function_1 in 555004μs, 457931μs, 475664μs


> Performance improvement of Cassandra UDF/UDA
> 
>
> Key: CASSANDRA-13904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13904
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: xin jin
>Priority: Critical
>  Labels: performance
> Fix For: 3.11.x
>
>
> Hi All,
> We have made a few experiments and found that running query with direct UDF 
> execution is ten time more faster than the async UDF execution. The in-line 
> comment: "Using async UDF execution is expensive (adds about 100us overhead 
> per invocation on a Core-i7 MBPr)” 
> https://insight.io/github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java?line=293
>  show that this is a known behavior.  My questions are as below:
> 1. What are the main pros and cons of these two methods? Can I find any 
> documents that discuss this?  
> 2. Are there any plans to improve the performance of using async UDF? A 
> simple way come to my mind is to use some sort of batch method, e.g., replace 
> current row by row method with some rows by some rows. Are there any concerns 
> on this?
> 3. How people solve this performance issue in general? It seems this 
> performance issue is not an urgent or an important issue to solve because it 
> is known and it is still there. Therefore people must have some sort of good 
> solution solving this issue. 
> I really appreciate your comments in advance.
> Best regards,
> Xin



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13905) Correctly close netty channels when a stream session ends


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181406#comment-16181406
 ] 

Jason Brown commented on CASSANDRA-13905:
-

Minor patch here:

||trunk||
|[branch|https://github.com/jasobrown/cassandra/tree/netty-RejectedExecutionException]|
|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/336/]|
|[utests|https://circleci.com/gh/jasobrown/cassandra/tree/netty-RejectedExecutionException]|

utest errors are unrelated, and number of dtest failures are lower (and none 
caused by {{RejectedExecutionEception}})

[~aweisberg] or [~pauloricardomg] : Would one of you mind reviewing?

> Correctly close netty channels when a stream session ends
> -
>
> Key: CASSANDRA-13905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13905
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.x
>
>
> Netty channels in stream sessions were not being closed correctly. TL;DR I 
> was using a lambda that was not executing as it is lazily evaluated. This was 
> causing a {{RejectedExecutionException}} at the end of some streaming-related 
> dtests



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13905) Correctly close netty channels when a stream session ends

Jason Brown created CASSANDRA-13905:
---

 Summary: Correctly close netty channels when a stream session ends
 Key: CASSANDRA-13905
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13905
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: Jason Brown
Assignee: Jason Brown
 Fix For: 4.x


Netty channels in stream sessions were not being closed correctly. TL;DR I was 
using a lambda that was not executing as it is lazily evaluated. This was 
causing a {{RejectedExecutionException}} at the end of some streaming-related 
dtests



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13904) Performance improvement of Cassandra UDF/UDA

2017-09-26 Thread xin jin (JIRA)

xin jin created CASSANDRA-13904:
---

Summary: Performance improvement of Cassandra UDF/UDA
Key: CASSANDRA-13904
URL: https://issues.apache.org/jira/browse/CASSANDRA-13904
Project: Cassandra
Issue Type: Improvement
Components: CQL
Reporter: xin jin
Priority: Critical
Fix For: 3.11.x

Hi All,

We have made a few experiments and found that running query with direct UDF
execution is ten time more faster than the async UDF execution. The in-line
comment: "Using async UDF execution is expensive (adds about 100us overhead per
invocation on a Core-i7 MBPr)”
https://insight.io/github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java?line=293
show that this is a known behavior. My questions are as below:

1. What are the main pros and cons of these two methods? Can I find any
documents that discuss this?

2. Are there any plans to improve the performance of using async UDF? A simple
way come to my mind is to use some sort of batch method, e.g., replace current
row by row method with some rows by some rows. Are there any concerns on this?

3. How people solve this performance issue in general? It seems this
performance issue is not an urgent or an important issue to solve because it is
known and it is still there. Therefore people must have some sort of good
solution solving this issue.

I really appreciate your comments in advance.

Best regards,

Xin

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13896) Improving Cassandra write performance

2017-09-26 Thread Prince Nana Owusu Boateng (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181041#comment-16181041
 ] 

Prince Nana Owusu Boateng commented on CASSANDRA-13896:
---

Stress profile: $CASSANDRA_HOME/tools/bin/cassandra-stress user 
profile=$CASSANDRA_HOME/tools/cqlstress-insanity-example.yaml ops\(insert=1\) 
no-warmup cl=ONE duration=30m -mode native cql3 -pop 
dist=uniform\(1..24\) -node 192.168.1.246 -rate threads=125 -errors 
ignore
Only change to cqlstress-insanity-example.yaml is selecting 
SizeTieredCompactionStrategy.

Cassandra.yaml used is the default cassandra yaml for 3.10 with the following 
changes: Concurrent_reads=1024; Concurrent_writes=32 Heap size= 64GB

Dataset used:  LZ4 compression; compression chunk size of 64KB; 2.8 million 
entries; Insanity data

> Improving Cassandra write performance  
> ---
>
> Key: CASSANDRA-13896
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13896
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
> Environment: Skylake server with 2 sockets, 192GB RAM, 3xPCIe SSDs
> OS: Centos 7.3 
> Java: Oracle JDK1.8.0_121
>Reporter: Prince Nana Owusu Boateng
>  Labels: Performance
> Fix For: 4.x
>
> Attachments: Screen Shot 2017-09-22 at 11.22.43 AM.png, Screen Shot 
> 2017-09-22 at 3.31.09 PM.png
>
>
> During our Cassandra performance testing, we see high percentage of the CPU 
> spent in *org.apache.cassandra.utils.memory.SlabAllocator.allocate(int, 
> OpOrder Group) * method.  Appears to be high contention of the 
> ** atomic Integer in write workloads.   This structure is 
> used by the threads for keeping track of the region bytebuffer allocation.  
> When the contention appears, adding more clients, modifications of write 
> specific parameters does not change write throughput performance.  Attached 
> are the details of Java Flight Recorder (JFR), showing hot functions and also 
> performance results.   When we see this contention, we still have plenty of 
> CPU and throughput left ( *<20%*  Total average CPU utilization and  *<11%* 
> of the storage write total throughput).   This occurs on Cassandra 3.10.0-src 
> version using the Cassandra-Stress.
> Proposal:
> We will like to introduce a solution which eliminates the atomic operations 
> on the ** atomic Integer. This implementation will allow 
> concurrent allocation of bytebuffers without an atomic compareAndSet and 
> incrementAndGet operations. The solution is expected to increase overall 
> write performance while improving CPU utilization.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-26 Thread Murukesh Mohanan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181007#comment-16181007
 ] 

Murukesh Mohanan commented on CASSANDRA-13863:
--

How about:
* having the {{DigestMismatchException}} handler get a read repair decision? 
While discussing this problem with Wakabayashi-san, I made a quick patch that 
does this, but I have the feeling it's not the right way to go about this.
* or, a global flag to disable automatic read repairs? I'm not sure how this 
would go - I'm guessing there's quite a few places where read repairs can be 
initiated?

> Speculative retry causes read repair even if read_repair_chance is 0.0.
> ---
>
> Key: CASSANDRA-13863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Hiro Wakabayashi
>
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> cause no read repair, but read repair happens with speculative retry. I think 
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> stop read repair completely because the user wants to stop read repair in 
> some cases.
> {panel:title=Case 1: TWCS users}
> The 
> [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance]
>  states how to disable read repair.
> {quote}While TWCS tries to minimize the impact of comingled data, users 
> should attempt to avoid this behavior. Specifically, users should avoid 
> queries that explicitly set the timestamp via CQL USING TIMESTAMP. 
> Additionally, users should run frequent repairs (which streams data in such a 
> way that it does not become comingled), and disable background read repair by 
> setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
> {quote}
> {panel}
> {panel:title=Case 2. Strict SLA for read latency}
> In a peak time, read latency is a key for us but, read repair causes latency 
> higher than no read repair. We can use anti entropy repair in off peak time 
> for consistency.
> {panel}
>  
> Here is my procedure to reproduce the problem.
> h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
> {noformat}
> $ ccm create -v 3.0.14 -n 3 cluster_3.0.14
> $ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: 
> true/hinted_handoff_enabled: false/' 
> ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> $ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" 
> ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> $ ccm start{noformat}
> h3. 2. Create a keyspace and a table.
> {noformat}
> $ ccm node1 cqlsh
> DROP KEYSPACE IF EXISTS ks1;
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE ks1.t1 (
> key text PRIMARY KEY,
> value blob
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = 'ALWAYS';
> QUIT;
> {noformat}
> h3. 3. Stop node2 and node3. Insert a row.
> {noformat}
> $ ccm node3 stop && ccm node2 stop && ccm status
> Cluster: 'cluster_3.0.14'
> --
> node1: UP
> node3: DOWN
> node2: DOWN
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 
> (key, value) values ('mmullass', bigintAsBlob(1));"
> Current consistency level is ONE.
> Now Tracing is enabled
> Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501
>  activity 
>| timestamp  | source| 
> source_elapsed
> -++---+
>   
> Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 | 
>  0
>  Parsing insert into ks1.t1 (key, value) values ('mmullass',

[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-09-26 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180923#comment-16180923
 ] 

Paulo Motta commented on CASSANDRA-13299:
-

Thanks for the updates, this is looking good! I managed to reproduce an OOM 
when repairing a wide partition with 100K rows and verified that this patch 
avoids the OOM by splitting the partition in multiple batches (found 
CASSANDRA-13899 on the way). Awesome job!

While splitting on the happy case is working nicely, we need to ensure the 
range tombstone handling (specially range deletions) is working correctly and 
well tested before committing this.

I noticed that the previous {{ThrottledUnfilteredIterator}} implementation 
could [potentially 
return|https://github.com/jasonstack/cassandra/blob/b8cb49035d3cf77198b31df7c51a174fffe3edaf/src/java/org/apache/cassandra/db/rows/ThrottledUnfilteredIterator.java#L166]
 {{throttle+2}} unfiltereds, differently from the documentation which states 
that the maximum number of unfiltereds per batch is {{throttle+1}}. I also 
noticed that the case when there is a row between two markers was not being 
tested by existing tests, since we need a range deletion to reproduce this 
scenario. I fixed this and added more tests on [this 
commit|https://github.com/pauloricardomg/cassandra/commit/47d8ca3592cb6382bb4c308720646395306a0a69].
 Could you also modify 
[complexThrottleWithTombstoneTest|https://github.com/jasonstack/cassandra/commit/b8cb49035d3cf77198b31df7c51a174fffe3edaf#diff-5162644c24391628b339b88c3619427cR66]
 to test range deletions?

The previous change 
[requires|https://github.com/pauloricardomg/cassandra/commit/47d8ca3592cb6382bb4c308720646395306a0a69#diff-2acee8fea5cd82a51fda4af6e38faf13R60]
 the minimum throttle size to be 2, otherwise it would not be possible to make 
progress on the iterator in the presence of open and close markers.

I think that instead of throwing an {{AssertionError}} when the returned 
iterator is not exhausted, we could simply exhaust it, effectively skipping 
entries, since this might be a possible usage of 
{{ThrottledUnfilteredIterator}} so I did this on [this 
commit|https://github.com/pauloricardomg/cassandra/commit/04ed5ecb5183195601950fc9efd2ca9123596487].

I also added an utility method 
{{ThrottledUnfilteredIterator.throttle(UnfilteredPartitionIterator 
partitionIterator, int maxBatchSize)}} to allow throttling an 
{{UnfilteredPartitionIterator}} transparently and used that on 
{{StreamReceiveTask}} [on this 
commit|https://github.com/pauloricardomg/cassandra/commit/4f8c3b8faa2644133d301ac7bf7b748f7ec265ee].

I had another look at the {{throttled_partition_update_test}} 
[dtest|https://github.com/riptano/cassandra-dtest/commit/f3307adef349f232ec0ae64e902164684f32cca0]
 and think we can make the following improvements:
* Right now we're [verifying the 
results|https://github.com/riptano/cassandra-dtest/commit/f3307adef349f232ec0ae64e902164684f32cca0#diff-62ba429edee6a4681782f078246c9893R1410]
 with all the nodes UP, but it's possible that another node responds the query 
even though one of the inconsistent nodes did not stream correctly. I think we 
should check the results on each node individually (with the others down) to 
ensure they streamed data correctly from other nodes.
*  Add [range deletions|https://issues.apache.org/jira/browse/CASSANDRA-6237] 
since that's when the range tombstones special cases will be properly exercised.

Please let me know what do you think about these suggestions.

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
> Fix For: 4.x
>
>
> I see a potential OOM, when a stream (e.g. repair) goes through the write 
> path as it is with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce rowiterators 
> and they again produce mutations. So every partition creates a single 
> mutation, which in case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes a 
> UnfilteredRowIterator and a max size and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). reasonable_absolute_max_size 
> could be like 16M or sth.
> A mutation shouldn't be too large as it also affects MV partition locking. 
> The

[jira] [Commented] (CASSANDRA-13900) Massive GC suspension increase after updating to 3.0.14 from 2.1.18


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180814#comment-16180814
 ] 

Thomas Steinmaurer commented on CASSANDRA-13900:


Yep, thanks. I can give this a try in our loadtest environment suffering from 
this issue. Need to first figure out about any config changes in cassandra.yaml 
etc ...

> Massive GC suspension increase after updating to 3.0.14 from 2.1.18
> ---
>
> Key: CASSANDRA-13900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13900
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas Steinmaurer
>Priority: Blocker
> Attachments: cassandra2118_vs_3014.jpg, cassandra3014_jfr_5min.jpg
>
>
> In short: After upgrading to 3.0.14 (from 2.1.18), we aren't able to process 
> the same incoming write load on the same infrastructure anymore.
> We have a loadtest environment running 24x7 testing our software using 
> Cassandra as backend. Both, loadtest and production is hosted in AWS and do 
> have the same spec on the Cassandra-side, namely:
> * 9x m4.xlarge
> * 8G heap
> * CMS (400MB newgen)
> * 2TB EBS gp2
> * Client requests are entirely CQL
> per node. We have a solid/constant baseline in loadtest at ~ 60% CPU cluster 
> AVG with constant, simulated load running against our cluster, using 
> Cassandra 2.1 for > 2 years now.
> Recently we started to upgrade to 3.0.14 in this 9 node loadtest environment, 
> and basically, 3.0.14 isn't able to cope with the load anymore. No particular 
> special tweaks, memory settings/changes etc., all the same as in 2.1.18. We 
> also didn't upgrade sstables yet, thus the increase mentioned in the 
> screenshot is not related to any manually triggered maintenance operation 
> after upgrading to 3.0.14.
> According to our monitoring, with 3.0.14, we see a *GC suspension time 
> increase by a factor of > 2*, of course directly correlating with an CPU 
> increase > 80%. See: attached screen "cassandra2118_vs_3014.jpg"
> This all means that our incoming load against 2.1.18 is something, 3.0.14 
> can't handle. So, we would need to either scale up (e.g. m4.xlarge => 
> m4.2xlarge) or scale out for being able to handle the same load, which is 
> cost-wise not an option.
> Unfortunately I do not have Java Flight Recorder runs for 2.1.18 at the 
> mentioned load, but can provide JFR session for our current 3.0.14 setup. The 
> attached 5min JFR memory allocation area (cassandra3014_jfr_5min.jpg) shows 
> compaction being the top contributor for the captured 5min time-frame. Could 
> be by "accident" covering the 5min with compaction as top contributor only 
> (although mentioned simulated client load is attached), but according to 
> stack traces, we see new classes from 3.0, e.g. BTreeRow.searchIterator() 
> etc. popping up as top contributor, thus possibly new classes / data 
> structures are causing much more object churn now.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13900) Massive GC suspension increase after updating to 3.0.14 from 2.1.18

2017-09-26 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180759#comment-16180759
 ] 

Jeremiah Jordan commented on CASSANDRA-13900:
-

It is supported, 3.0 and 3.11 support the same versions to upgrade from.  But 
yes it is a much bigger upgrade step.  Was just wondering if you had tried 3.11 
(possibly on a test cluster) to see if you saw the same issues.

> Massive GC suspension increase after updating to 3.0.14 from 2.1.18
> ---
>
> Key: CASSANDRA-13900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13900
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas Steinmaurer
>Priority: Blocker
> Attachments: cassandra2118_vs_3014.jpg, cassandra3014_jfr_5min.jpg
>
>
> In short: After upgrading to 3.0.14 (from 2.1.18), we aren't able to process 
> the same incoming write load on the same infrastructure anymore.
> We have a loadtest environment running 24x7 testing our software using 
> Cassandra as backend. Both, loadtest and production is hosted in AWS and do 
> have the same spec on the Cassandra-side, namely:
> * 9x m4.xlarge
> * 8G heap
> * CMS (400MB newgen)
> * 2TB EBS gp2
> * Client requests are entirely CQL
> per node. We have a solid/constant baseline in loadtest at ~ 60% CPU cluster 
> AVG with constant, simulated load running against our cluster, using 
> Cassandra 2.1 for > 2 years now.
> Recently we started to upgrade to 3.0.14 in this 9 node loadtest environment, 
> and basically, 3.0.14 isn't able to cope with the load anymore. No particular 
> special tweaks, memory settings/changes etc., all the same as in 2.1.18. We 
> also didn't upgrade sstables yet, thus the increase mentioned in the 
> screenshot is not related to any manually triggered maintenance operation 
> after upgrading to 3.0.14.
> According to our monitoring, with 3.0.14, we see a *GC suspension time 
> increase by a factor of > 2*, of course directly correlating with an CPU 
> increase > 80%. See: attached screen "cassandra2118_vs_3014.jpg"
> This all means that our incoming load against 2.1.18 is something, 3.0.14 
> can't handle. So, we would need to either scale up (e.g. m4.xlarge => 
> m4.2xlarge) or scale out for being able to handle the same load, which is 
> cost-wise not an option.
> Unfortunately I do not have Java Flight Recorder runs for 2.1.18 at the 
> mentioned load, but can provide JFR session for our current 3.0.14 setup. The 
> attached 5min JFR memory allocation area (cassandra3014_jfr_5min.jpg) shows 
> compaction being the top contributor for the captured 5min time-frame. Could 
> be by "accident" covering the 5min with compaction as top contributor only 
> (although mentioned simulated client load is attached), but according to 
> stack traces, we see new classes from 3.0, e.g. BTreeRow.searchIterator() 
> etc. popping up as top contributor, thus possibly new classes / data 
> structures are causing much more object churn now.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13900) Massive GC suspension increase after updating to 3.0.14 from 2.1.18


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180752#comment-16180752
 ] 

Thomas Steinmaurer commented on CASSANDRA-13900:


[~jjordan], I was pointed to CASSANDRA-12269 a few minutes ago, which sounds a 
lot what we are facing. A pity that this isn't considered for 3.0.x.

2.1 => 3.11 sounds like a huge step to me. I haven't checked if this is even 
possible from a SSTable format perspective.

> Massive GC suspension increase after updating to 3.0.14 from 2.1.18
> ---
>
> Key: CASSANDRA-13900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13900
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas Steinmaurer
>Priority: Blocker
> Attachments: cassandra2118_vs_3014.jpg, cassandra3014_jfr_5min.jpg
>
>
> In short: After upgrading to 3.0.14 (from 2.1.18), we aren't able to process 
> the same incoming write load on the same infrastructure anymore.
> We have a loadtest environment running 24x7 testing our software using 
> Cassandra as backend. Both, loadtest and production is hosted in AWS and do 
> have the same spec on the Cassandra-side, namely:
> * 9x m4.xlarge
> * 8G heap
> * CMS (400MB newgen)
> * 2TB EBS gp2
> * Client requests are entirely CQL
> per node. We have a solid/constant baseline in loadtest at ~ 60% CPU cluster 
> AVG with constant, simulated load running against our cluster, using 
> Cassandra 2.1 for > 2 years now.
> Recently we started to upgrade to 3.0.14 in this 9 node loadtest environment, 
> and basically, 3.0.14 isn't able to cope with the load anymore. No particular 
> special tweaks, memory settings/changes etc., all the same as in 2.1.18. We 
> also didn't upgrade sstables yet, thus the increase mentioned in the 
> screenshot is not related to any manually triggered maintenance operation 
> after upgrading to 3.0.14.
> According to our monitoring, with 3.0.14, we see a *GC suspension time 
> increase by a factor of > 2*, of course directly correlating with an CPU 
> increase > 80%. See: attached screen "cassandra2118_vs_3014.jpg"
> This all means that our incoming load against 2.1.18 is something, 3.0.14 
> can't handle. So, we would need to either scale up (e.g. m4.xlarge => 
> m4.2xlarge) or scale out for being able to handle the same load, which is 
> cost-wise not an option.
> Unfortunately I do not have Java Flight Recorder runs for 2.1.18 at the 
> mentioned load, but can provide JFR session for our current 3.0.14 setup. The 
> attached 5min JFR memory allocation area (cassandra3014_jfr_5min.jpg) shows 
> compaction being the top contributor for the captured 5min time-frame. Could 
> be by "accident" covering the 5min with compaction as top contributor only 
> (although mentioned simulated client load is attached), but according to 
> stack traces, we see new classes from 3.0, e.g. BTreeRow.searchIterator() 
> etc. popping up as top contributor, thus possibly new classes / data 
> structures are causing much more object churn now.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180748#comment-16180748
 ] 

Thomas Steinmaurer commented on CASSANDRA-8099:
---

[~tjake], thanks. Sounds exactly what we are facing with 3.0.

> Refactor and modernize the storage engine
> -
>
> Key: CASSANDRA-8099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 3.0 alpha 1
>
> Attachments: 8099-nit
>
>
> The current storage engine (which for this ticket I'll loosely define as "the 
> code implementing the read/write path") is suffering from old age. One of the 
> main problem is that the only structure it deals with is the cell, which 
> completely ignores the more high level CQL structure that groups cell into 
> (CQL) rows.
> This leads to many inefficiencies, like the fact that during a reads we have 
> to group cells multiple times (to count on replica, then to count on the 
> coordinator, then to produce the CQL resultset) because we forget about the 
> grouping right away each time (so lots of useless cell names comparisons in 
> particular). But outside inefficiencies, having to manually recreate the CQL 
> structure every time we need it for something is hindering new features and 
> makes the code more complex that it should be.
> Said storage engine also has tons of technical debt. To pick an example, the 
> fact that during range queries we update {{SliceQueryFilter.count}} is pretty 
> hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has 
> to go into to simply "remove the last query result".
> So I want to bite the bullet and modernize this storage engine. I propose to 
> do 2 main things:
> # Make the storage engine more aware of the CQL structure. In practice, 
> instead of having partitions be a simple iterable map of cells, it should be 
> an iterable list of row (each being itself composed of per-column cells, 
> though obviously not exactly the same kind of cell we have today).
> # Make the engine more iterative. What I mean here is that in the read path, 
> we end up reading all cells in memory (we put them in a ColumnFamily object), 
> but there is really no reason to. If instead we were working with iterators 
> all the way through, we could get to a point where we're basically 
> transferring data from disk to the network, and we should be able to reduce 
> GC substantially.
> Please note that such refactor should provide some performance improvements 
> right off the bat but it's not it's primary goal either. It's primary goal is 
> to simplify the storage engine and adds abstraction that are better suited to 
> further optimizations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13900) Massive GC suspension increase after updating to 3.0.14 from 2.1.18

2017-09-26 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180741#comment-16180741
 ] 

Jeremiah Jordan commented on CASSANDRA-13900:
-

Have you tried 3.11.x?  During the 3.x line many perf/gc improvements went in 
to the new storage engine.  They were not put into 3.0.x as we did not want to 
possibly introduce bugs to the stable branch along with the improvements.

> Massive GC suspension increase after updating to 3.0.14 from 2.1.18
> ---
>
> Key: CASSANDRA-13900
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13900
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas Steinmaurer
>Priority: Blocker
> Attachments: cassandra2118_vs_3014.jpg, cassandra3014_jfr_5min.jpg
>
>
> In short: After upgrading to 3.0.14 (from 2.1.18), we aren't able to process 
> the same incoming write load on the same infrastructure anymore.
> We have a loadtest environment running 24x7 testing our software using 
> Cassandra as backend. Both, loadtest and production is hosted in AWS and do 
> have the same spec on the Cassandra-side, namely:
> * 9x m4.xlarge
> * 8G heap
> * CMS (400MB newgen)
> * 2TB EBS gp2
> * Client requests are entirely CQL
> per node. We have a solid/constant baseline in loadtest at ~ 60% CPU cluster 
> AVG with constant, simulated load running against our cluster, using 
> Cassandra 2.1 for > 2 years now.
> Recently we started to upgrade to 3.0.14 in this 9 node loadtest environment, 
> and basically, 3.0.14 isn't able to cope with the load anymore. No particular 
> special tweaks, memory settings/changes etc., all the same as in 2.1.18. We 
> also didn't upgrade sstables yet, thus the increase mentioned in the 
> screenshot is not related to any manually triggered maintenance operation 
> after upgrading to 3.0.14.
> According to our monitoring, with 3.0.14, we see a *GC suspension time 
> increase by a factor of > 2*, of course directly correlating with an CPU 
> increase > 80%. See: attached screen "cassandra2118_vs_3014.jpg"
> This all means that our incoming load against 2.1.18 is something, 3.0.14 
> can't handle. So, we would need to either scale up (e.g. m4.xlarge => 
> m4.2xlarge) or scale out for being able to handle the same load, which is 
> cost-wise not an option.
> Unfortunately I do not have Java Flight Recorder runs for 2.1.18 at the 
> mentioned load, but can provide JFR session for our current 3.0.14 setup. The 
> attached 5min JFR memory allocation area (cassandra3014_jfr_5min.jpg) shows 
> compaction being the top contributor for the captured 5min time-frame. Could 
> be by "accident" covering the 5min with compaction as top contributor only 
> (although mentioned simulated client load is attached), but according to 
> stack traces, we see new classes from 3.0, e.g. BTreeRow.searchIterator() 
> etc. popping up as top contributor, thus possibly new classes / data 
> structures are causing much more object churn now.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine

2017-09-26 Thread T Jake Luciani (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180739#comment-16180739
 ] 

T Jake Luciani commented on CASSANDRA-8099:
---

[~tsteinmaurer] A number of GC problems were addressed in CASSANDRA-12269

> Refactor and modernize the storage engine
> -
>
> Key: CASSANDRA-8099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 3.0 alpha 1
>
> Attachments: 8099-nit
>
>
> The current storage engine (which for this ticket I'll loosely define as "the 
> code implementing the read/write path") is suffering from old age. One of the 
> main problem is that the only structure it deals with is the cell, which 
> completely ignores the more high level CQL structure that groups cell into 
> (CQL) rows.
> This leads to many inefficiencies, like the fact that during a reads we have 
> to group cells multiple times (to count on replica, then to count on the 
> coordinator, then to produce the CQL resultset) because we forget about the 
> grouping right away each time (so lots of useless cell names comparisons in 
> particular). But outside inefficiencies, having to manually recreate the CQL 
> structure every time we need it for something is hindering new features and 
> makes the code more complex that it should be.
> Said storage engine also has tons of technical debt. To pick an example, the 
> fact that during range queries we update {{SliceQueryFilter.count}} is pretty 
> hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has 
> to go into to simply "remove the last query result".
> So I want to bite the bullet and modernize this storage engine. I propose to 
> do 2 main things:
> # Make the storage engine more aware of the CQL structure. In practice, 
> instead of having partitions be a simple iterable map of cells, it should be 
> an iterable list of row (each being itself composed of per-column cells, 
> though obviously not exactly the same kind of cell we have today).
> # Make the engine more iterative. What I mean here is that in the read path, 
> we end up reading all cells in memory (we put them in a ColumnFamily object), 
> but there is really no reason to. If instead we were working with iterators 
> all the way through, we could get to a point where we're basically 
> transferring data from disk to the network, and we should be able to reduce 
> GC substantially.
> Please note that such refactor should provide some performance improvements 
> right off the bat but it's not it's primary goal either. It's primary goal is 
> to simplify the storage engine and adds abstraction that are better suited to 
> further optimizations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13886) OOM put node in limbo


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommy Stendahl updated CASSANDRA-13886:
---
Status: Patch Available  (was: Open)

> OOM put node in limbo
> -
>
> Key: CASSANDRA-13886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13886
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra version 2.2.10
>Reporter: Marcus Olsson
>Assignee: Tommy Stendahl
>Priority: Minor
>  Labels: lhf
> Attachments: 13886.txt
>
>
> In one of our test clusters we have had some issues with OOM. While working 
> on fixing this it was discovered that one of the nodes that got OOM actually 
> wasn't shut down properly. Instead it went into a half-up-state where the 
> affected node considered itself up while all other nodes considered it as 
> down.
> The following stacktrace was observed which seems to be the cause of this:
> {noformat}
> java.lang.NoClassDefFoundError: Could not initialize class 
> java.lang.UNIXProcess
> at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) 
> ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
> at 
> org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
> {noformat}
> It seems that if an unexpected exception/error is thrown inside 
> JVMStabilityInspector.inspectThrowable the JVM is not actually shut down but 
> instead keeps on running. My expectation is that the JVM should shut down in 
> case OOM is thrown.
> Potential workaround is to add:
> {noformat}
> JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
> {noformat}
> to cassandra-env.sh.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13886) OOM put node in limbo


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommy Stendahl updated CASSANDRA-13886:
---
Attachment: 13886.txt

> OOM put node in limbo
> -
>
> Key: CASSANDRA-13886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13886
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra version 2.2.10
>Reporter: Marcus Olsson
>Assignee: Tommy Stendahl
>Priority: Minor
>  Labels: lhf
> Attachments: 13886.txt
>
>
> In one of our test clusters we have had some issues with OOM. While working 
> on fixing this it was discovered that one of the nodes that got OOM actually 
> wasn't shut down properly. Instead it went into a half-up-state where the 
> affected node considered itself up while all other nodes considered it as 
> down.
> The following stacktrace was observed which seems to be the cause of this:
> {noformat}
> java.lang.NoClassDefFoundError: Could not initialize class 
> java.lang.UNIXProcess
> at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) 
> ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
> at 
> org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
> {noformat}
> It seems that if an unexpected exception/error is thrown inside 
> JVMStabilityInspector.inspectThrowable the JVM is not actually shut down but 
> instead keeps on running. My expectation is that the JVM should shut down in 
> case OOM is thrown.
> Potential workaround is to add:
> {noformat}
> JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
> {noformat}
> to cassandra-env.sh.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13886) OOM put node in limbo


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180640#comment-16180640
 ] 

Tommy Stendahl commented on CASSANDRA-13886:


I have done some work on this issue, even if it happens very seldomly its very 
bad when it happens. Since the JVM doesn’t die properly our monitoring system 
doesn’t restart Cassandra on this node, it requires a manual intervention. The 
work around with {{-XX:+ExitOnOutOfMemoryError}} works fine, and you can also 
use {{-XX:+CrashOnOutOfMemoryError}} if you want core dumps. But as I 
understand these options are only available from java 8u92 so they might not be 
an option for every one. I think an alternative is to improve 
{{HeapUtils.generateHeapDump()}} so we catch {{Throwable}} so we prevent any 
exceptions from leaking out from {{HeapUtils.generateHeapDump()}}, this would 
allow execution to continue in {{JVMStabilityInspector.inspectThrowable()}} 
until we reach {{killer.killCurrentJVM(t)}} that will properly kill the jvm.
I have prepared a patch for this on the 2.2 branch but it should merge fine to 
all branches.

> OOM put node in limbo
> -
>
> Key: CASSANDRA-13886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13886
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra version 2.2.10
>Reporter: Marcus Olsson
>Assignee: Tommy Stendahl
>Priority: Minor
>  Labels: lhf
>
> In one of our test clusters we have had some issues with OOM. While working 
> on fixing this it was discovered that one of the nodes that got OOM actually 
> wasn't shut down properly. Instead it went into a half-up-state where the 
> affected node considered itself up while all other nodes considered it as 
> down.
> The following stacktrace was observed which seems to be the cause of this:
> {noformat}
> java.lang.NoClassDefFoundError: Could not initialize class 
> java.lang.UNIXProcess
> at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) 
> ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
> at 
> org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
> {noformat}
> It seems that if an unexpected exception/error is thrown inside 
> JVMStabilityInspector.inspectThrowable the JVM is not actually shut down but 
> instead keeps on running. My expectation is that the JVM should shut down in 
> case OOM is thrown.
> Potential workaround is to add:
> {noformat}
> JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
> {noformat}
> to cassandra-env.sh.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Assigned] (CASSANDRA-13886) OOM put node in limbo


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommy Stendahl reassigned CASSANDRA-13886:
--

Assignee: Tommy Stendahl

> OOM put node in limbo
> -
>
> Key: CASSANDRA-13886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13886
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra version 2.2.10
>Reporter: Marcus Olsson
>Assignee: Tommy Stendahl
>Priority: Minor
>  Labels: lhf
>
> In one of our test clusters we have had some issues with OOM. While working 
> on fixing this it was discovered that one of the nodes that got OOM actually 
> wasn't shut down properly. Instead it went into a half-up-state where the 
> affected node considered itself up while all other nodes considered it as 
> down.
> The following stacktrace was observed which seems to be the cause of this:
> {noformat}
> java.lang.NoClassDefFoundError: Could not initialize class 
> java.lang.UNIXProcess
> at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) 
> ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
> at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
> at 
> org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  ~[apache-cassandra-2.2.10.jar:2.2.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> ~[apache-cassandra-2.2.10.jar:2.2.10]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
> {noformat}
> It seems that if an unexpected exception/error is thrown inside 
> JVMStabilityInspector.inspectThrowable the JVM is not actually shut down but 
> instead keeps on running. My expectation is that the JVM should shut down in 
> case OOM is thrown.
> Potential workaround is to add:
> {noformat}
> JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
> {noformat}
> to cassandra-env.sh.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13808) Integer overflows with Amazon Elastic File System (EFS)

2017-09-26 Thread Anonymous (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated CASSANDRA-13808:
--
Status: Ready to Commit  (was: Patch Available)

> Integer overflows with Amazon Elastic File System (EFS)
> ---
>
> Key: CASSANDRA-13808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Vitalii Ishchenko
>Assignee: Benjamin Lerer
> Fix For: 3.11.1
>
>
> Integer overflow issue was fixed for cassandra 2.2, but in 3.8 new property 
> was introduced in config that also derives from disk size  
> {{cdc_total_space_in_mb}}, see CASSANDRA-8844
> It should be updated too 
> https://github.com/apache/cassandra/blob/6b7d73a49695c0ceb78bc7a003ace606a806c13a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L484



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13649) Uncaught exceptions in Netty pipeline


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180599#comment-16180599
 ] 

Jason Brown commented on CASSANDRA-13649:
-

[~ajithshetty28] Please look at the "Fix Version" field above: this was patched 
in 3.0.15

> Uncaught exceptions in Netty pipeline
> -
>
> Key: CASSANDRA-13649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13649
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging, Testing
>Reporter: Stefan Podkowinski
>Assignee: Norman Maurer
>  Labels: patch
> Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
> Attachments: 
> 0001-CASSANDRA-13649-Ensure-all-exceptions-are-correctly-.patch, 
> test_stdout.txt
>
>
> I've noticed some netty related errors in trunk in [some of the dtest 
> results|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/106/#showFailuresLink].
>  Just want to make sure that we don't have to change anything related to the 
> exception handling in our pipeline and that this isn't a netty issue. 
> Actually if this causes flakiness but is otherwise harmless, we should do 
> something about it, even if it's just on the dtest side.
> {noformat}
> WARN  [epollEventLoopGroup-2-9] 2017-06-28 17:23:49,699 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> And again in another test:
> {noformat}
> WARN  [epollEventLoopGroup-2-8] 2017-06-29 02:27:31,300 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> Edit:
> The {{io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() 
> failed}} error also causes tests to fail for 3.0 and 3.11. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13595) Short read protection doesn't work at the end of a partition

2017-09-26 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13595:
--
Fix Version/s: 3.11.x
   3.0.x
   Status: Patch Available  (was: In Progress)

> Short read protection doesn't work at the end of a partition
> 
>
> Key: CASSANDRA-13595
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13595
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Andrés de la Peña
>Assignee: Aleksey Yeschenko
>  Labels: Correctness
> Fix For: 3.0.x, 3.11.x
>
>
> It seems that short read protection doesn't work when the short read is done 
> at the end of a partition in a range query. The final assertion of this dtest 
> fails:
> {code}
> def short_read_partitions_delete_test(self):
> cluster = self.cluster
> cluster.set_configuration_options(values={'hinted_handoff_enabled': 
> False})
> cluster.set_batch_commitlog(enabled=True)
> cluster.populate(2).start(wait_other_notice=True)
> node1, node2 = self.cluster.nodelist()
> session = self.patient_cql_connection(node1)
> create_ks(session, 'ks', 2)
> session.execute("CREATE TABLE t (k int, c int, PRIMARY KEY(k, c)) 
> WITH read_repair_chance = 0.0")
> # we write 1 and 2 in a partition: all nodes get it.
> session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (1, 1)", 
> consistency_level=ConsistencyLevel.ALL))
> session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (2, 1)", 
> consistency_level=ConsistencyLevel.ALL))
> # we delete partition 1: only node 1 gets it.
> node2.flush()
> node2.stop(wait_other_notice=True)
> session = self.patient_cql_connection(node1, 'ks', 
> consistency_level=ConsistencyLevel.ONE)
> session.execute(SimpleStatement("DELETE FROM t WHERE k = 1"))
> node2.start(wait_other_notice=True)
> # we delete partition 2: only node 2 gets it.
> node1.flush()
> node1.stop(wait_other_notice=True)
> session = self.patient_cql_connection(node2, 'ks', 
> consistency_level=ConsistencyLevel.ONE)
> session.execute(SimpleStatement("DELETE FROM t WHERE k = 2"))
> node1.start(wait_other_notice=True)
> # read from both nodes
> session = self.patient_cql_connection(node1, 'ks', 
> consistency_level=ConsistencyLevel.ALL)
> assert_none(session, "SELECT * FROM t LIMIT 1")
> {code}
> However, the dtest passes if we remove the {{LIMIT 1}}.
> Short read protection [uses a 
> {{SinglePartitionReadCommand}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DataResolver.java#L484],
>  maybe it should use a {{PartitionRangeReadCommand}} instead?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13595) Short read protection doesn't work at the end of a partition

2017-09-26 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180588#comment-16180588
 ] 

Aleksey Yeschenko commented on CASSANDRA-13595:
---

[3.0 branch|https://github.com/iamaleksey/cassandra/commits/13595-3.0], [a new 
dtest|https://github.com/iamaleksey/cassandra-dtest/commits/13595], [CircleCI 
run|https://circleci.com/gh/iamaleksey/cassandra/41], [dtest 
run|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/335/testReport/].

utests have he usual circle issues with {{RemoveTest}} and {{ViewComplexTest}}, 
dtests have mostly the git clone issue again with some of the tests. Everything 
relevant seems to be passing.

> Short read protection doesn't work at the end of a partition
> 
>
> Key: CASSANDRA-13595
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13595
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Andrés de la Peña
>Assignee: Aleksey Yeschenko
>  Labels: Correctness
>
> It seems that short read protection doesn't work when the short read is done 
> at the end of a partition in a range query. The final assertion of this dtest 
> fails:
> {code}
> def short_read_partitions_delete_test(self):
> cluster = self.cluster
> cluster.set_configuration_options(values={'hinted_handoff_enabled': 
> False})
> cluster.set_batch_commitlog(enabled=True)
> cluster.populate(2).start(wait_other_notice=True)
> node1, node2 = self.cluster.nodelist()
> session = self.patient_cql_connection(node1)
> create_ks(session, 'ks', 2)
> session.execute("CREATE TABLE t (k int, c int, PRIMARY KEY(k, c)) 
> WITH read_repair_chance = 0.0")
> # we write 1 and 2 in a partition: all nodes get it.
> session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (1, 1)", 
> consistency_level=ConsistencyLevel.ALL))
> session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (2, 1)", 
> consistency_level=ConsistencyLevel.ALL))
> # we delete partition 1: only node 1 gets it.
> node2.flush()
> node2.stop(wait_other_notice=True)
> session = self.patient_cql_connection(node1, 'ks', 
> consistency_level=ConsistencyLevel.ONE)
> session.execute(SimpleStatement("DELETE FROM t WHERE k = 1"))
> node2.start(wait_other_notice=True)
> # we delete partition 2: only node 2 gets it.
> node1.flush()
> node1.stop(wait_other_notice=True)
> session = self.patient_cql_connection(node2, 'ks', 
> consistency_level=ConsistencyLevel.ONE)
> session.execute(SimpleStatement("DELETE FROM t WHERE k = 2"))
> node1.start(wait_other_notice=True)
> # read from both nodes
> session = self.patient_cql_connection(node1, 'ks', 
> consistency_level=ConsistencyLevel.ALL)
> assert_none(session, "SELECT * FROM t LIMIT 1")
> {code}
> However, the dtest passes if we remove the {{LIMIT 1}}.
> Short read protection [uses a 
> {{SinglePartitionReadCommand}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DataResolver.java#L484],
>  maybe it should use a {{PartitionRangeReadCommand}} instead?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-13808) Integer overflows with Amazon Elastic File System (EFS)

2017-09-26 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180443#comment-16180443
 ] 

Alex Petrov edited comment on CASSANDRA-13808 at 9/26/17 8:21 AM:
--

+1, LGTM

For the record, the fix is the same as in [CASSANDRA-13067].


was (Author: ifesdjeen):
+1, LGTM

> Integer overflows with Amazon Elastic File System (EFS)
> ---
>
> Key: CASSANDRA-13808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Vitalii Ishchenko
>Assignee: Benjamin Lerer
> Fix For: 3.11.1
>
>
> Integer overflow issue was fixed for cassandra 2.2, but in 3.8 new property 
> was introduced in config that also derives from disk size  
> {{cdc_total_space_in_mb}}, see CASSANDRA-8844
> It should be updated too 
> https://github.com/apache/cassandra/blob/6b7d73a49695c0ceb78bc7a003ace606a806c13a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L484



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13808) Integer overflows with Amazon Elastic File System (EFS)

2017-09-26 Thread Alex Petrov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13808:

Reviewer: Alex Petrov

> Integer overflows with Amazon Elastic File System (EFS)
> ---
>
> Key: CASSANDRA-13808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13808
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Vitalii Ishchenko
>Assignee: Benjamin Lerer
> Fix For: 3.11.1
>
>
> Integer overflow issue was fixed for cassandra 2.2, but in 3.8 new property 
> was introduced in config that also derives from disk size  
> {{cdc_total_space_in_mb}}, see CASSANDRA-8844
> It should be updated too 
> https://github.com/apache/cassandra/blob/6b7d73a49695c0ceb78bc7a003ace606a806c13a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L484



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine