[jira] [Updated] (CASSANDRA-7486) Migrate to G1GC by default

2015-09-19 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-7486:
--
Assignee: Benedict  (was: Albert P Tobey)

> Migrate to G1GC by default
> --
>
> Key: CASSANDRA-7486
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Config
>Reporter: Jonathan Ellis
>Assignee: Benedict
> Fix For: 3.0 alpha 1
>
>
> See 
> http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
>  and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
> Suspect this will help G1 even more than CMS.  (NB this is off by default but 
> needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-09-19 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877418#comment-14877418
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

The main point of switching to G1 was to enable most users to get decent - if 
not the best - performance out of the box without having to guess HEAP_NEWSIZE.

Since nobody has the time or inclination to test/discover further, it might as 
well be rolled back. Users won't notice any difference in pain since there was 
never a release with G1.

> Migrate to G1GC by default
> --
>
> Key: CASSANDRA-7486
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Config
>Reporter: Jonathan Ellis
>Assignee: Albert P Tobey
> Fix For: 3.0 alpha 1
>
>
> See 
> http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
>  and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
> Suspect this will help G1 even more than CMS.  (NB this is off by default but 
> needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-09-19 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877311#comment-14877311
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Is the picture equally bleak at RF=3?

Do the "2.2 GC" settings include anything other than the defaults from 
cassandra-env.sh? "ps -efw" output is sufficient.

I'd be happy to take a look at the GC logs if they are available.
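
(For example, something like this is enough to capture the effective flags; a sketch assuming the usual daemon class name:)

{code}
# dump the full command line of the Cassandra JVM, including all GC flags
ps -efw | grep '[C]assandraDaemon'
{code}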

> Migrate to G1GC by default
> --
>
> Key: CASSANDRA-7486
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Config
>Reporter: Jonathan Ellis
>Assignee: Albert P Tobey
> Fix For: 3.0 alpha 1
>
>
> See 
> http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
>  and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
> Suspect this will help G1 even more than CMS.  (NB this is off by default but 
> needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-10 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740056#comment-14740056
 ] 

Albert P Tobey commented on CASSANDRA-10249:


I'm benchmarking this patch when I have time to do so.

https://gist.github.com/tobert/a30ee9b9c2d8aba882f0

> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
> Attachments: patched-2.1.9-dstat-lvn10.png, 
> stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-03 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729820#comment-14729820
 ] 

Albert P Tobey commented on CASSANDRA-10249:


I'm running benchmarks now and will create a new patch that accepts a property 
and defaults to the original value of 64K.
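
(As a sketch of the shape of that, with a hypothetical property name standing in for whatever the patch ends up using, it would be overridable from cassandra-env.sh:)

{code}
# hypothetical property name for illustration only; the real one comes with the patch
JVM_OPTS="$JVM_OPTS -Dcassandra.disk_io_buffer_in_kb=64"
{code}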

> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
> Attachments: patched-2.1.9-dstat-lvn10.png, 
> stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-10249:
---
Attachment: yourkit-screenshot.png

> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
> Attachments: yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-10249:
---
Attachment: stock-2.1.9-dstat-lvn10.png

> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
> Attachments: stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-10249:
---
Attachment: patched-2.1.9-dstat-lvn10.png

> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
> Attachments: patched-2.1.9-dstat-lvn10.png, 
> stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-10249:
---
Description: 
On read workloads, Cassandra 2.1 reads drastically more data than it emits over 
the network. This causes problems throughout the system by wasting disk IO and 
causing unnecessary GC.

I have reproduced the issue on clusters and locally with a single instance. The 
only requirement to reproduce the issue is enough data to blow through the page 
cache. The default schema and data size with cassandra-stress is sufficient for 
exposing the issue.

With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 disk:network 
ratio. That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s 
of disk reads, saturating the drive.

After applying this patch for standard IO mode 
https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
100:1 on my local test rig.

I tested with 512 byte reads as well, but got the same performance, which makes 
sense since all HDD and SSD made in the last few years have a 4K block size 
(many of them lie and say 512). Ideally, the reads in 

  was:
On read workloads, Cassandra 2.1 reads drastically more data than it emits over 
the network. This causes problems throughout the system by wasting disk IO and 
causing unnecessary GC.

I have reproduced the issue on clusters and locally with a single instance. The 
only requirement to reproduce the issue is enough data to blow through the page 
cache. The default schema and data size with cassandra-stress is sufficient for 
exposing the issue.

With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 disk:network 
ratio. That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s 
of disk reads, saturating the drive.

After applying this patch https://gist.github.com/tobert/10c307cf3709a585a7cf 
the ratio fell to around 100:1 on my local test rig.

I tested with 512 byte reads as well, but got the same performance, which makes 
sense since all HDD and SSD made in the last few years have a 4K block size 
(many of them lie and say 512).


> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512). Ideally, the reads in 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)
Albert P Tobey created CASSANDRA-10249:
--

 Summary: Reduce over-read for standard disk io by 16x
 Key: CASSANDRA-10249
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
 Project: Cassandra
  Issue Type: Improvement
Reporter: Albert P Tobey
 Fix For: 2.1.x


On read workloads, Cassandra 2.1 reads drastically more data than it emits over 
the network. This causes problems throughout the system by wasting disk IO and 
causing unnecessary GC.

I have reproduced the issue on clusters and locally with a single instance. The 
only requirement to reproduce the issue is enough data to blow through the page 
cache. The default schema and data size with cassandra-stress is sufficient for 
exposing the issue.

With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 disk:network 
ratio. That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s 
of disk reads, saturating the drive.

After applying this patch https://gist.github.com/tobert/10c307cf3709a585a7cf 
the ratio fell to around 100:1 on my local test rig.

I tested with 512 byte reads as well, but got the same performance, which makes 
sense since all HDD and SSD made in the last few years have a 4K block size 
(many of them lie and say 512).
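
A rough way to reproduce this locally (a sketch; the exact counts and device depend on available RAM, and it assumes a single stock node plus the default cassandra-stress schema):

{code}
# load enough data to exceed the page cache, then read it back cold
cassandra-stress write n=50000000 -rate threads=200
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
cassandra-stress read n=10000000 -rate threads=200 &

# compare disk read MB/s against network send MB/s while the reads run
dstat -d -n 10
{code}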



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size

2015-09-01 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14726544#comment-14726544
 ] 

Albert P Tobey commented on CASSANDRA-8894:
---

Sorry I'm late to the thread, but the recently added patch seems very 
overcomplicated for little benefit.

I noticed the massive over-read in 2.1 and tracked it down independently by 
profiling with disk_access_mode: standard. I then ran a build with 
DEFAULT_BUFFER_SIZE at 4K and saw an instant 4x increase in TXN/s on a simple 
cassandra-stress test, along with a 60% reduction in wasted disk IO. This 
over-read is causing performance problems on every Cassandra 2.1 cluster that 
isn't 100% writes. It doesn't always show up because of the massive amount of 
RAM a lot of folks are running, but in low-memory situations it is killing even 
very fast SSDs.

Patch for 2.1: https://gist.github.com/tobert/10c307cf3709a585a7cf

In reading through the history, I think this is being overthought. If anything, 
the readahead and buffering in the read path should be *removed* and precise 
reads issued wherever possible. For now, the change to a 4K buffer size should 
be added to 2.1 in order to significantly speed up read workloads.
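
A quick way to watch the over-read at the block layer while a read-heavy stress run is going (a sketch; 'sdb' stands in for whatever device holds the data directory):

{code}
# avgrq-sz is reported in 512-byte sectors: ~128 sectors means 64KB reads;
# with a 4KB buffer it should fall toward ~8, give or take readahead.
iostat -x sdb 5
{code}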

> Our default buffer size for (uncompressed) buffered reads should be smaller, 
> and based on the expected record size
> --
>
> Key: CASSANDRA-8894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>  Labels: benedict-to-commit
> Fix For: 3.0 alpha 1
>
> Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml
>
>
> A large contributor to slower buffered reads than mmapped is likely that we 
> read a full 64Kb at once, when average record sizes may be as low as 140 
> bytes on our stress tests. The TLB has only 128 entries on a modern core, and 
> each read will touch 32 of these, meaning we are unlikely to almost ever be 
> hitting the TLB, and will be incurring at least 30 unnecessary misses each 
> time (as well as the other costs of larger than necessary accesses). When 
> working with an SSD there is little to no benefit reading more than 4Kb at 
> once, and in either case reading more data than we need is wasteful. So, I 
> propose selecting a buffer size that is the next larger power of 2 than our 
> average record size (with a minimum of 4Kb), so that we expect to read in one 
> operation. I also propose that we create a pool of these buffers up-front, 
> and that we ensure they are all exactly aligned to a virtual page, so that 
> the source and target operations each touch exactly one virtual page per 4Kb 
> of expected record size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10249) Reduce over-read for standard disk io by 16x

2015-09-01 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-10249:
---
Description: 
On read workloads, Cassandra 2.1 reads drastically more data than it emits over 
the network. This causes problems throughout the system by wasting disk IO and 
causing unnecessary GC.

I have reproduced the issue on clusters and locally with a single instance. The 
only requirement to reproduce the issue is enough data to blow through the page 
cache. The default schema and data size with cassandra-stress is sufficient for 
exposing the issue.

With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 disk:network 
ratio. That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s 
of disk reads, saturating the drive.

After applying this patch for standard IO mode 
https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
100:1 on my local test rig. Latency improved considerably and GC became a lot 
less frequent.

I tested with 512 byte reads as well, but got the same performance, which makes 
sense since all HDD and SSD made in the last few years have a 4K block size 
(many of them lie and say 512).

I'm re-running the numbers now and will post them tomorrow.

  was:
On read workloads, Cassandra 2.1 reads drastically more data than it emits over 
the network. This causes problems throughout the system by wasting disk IO and 
causing unnecessary GC.

I have reproduced the issue on clusters and locally with a single instance. The 
only requirement to reproduce the issue is enough data to blow through the page 
cache. The default schema and data size with cassandra-stress is sufficient for 
exposing the issue.

With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 disk:network 
ratio. That is to say, for 1MB/s of network IO, Cassandra was doing 300-500MB/s 
of disk reads, saturating the drive.

After applying this patch for standard IO mode 
https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
100:1 on my local test rig.

I tested with 512 byte reads as well, but got the same performance, which makes 
sense since all HDD and SSD made in the last few years have a 4K block size 
(many of them lie and say 512). Ideally, the reads in 


> Reduce over-read for standard disk io by 16x
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
> Fix For: 2.1.x
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress is 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from 300:1 to 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512 byte reads as well, but got the same performance, which 
> makes sense since all HDD and SSD made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9946) use ioprio_set on compaction threads by default instead of manually throttling

2015-08-03 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651488#comment-14651488
 ] 

Albert P Tobey edited comment on CASSANDRA-9946 at 8/3/15 6:22 AM:
---

Here's a script for pinning compaction to cores and ionice in one go: 
https://gist.github.com/tobert/97c52f80fdff2ba79ee9

Comment out the 'taskset' line to mess with ionice in isolation.

I agree with Ariel and usually advise people to always use the deadline io 
scheduler, but ...

FWIW I think it's possible to tune up CFQ to be acceptable. There isn't a lot 
of existing advice on the internet about how to do it, but it's doable. I've 
seen some references in various Red Hat low-latency guides but have yet to try 
it out. Even if many users choose deadline/noop for peak throughput, others may 
prefer the performance tradeoff of CFQ if there is a payback of more 
predictable/smooth performance.

That's not to mention the large number of setups that never tweak the disk 
scheduler at all. Setting compaction IO to idle class will benefit some folks 
and doesn't hurt those on noop/deadline.


was (Author: ato...@datastax.com):
Here's a script for pinning compaction to cores and ionice in one go: 
https://gist.github.com/tobert/97c52f80fdff2ba79ee9

Comment out the 'taskset' line to mess with ionice in isolation.

FWIW I think it's possible to tune up CFQ to be acceptable. There isn't a lot 
of existing advice on the internet about how to do it, but it's doable. I've 
seen some references in various Red Hat low-latency guides but have yet to try 
it out. Even if many users choose deadline/noop for peak throughput, others may 
prefer the performance tradeoff of CFQ if there is a payback of more 
predictable/smooth performance.

That's not to mention the large number of setups that never tweak the disk 
scheduler at all. Setting compaction IO to idle class will benefit some folks 
and doesn't hurt those on noop/deadline.

 use ioprio_set on compaction threads by default instead of manually throttling
 --

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are less application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9946) use ioprio_set on compaction threads by default instead of manually throttling

2015-08-03 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651488#comment-14651488
 ] 

Albert P Tobey commented on CASSANDRA-9946:
---

Here's a script for pinning compaction to cores and ionice in one go: 
https://gist.github.com/tobert/97c52f80fdff2ba79ee9

Comment out the 'taskset' line to mess with ionice in isolation.

FWIW I think it's possible to tune up CFQ to be acceptable. There isn't a lot 
of existing advice on the internet about how to do it, but it's doable. I've 
seen some references in various Red Hat low-latency guides but have yet to try 
it out. Even if many users choose deadline/noop for peak throughput, others may 
prefer the performance tradeoff of CFQ if there is a payback of more 
predictable/smooth performance.

That's not to mention the large number of setups that never tweak the disk 
scheduler at all. Setting compaction IO to idle class will benefit some folks 
and doesn't hurt those on noop/deadline.
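
The gist above is the real script; as a rough sketch of the idea (assumptions: Linux, a single Cassandra JVM, jstack available, and CompactionExecutor native thread ids taken from the nid field), it boils down to something like:

{code}
#!/usr/bin/env bash
# sketch only -- see the gist above for the real script
PID=$(pgrep -f CassandraDaemon | head -n1)

# map CompactionExecutor threads to native thread ids via jstack's nid=0x... field
for hex in $(jstack "$PID" | grep CompactionExecutor | grep -o 'nid=0x[0-9a-f]*' | cut -d= -f2); do
  tid=$((hex))
  taskset -cp 0,1 "$tid"   # pin the compaction thread to cores 0-1; comment out to test ionice alone
  ionice -c 3 -p "$tid"    # idle IO scheduling class (only meaningful with CFQ)
done
{code}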

 use ioprio_set on compaction threads by default instead of manually throttling
 --

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are less application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9946) use ioprio_set on compaction threads by default instead of manually throttling

2015-08-03 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651488#comment-14651488
 ] 

Albert P Tobey edited comment on CASSANDRA-9946 at 8/3/15 6:23 AM:
---

Here's a script for pinning compaction to cores and ionice in one go: 
https://gist.github.com/tobert/97c52f80fdff2ba79ee9

Comment out the 'taskset' line to mess with ionice in isolation.

I agree with Ariel and usually advise people to always use the deadline io 
scheduler, but ...

I think it's possible to tune up CFQ to be acceptable. There isn't a lot of 
existing advice on the internet about how to do it, but it's doable. I've seen 
some references in various Red Hat low-latency guides but have yet to try it 
out. Even if many users choose deadline/noop for peak throughput, others may 
prefer the performance tradeoff of CFQ if there is a payback of more 
predictable/smooth performance.

That's not to mention the large number of setups that never tweak the disk 
scheduler at all. Setting compaction IO to idle class will benefit some folks 
and doesn't hurt those on noop/deadline.


was (Author: ato...@datastax.com):
Here's a script for pinning compaction to cores and ionice in one go: 
https://gist.github.com/tobert/97c52f80fdff2ba79ee9

Comment out the 'taskset' line to mess with ionice in isolation.

I agree with Ariel and usually advise people to always use the deadline io 
scheduler, but ...

FWIW I think it's possible to tune up CFQ to be acceptable. There isn't a lot 
of existing advice on the internet about how to do it, but it's doable. I've 
seen some references in various Red Hat low-latency guides but have yet to try 
it out. Even if many users choose deadline/noop for peak throughput, others may 
prefer the performance tradeoff of CFQ if there is a payback of more 
predictable/smooth performance.

That's not to mention the large number of setups that never tweak the disk 
scheduler at all. Setting compaction IO to idle class will benefit some folks 
and doesn't hurt those on noop/deadline.

 use ioprio_set on compaction threads by default instead of manually throttling
 --

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are less application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9274) Changing memtable_flush_writes per recommendations in cassandra.yaml causes memtable_cleanup_threshold to be too small

2015-06-11 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581497#comment-14581497
 ] 

Albert P Tobey commented on CASSANDRA-9274:
---

I've been messing with these values lately and have observed some poor behavior 
around them. The comment in cassandra.yaml is misleading.

I agree we shouldn't set the threshold by default, but I would like to see a 
comment added to memtable_flush_writers noting that a large number of flush 
writers makes the default memtable_cleanup_threshold very small (e.g. 24 writers 
gives 1/(24+1) = 0.04), which leads to small flushes and more frequent compaction.

It makes some sense to set the memtable_cleanup_threshold based on the expected 
number of tables rather than cores or disks. I might be wrong, but that seems 
more relevant than the hardware or even the number of flush writer threads.

 Changing memtable_flush_writes per recommendations in cassandra.yaml causes  
 memtable_cleanup_threshold to be too small
 ---

 Key: CASSANDRA-9274
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9274
 Project: Cassandra
  Issue Type: Improvement
Reporter: Donald Smith
Priority: Minor

 It says in cassandra.yaml:
 {noformat}
 # If your data directories are backed by SSD, you should increase this
 # to the number of cores.
 #memtable_flush_writers: 8
 {noformat}
 so we raised it to 24.
 Much later we noticed a warning in the logs:
 {noformat}
 WARN  [main] 2015-04-22 15:32:58,619 DatabaseDescriptor.java:539 - 
 memtable_cleanup_threshold is set very low, which may cause performance 
 degradation
 {noformat}
 Looking at cassandra.yaml again I see:
 {noformat}
 # memtable_cleanup_threshold defaults to 1 / (memtable_flush_writers + 1)
 # memtable_cleanup_threshold: 0.11
 #memtable_cleanup_threshold: 0.11
 {noformat}
 So, I uncommented that last line (figuring that 0.11 is a reasonable value).
 Cassandra.yaml should give better guidance or the code should *prevent* the 
 value from going outside a reasonable range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2015-06-05 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575206#comment-14575206
 ] 

Albert P Tobey commented on CASSANDRA-8150:
---

I did some testing on EC2 with Cassandra 2.0 and G1GC and found the following 
settings to work well. Make sure to comment out the -Xmn line as shown.

{code:sh}

MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="2G" # placeholder, ignored

# setting -Xmn breaks G1GC, don't do it
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"

# G1GC support ato...@datastax.com 2015-04-03
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

# Cassandra does not benefit from biased locking
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"

# lowering the pause target will lower throughput
# 200ms is the default and lowest viable setting for G1GC
# 1000ms seems to provide a good balance of throughput and latency
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"

# auto-optimize thread local allocation block size
# https://blogs.oracle.com/jonthecollector/entry/the_real_thi
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"

{code}

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Ryan McGuire
 Attachments: upload.png


 It's been found that the old Twitter recommendations of 100m per core up to 
 800m are harmful and should no longer be used.
 Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 
 1/3 is probably better for releases greater than 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2015-06-05 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575206#comment-14575206
 ] 

Albert P Tobey edited comment on CASSANDRA-8150 at 6/5/15 8:56 PM:
---

I did some testing on EC2 with Cassandra 2.0 and G1GC and found the following 
settings to work well. Make sure to comment out the -Xmn line as shown.

{code}

MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="2G" # placeholder, ignored

# setting -Xmn breaks G1GC, don't do it
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"

# G1GC support ato...@datastax.com 2015-04-03
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

# Cassandra does not benefit from biased locking
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"

# lowering the pause target will lower throughput
# 200ms is the default and lowest viable setting for G1GC
# 1000ms seems to provide a good balance of throughput and latency
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"

# auto-optimize thread local allocation block size
# https://blogs.oracle.com/jonthecollector/entry/the_real_thi
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"

{code}


was (Author: ato...@datastax.com):
I did some testing on EC2 with Cassandra 2.0 and G1GC and found the following 
settings to work well. Make sure to comment out the -Xmn line as shown.

{code:sh}

MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="2G" # placeholder, ignored

# setting -Xmn breaks G1GC, don't do it
#JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"

# G1GC support ato...@datastax.com 2015-04-03
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

# Cassandra does not benefit from biased locking
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"

# lowering the pause target will lower throughput
# 200ms is the default and lowest viable setting for G1GC
# 1000ms seems to provide a good balance of throughput and latency
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=1000"

# auto-optimize thread local allocation block size
# https://blogs.oracle.com/jonthecollector/entry/the_real_thi
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"

{code}

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Ryan McGuire
 Attachments: upload.png


 It's been found that the old Twitter recommendations of 100m per core up to 
 800m are harmful and should no longer be used.
 Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 
 1/3 is probably better for releases greater than 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9517) Switch to DTCS for hint storage

2015-05-29 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14565433#comment-14565433
 ] 

Albert P Tobey commented on CASSANDRA-9517:
---

The big question is whether we can get away without the major compaction, and it 
sounds like that's not feasible at this point. A new, simple compaction strategy 
that does no merging and only cleans up tombstones might be the best bet in the 
short term.

 Switch to DTCS for hint storage
 ---

 Key: CASSANDRA-9517
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9517
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
 Fix For: 2.1.6


 The DateTieredCompactionStrategy is a good choice for HintedHandoff so that 
 we reduce the compaction load we incur when users build up hints.  
 [~ato...@datastax.com] and others have tried the following patch in various 
 setups and have seen significantly less load from hint compaction.
 https://gist.github.com/tobert/c069af27e3f8840d137d
 Setting the time window to 10 minutes has shown additional improvement.
 [~krummas] do you have any feedback about this idea and/or settings?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9517) Switch to DTCS for hint storage

2015-05-29 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14565469#comment-14565469
 ] 

Albert P Tobey commented on CASSANDRA-9517:
---

My original theory was that we could use DTCS for system.hints since it has a 
timeseries-like table definition and let it drop whole sstables when the TTLs 
expire. That was before I understood exactly how tombstones are used in hints. 
The patch seemed to help a little in testing, but I never figured out why.

The forced major compaction is most of the problem when hints build up, so 
that's the thing that needs to be removed if at all possible. Under 100% write 
workload on very fast machines I was seeing system.hints compactions in excess 
of 100GB, which has all kinds of negative side-effects.

If there's a way we can convince any of the compaction strategies to split the 
wide rows across sstables (split by time window) while only merging tombstones 
along with subsequent cleanup, that could make hints tolerable until 3.0 takes 
over the world.


 Switch to DTCS for hint storage
 ---

 Key: CASSANDRA-9517
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9517
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
 Fix For: 2.1.6


 The DateTieredCompactionStrategy is a good choice for HintedHandoff so that 
 we reduce the compaction load we incur when users build up hints.  
 [~ato...@datastax.com] and others have tried the following patch in various 
 setups and have seen significantly less load from hint compaction.
 https://gist.github.com/tobert/c069af27e3f8840d137d
 Setting the time window to 10 minutes has shown additional improvement.
 [~krummas] do you have any feedback about this idea and/or settings?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-28 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563219#comment-14563219
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

I tested a number of different pause targets on a wide variety of machines. 
While the 200ms default is often fine on big machines with real CPUs, in 
GHz-constrained environments like EC2 PVM or LV Xeons, throughput dropped 
considerably so that the GC could hit the pause target. I initially tested at 
1000ms and 2000ms but settled on 500ms because it provides most of the benefit 
of a more generous pause target while staying far enough below the current 
read/write timeouts in cassandra.yaml that pauses rarely, if ever, hit those 
limits.
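
For reference, a minimal sketch of that pause target as it would appear in cassandra-env.sh (assuming the G1 flags from the patches above are already in place):

{code}
# sketch only: the 500ms pause target discussed above, in cassandra-env.sh form
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
{code}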

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-27 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14561432#comment-14561432
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Updated patches with spelling and whitespace fixes:

https://github.com/tobert/cassandra/commits/g1gc-2

https://github.com/tobert/cassandra/commit/419d39814985a6ef165fdbafee5f1b84bf2f197b
https://github.com/tobert/cassandra/commit/89d40af978eaeb02185726a63257d979111ad317
https://github.com/tobert/cassandra/commit/0f70469985d62aeadc20b41dc9cdc9d72a035c64


 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-7486:
--
Comment: was deleted

(was: Yeah. I started on the PowerShell scripts but figured I should talk to 
someone more knowledgeable on Windows before making the change.

If you want a straight port I can throw that together and do a quick test on my 
local Windows machine.)

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560200#comment-14560200
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

https://github.com/tobert/cassandra/commit/0759be3b2a2a8ded0098622dcb95c0eb47d79fd3

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-05-26 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559669#comment-14559669
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

https://github.com/tobert/cassandra/tree/g1gc

https://github.com/tobert/cassandra/commit/33bf6719e0c8e84672c3633f8ecce602affc3071
https://github.com/tobert/cassandra/commit/cafee86c3c5798e423689a26b43d05ed9312adc5

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559851#comment-14559851
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Yeah. I started on the PowerShell scripts but figured I should talk to someone 
more knowledgeable on Windows before making the change.

If you want a straight port I can throw that together and do a quick test on my 
local Windows machine.

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559852#comment-14559852
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Yeah. I started on the PowerShell scripts but figured I should talk to someone 
more knowledgeable on Windows before making the change.

If you want a straight port I can throw that together and do a quick test on my 
local Windows machine.

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559853#comment-14559853
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Yeah. I started on the PowerShell scripts but figured I should talk to someone 
more knowledgeable on Windows before making the change.

If you want a straight port I can throw that together and do a quick test on my 
local Windows machine.

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-7486) Migrate to G1GC by default

2015-05-26 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey updated CASSANDRA-7486:
--
Comment: was deleted

(was: Yeah. I started on the PowerShell scripts but figured I should talk to 
someone more knowledgeable on Windows before making the change.

If you want a straight port I can throw that together and do a quick test on my 
local Windows machine.)

 Migrate to G1GC by default
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: New Feature
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-05-20 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553067#comment-14553067
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

I'll attach a patch ASAP.

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 3.0 beta 1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8651) Add support for running on Apache Mesos

2015-05-20 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553408#comment-14553408
 ] 

Albert P Tobey commented on CASSANDRA-8651:
---

The code is here: https://github.com/mesosphere/cassandra-mesos

 Add support for running on Apache Mesos
 ---

 Key: CASSANDRA-8651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8651
 Project: Cassandra
  Issue Type: Task
Reporter: Ben Whitehead
Priority: Minor
 Fix For: 3.x


 As a user of Apache Mesos I would like to be able to run Cassandra on my 
 Mesos cluster. This would entail integration of Cassandra on Mesos through 
 the creation of a production level Mesos framework. This would enable me to 
 avoid static partitioning and inefficiencies and run Cassandra as part of my 
 data center infrastructure.
 http://mesos.apache.org/documentation/latest/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-05-19 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550741#comment-14550741
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

Did you run into evacuation failures? How big was your heap? I haven't seen any 
evac failures with 2.1 and Java 8. This is one of the things that was worked on 
for Hotspot 1.8. Then again maybe it's Solr that needs the help.

I suspect you can remove a lot of these settings on Java 8, but have also 
discovered that setting the GC threads is necessary on many machines.

Try adding the below line for a nice decrease in p99 latencies.

JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
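
For the GC thread pinning mentioned above, a minimal sketch of the kind of 
cassandra-env.sh lines involved (the thread counts here are assumptions, sized 
roughly to core count, not values taken from this comment):

JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=8"
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=2"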

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.x


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-05-01 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523593#comment-14523593
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

if I am reading correctly there was pretty much never an old generation collection 
under the workload I looked at. The old gen was growing but never reached the 
point it needed to do an old gen GC.

^ G1 doesn't work that way.

Another behavior to consider is worst case pause time when there is 
fragmentation.

^ G1 performs compaction. It's fairly easy to trigger and observe in gc.log 
with Cassandra 2.0. It takes more work with 2.1 since it seems to be easier on 
the GC.

I'll see if I can find some time to generate graphs to make all this more 
convincing, but time is short because I'm spending all of my time tuning users' 
clusters, where the #1 issue every time is getting CMS to behave.

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.x


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-05-01 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523724#comment-14523724
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

I only started messing with G1 this year, so I only know the old behavior by 
lore I've read and heard. I have not observed significant problems with it in 
the ~20-40 hours I've spent tuning clusters with G1 recently.

I don't recommend anyone try G1 on JDK 7 < u75 or JDK 8 < u40 (although it's 
probably OK down to u20 according to the docs I've read). I did some testing on 
JDK7u75 and it was stable but didn't spend much time on it since JDK8u40 gave a 
nice bump in performance (5-10% on a customer cluster) by just switching JDKs 
and nothing else.

From what I've read about the reference clearing issues, there is a new-ish 
setting to enable parallel reference collection, -XX:+ParallelRefProcEnabled. 
The advice in the docs is to only turn it on if a significant amount of time 
is spent on RefProc collection, e.g. [Ref Proc: 5.2 ms]. I pulled that 
from a log I had handy and that is high enough that we might want to consider 
enabling the flag, but in most of my observations it hovers around 0.1ms under 
saturation load.
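
A minimal sketch of pulling those Ref Proc figures out of a G1 gc.log to decide 
whether the flag is worth enabling (log path and exact log format are assumptions):

grep -o '\[Ref Proc: [0-9.]* ms\]' /var/log/cassandra/gc.log | sort -t' ' -k3 -n | tail -5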


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.x


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-29 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519791#comment-14519791
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

[~aweisberg] comparing promotions between G1 and CMS doesn't really make sense 
IMO. G1 promotions simply mark memory where it is without copying. After a 
threshold it will compact surviving cards into a single region. What I've 
observed is that compaction is rarely necessary with a big enough G1 heap. With 
a saturation write workload on Cassandra 2.1 only ~100-200MB seems to stick 
around for the long haul with almost all the rest getting cycled every few 
minutes (in an 8GB heap).

[~yangzhe1991] I would keep the default heap at 8GB. I have tested with G1 at 
16GB on a 30GB m3.2xlarge on EC2 and it generally gets better throughput and 
latency because there's more space for G1 to "waste" (that's what they call 
it). Intel tested up to 100GB with HBase at a 200ms pause target and said nice 
things about it. I don't see much need for C* to hit that size but it's 
certainly doable with G1. The main problem is smaller heaps where G1 starts to 
struggle a little, but I found that it still works OK down to 512MB, even if a 
bit less efficient than CMS since G1 targets ~10% CPU time for GC while the 
others target 1% by default.

Throughput / latency is always a tradeoff and in the case of G1 with 
non-aggressive latency targets (-XX:MaxGCPauseMillis=2000) the throughput is 
darn close to CMS with considerably improved standard deviation on latency. IMO 
that's a great tradeoff, as most of the users I talk to in the wild struggle 
with getting reliable latency rather than throughput.

IMO consistent performance should always take precedence over maximum 
performance/throughput. G1 provides a much more consistent experience with 
fewer knobs to mess with (especially tuning eden size, which is still a black 
art that nearly every installation I've looked at gets wrong).
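
To make the contrast concrete, a hedged sketch of the two styles as 
cassandra-env.sh lines (values are illustrative only, not recommendations):

# CMS: eden and tenuring are typically pinned by hand
JVM_OPTS="$JVM_OPTS -Xmn800M -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1"
# G1: sizing is left to the collector's heuristics, steered only by a pause target
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=500"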

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.x


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries

2015-04-20 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503111#comment-14503111
 ] 

Albert P Tobey commented on CASSANDRA-9193:
---

Maybe just steal this? 
https://github.com/datastax/nodejs-driver/blob/master/lib/tokenizer.js

 Facility to write dynamic code to selectively trigger trace or log for queries
 --

 Key: CASSANDRA-9193
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193
 Project: Cassandra
  Issue Type: New Feature
Reporter: Matt Stump

 I want the equivalent of dtrace for Cassandra. I want the ability to 
 intercept a query with a dynamic script (assume JS) and based on logic in 
 that script trigger the statement for trace or logging. 
 Examples 
 - Trace only INSERT statements to a particular CF. 
 - Trace statements for a particular partition or consistency level.
 - Log statements that fail to reach the desired consistency for read or write.
 - Log If the request size for read or write exceeds some threshold
 At some point in the future it would be helpful to also do things such as log 
 partitions greater than X bytes or Z cells when performing compaction. 
 Essentially be able to inject custom code dynamically without a reboot to the 
 different stages of C*. 
 The code should be executed synchronously as part of the monitored task, but 
 we should provide the ability to log or execute CQL asynchronously from the 
 provided API.
 Further down the line we could use this functionality to modify/rewrite 
 requests or tasks dynamically.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-16 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497603#comment-14497603
 ] 

Albert P Tobey edited comment on CASSANDRA-7486 at 4/16/15 6:05 AM:


My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. The CPUs are fairly slow 1.3GHz i5-4250U parts. 
Cassandra 2.1.4 / Oracle JDK 8u40 / CoreOS 647.0.0 / Linux 3.19.3 (bare metal - 
no container). The tests were automated with a complete cluster rebuild between 
tests and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range. The three stress nodes I had available are all quad-cores: 
i7-2600/3.4Ghz/8GB, Xeon-E31270/3.4Ghz/16GB, i5-4250U/1.3Ghz/16GB.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)



was (Author: ato...@datastax.com):
My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. The CPUs are fairly slow at 1.4Ghz. Cassandra 
2.1.4 / Oracle JDK 8u40 / CoreOS 647.0.0 / Linux 3.19.3 (bare metal - no 
container). The tests were automated with a complete cluster rebuild between 
tests and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-16 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497603#comment-14497603
 ] 

Albert P Tobey edited comment on CASSANDRA-7486 at 4/16/15 6:11 AM:


My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. The CPUs are fairly slow 1.3GHz i5-4250U parts. 
Cassandra 2.1.4 / Oracle JDK 8u40 / CoreOS 647.0.0 / Linux 3.19.3 (bare metal - 
no container). The tests were automated with a complete cluster rebuild between 
tests and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range. The three stress nodes I had available are all quad-cores: 
i7-2600/3.4Ghz/8GB, Xeon-E31270/3.4Ghz/16GB, i5-4250U/1.3Ghz/16GB.

These were saturation tests. In all but the G1 @ 256MB test the stress runs 
were stable and the systems' CPUs were at 100% pretty much the whole time. The 
numbers smooth out a lot for all of the combinations of GC settings at more 
pedestrian throughput. I will kick that off when I get a chance, which will be 
~2 weeks from now.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)



was (Author: ato...@datastax.com):
My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. The CPUs are fairly slow 1.3GHz i5-4250U parts. 
Cassandra 2.1.4 / Oracle JDK 8u40 / CoreOS 647.0.0 / Linux 3.19.3 (bare metal - 
no container). The tests were automated with a complete cluster rebuild between 
tests and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range. The three stress nodes I had available are all quad-cores: 
i7-2600/3.4Ghz/8GB, Xeon-E31270/3.4Ghz/16GB, i5-4250U/1.3Ghz/16GB.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-15 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497603#comment-14497603
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. Cassandra 2.1.4. The CPUs are fairly slow at 
1.4Ghz. The tests were automated with a complete cluster rebuild between tests 
and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-15 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497603#comment-14497603
 ] 

Albert P Tobey edited comment on CASSANDRA-7486 at 4/16/15 5:58 AM:


My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. The CPUs are fairly slow at 1.4Ghz. Cassandra 
2.1.4 / Oracle JDK 8u40 / CoreOS 647.0.0 / Linux 3.19.3 (bare metal - no 
container). The tests were automated with a complete cluster rebuild between 
tests and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)



was (Author: ato...@datastax.com):
My benchmarks completed. These were run on 6 quad-core Intel NUCs with 16GB RAM 
/ 240GB SSD / gigabit ethernet. Cassandra 2.1.4. The CPUs are fairly slow at 
1.4Ghz. The tests were automated with a complete cluster rebuild between tests 
and caches dropped before starting Cassandra each time.

The big win with G1 IMO is that it is auto-tuning. I've been running it on a 
few other kinds of machines and it generally does much better with more CPU 
power.

cassandra-stress was run with an increased heap but is otherwise unmodified 
from Cassandra 2.1.4. I checked the gc log regularly and did not see many 
pauses for stress itself above 1ms here and there, with most pauses in the 
~300usec range.

The final output of the stress is available here:

https://docs.google.com/a/datastax.com/spreadsheets/d/19Eb7HGkd5rFUD_C0ZALbK6-R4fPF9vJRr8BrvxBwo38/edit?usp=sharing
http://tobert.org/downloads/cassandra-2.1-cms-vs-g1.csv

The stress commands, system.log, GC logs, conf directory from all the servers, 
and full stress logs are available on my webserver here:

http://tobert.org/downloads/cassandra-2.1-cms-vs-g1-data.tar.gz (35MB)


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries

2015-04-14 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495157#comment-14495157
 ] 

Albert P Tobey commented on CASSANDRA-9193:
---

Javascript makes sense since Nashorn ships with Java 8. No dependencies to add 
to C*.

Looks like you can get at a lot of the stuff we'd need as soon as a REPL or 
some way to run scripts is available:
http://moduscreate.com/javascript-and-the-jvm/
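
A quick, hedged way to confirm Nashorn is available without adding anything to 
the classpath (jjs ships alongside java in JDK 8; the script is just a smoke test):

cat > nashorn_check.js <<'EOF'
// prints the JVM version via Nashorn's built-in Java interop
print('Nashorn on ' + java.lang.System.getProperty('java.version'));
EOF
jjs nashorn_check.js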

 Facility to write dynamic code to selectively trigger trace or log for queries
 --

 Key: CASSANDRA-9193
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193
 Project: Cassandra
  Issue Type: New Feature
Reporter: Matt Stump

 I want the equivalent of dtrace for Cassandra. I want the ability to 
 intercept a query with a dynamic script (assume JS) and based on logic in 
 that script trigger the statement for trace or logging. 
 Examples 
 - Trace only INSERT statements to a particular CF. 
 - Trace statements for a particular partition or consistency level.
 - Log statements that fail to reach the desired consistency for read or write.
 - Log If the request size for read or write exceeds some threshold
 At some point in the future it would be helpful to also do things such as log 
 partitions greater than X bytes or Z cells when performing compaction. 
 Essentially be able to inject custom code dynamically without a reboot to the 
 different stages of C*. 
 The code should be executed synchronously as part of the monitored task, but 
 we should provide the ability to log or execute CQL asynchronously from the 
 provided API.
 Further down the line we could use this functionality to modify/rewrite 
 requests or tasks dynamically.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-08 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485579#comment-14485579
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

This is with 2.0 / OpenJDK8 since that's what I had running. Same everything 
each run except for heap size. cassandra-stress 2.1.4 read workload / 800 
threads. I'll re-run with 2.1 / Oracle JDK8 and some mixed load.
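
As a rough sketch, the invocation would look something like this (operation 
count and node list are placeholders, not the exact command used):

tools/bin/cassandra-stress read n=10000000 -rate threads=800 -node 10.0.0.1,10.0.0.2,10.0.0.3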

-XX:+UseG1GC

Also: -XX:+UseTLAB -XX:+ResizeTLAB -XX:-UseBiasedLocking -XX:+AlwaysPreTouch, 
but maybe those should go in a different ticket.
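
Expressed as cassandra-env.sh lines, the combination looks roughly like this 
(illustrative only; heap sizes varied per run as listed below):

JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking -XX:+AlwaysPreTouch"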

8GB:

op rate   : 139805
partition rate: 139805
row rate  : 139805
latency mean  : 5.7
latency median: 4.2
latency 95th percentile   : 13.2
latency 99th percentile   : 18.5
latency 99.9th percentile : 21.1
latency max   : 303.8

512MB:

op rate   : 114214
partition rate: 114214
row rate  : 114214
latency mean  : 7.0
latency median: 3.7
latency 95th percentile   : 12.4
latency 99th percentile   : 14.7
latency 99.9th percentile : 15.3
latency max   : 307.1

256MB:

op rate   : 60028
partition rate: 60028
row rate  : 60028
latency mean  : 13.3
latency median: 4.0
latency 95th percentile   : 44.7
latency 99th percentile   : 73.5
latency 99.9th percentile : 79.6
latency max   : 1105.4

Same everything with mostly stock CMS settings for 2.0. I added the 
-XX:+UseTLAB -XX:+ResizeTLAB -XX:-UseBiasedLocking -XX:+AlwaysPreTouch 
settings to keep the numbers comparable to all of my other data.

8GB/1GB:

op rate   : 119155
partition rate: 119155
row rate  : 119155
latency mean  : 6.7
latency median: 4.1
latency 95th percentile   : 11.8
latency 99th percentile   : 15.5
latency 99.9th percentile : 17.3
latency max   : 520.2


512MB (with -XX:+UseAdaptiveSizePolicy):

op rate   : 82375
partition rate: 82375
row rate  : 82375
latency mean  : 9.7
latency median: 4.3
latency 95th percentile   : 28.2
latency 99th percentile   : 49.4
latency 99.9th percentile : 54.8
latency max   : 2642.6


256MB (with -XX:+UseAdaptiveSizePolicy):

op rate   : 77705
partition rate: 77705
row rate  : 77705
latency mean  : 10.3
latency median: 4.8
latency 95th percentile   : 33.6
latency 99th percentile   : 45.3
latency 99.9th percentile : 49.1
latency max   : 1990.0


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-07 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484318#comment-14484318
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

So far my testing of read workloads matches my experience with writes. An 8GB 
heap with generic G1GC settings is good for more workloads out of the box 
than haphazardly tuned CMS can be. I've been testing on a mix of Oracle/OpenJDK 
and JDK7/8 and the results are fairly consistent across the board with the 
exception that performance is a tad higher (~5%) on JDK8 than JDK7 (with G1GC - 
I have not tested CMS much on JDK8).

These parameters get better throughput than CMS out of the box with 
significantly improved consistency in the max and p99.9 latency.

-Xmx8G -Xms8G -XX:+UseG1GC

If throughput is more critical than latency, the following will get a few % 
more throughput at the cost of potentially higher max pause times:

-Xmx8G -Xms8G -XX:+UseG1GC -XX:MaxGCPauseMillis=2000 
-XX:InitiatingHeapOccupancyPercent=75

My recommendation is to document the last two options in cassandra-env.sh but 
leave them disabled/commented out for end-users to fiddle with. Other knobs for 
G1 didn't make a statistically measurable difference in my observations.
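
A hedged sketch of how that might look in cassandra-env.sh, with the optional 
throughput-oriented flags shipped commented out (exact placement and wording 
would be up to the patch):

JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
#JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=2000"
#JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=75"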

G1 scales particularly well with heap size on huge machines. 8 to 16GB doesn't 
seem to make a big difference, matching what [~rbranson] saw. At 24GB I started 
seeing about 8-10% throughput increase with little variance in pause times.

IMO the simple G1 configuration should be the default for large heaps. It's 
simple and provides consistent latency. Because it uses heuristics to determine 
the eden size and scanning schedule, it adapts well to diverse environments 
without tweaking. Heap sizes under 8GB should continue to use CMS or even 
experiment with serial collectors (e.g. Raspberry Pi, t2.micro, vagrant). If 
there is interest, I will write up a patch for cassandra-env.sh to make the 
auto-detection code pick G1GC at >= 6GB heap and CMS for < 6GB.
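
A minimal sketch of what that auto-detection could look like, assuming the 
existing MAX_HEAP_SIZE convention in cassandra-env.sh (the 6GB threshold is from 
above; the awk parsing and CMS fallback flags are illustrative):

heap_mb=$(echo "${MAX_HEAP_SIZE:-8G}" | awk '/G$/ {print int($0) * 1024} /M$/ {print int($0)}')
if [ "$heap_mb" -ge 6144 ]; then
    JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
else
    JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
fi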

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-04-07 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484657#comment-14484657
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

I'll kick off some tests and find out. All of the Oracle docs say not to bother 
below 6GB, but yeah I agree, if it's basically not bad we should go with simple.

 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.5


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2015-04-04 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395572#comment-14395572
 ] 

Albert P Tobey commented on CASSANDRA-8150:
---

It appears that -XX:+UseGCTaskAffinity is a no-op in HotSpot.

https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Ryan McGuire
 Attachments: upload.png


 It's been found that the old twitter recommendations of 100m per core up to 
 800m is harmful and should no longer be used.
 Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 
 1/3 is probably better for releases greater than 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7486) Compare CMS and G1 pause times

2015-03-31 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389850#comment-14389850
 ] 

Albert P Tobey commented on CASSANDRA-7486:
---

I managed to get G1 (Java 8) to beat CMS on both latency and throughput on my 
NUC cluster.

Preliminary results: https://gist.github.com/tobert/ea9328e4873441c7fc34



 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.4


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8873) Add PropertySeedProvider

2015-02-26 Thread Albert P Tobey (JIRA)
Albert P Tobey created CASSANDRA-8873:
-

 Summary: Add PropertySeedProvider
 Key: CASSANDRA-8873
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8873
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Albert P Tobey
Priority: Minor
 Attachments: PropertySeedProvider.java

Add a PropertySeedProvider that allows administrators to set a seed on the 
command line with -Dcassandra.seeds=127.0.0.1,127.0.0.2 instead of rewriting 
cassandra.yaml.

It looks like the yaml parser expects there to always be a parameters: option 
on seeds, so unless we change it to be optional, there needs to be a dummy map 
or the yaml will not parse, e.g.

seed_provider:
- class_name: org.apache.cassandra.locator.PropertySeedProvider
  parameters:
  - stub: this is required for the yaml parser and is ignored
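
A hedged example of launching with the property (addresses are placeholders; this 
assumes the provider reads the cassandra.seeds system property described above):

bin/cassandra -f -Dcassandra.seeds=10.0.0.1,10.0.0.2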



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-8651) Add support for running on Apache Mesos

2015-01-23 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey reassigned CASSANDRA-8651:
-

Assignee: Albert P Tobey

 Add support for running on Apache Mesos
 ---

 Key: CASSANDRA-8651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8651
 Project: Cassandra
  Issue Type: Task
Reporter: Ben Whitehead
Assignee: Albert P Tobey
Priority: Minor
 Fix For: 3.0


 As a user of Apache Mesos I would like to be able to run Cassandra on my 
 Mesos cluster. This would entail integration of Cassandra on Mesos through 
 the creation of a production level Mesos framework. This would enable me to 
 avoid static partitioning and inefficiencies and run Cassandra as part of my 
 data center infrastructure.
 http://mesos.apache.org/documentation/latest/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8651) Add support for running on Apache Mesos

2015-01-21 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286450#comment-14286450
 ] 

Albert P Tobey commented on CASSANDRA-8651:
---

The folks at Mesosphere are working on building an executor for Mesos and are 
hoping to upstream any components that make sense to live in the Cassandra 
tree.  It sounds like there could be a custom MesosSeedProvider. There's also a 
question of whether or not it makes sense to have the executor code live in the 
Cassandra tree. I think that will be easier to answer once it exists.

For now, I don't think they need anything from the Cassandra developers. This 
ticket exists to make the work visible to the community.

 Add support for running on Apache Mesos
 ---

 Key: CASSANDRA-8651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8651
 Project: Cassandra
  Issue Type: Task
Reporter: Ben Whitehead
Priority: Minor
 Fix For: 3.0


 As a user of Apache Mesos I would like to be able to run Cassandra on my 
 Mesos cluster. This would entail integration of Cassandra on Mesos through 
 the creation of a production level Mesos framework. This would enable me to 
 avoid static partitioning and inefficiencies and run Cassandra as part of my 
 data center infrastructure.
 http://mesos.apache.org/documentation/latest/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8494) incremental bootstrap

2014-12-16 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249103#comment-14249103
 ] 

Albert P Tobey commented on CASSANDRA-8494:
---

Neat idea. I think this would make a lot of sense to operators and provide 
visibility into the rebuild process that's easy to understand (how many tokens 
are complete?).

Many of the customers I've talked to in the last few months will be very 
excited about this. In one case, they want to attach ~70TB of very fast SSD. I 
explained everything to them, but they're still going to try.

Another client has more than 100 remote sites that store time-series data. They 
want to store 10-15TB per node on 15K SAS RAID10. It's the gear they can get 
and they have limited ability to control power drops etc. in the remote sites, 
so density is really important to them.

My former employer was trying to run 8 x 3TB SATA. No matter how hard we fought 
for the right drives, the incentives from the HW vendors etc. drove them to buy 
the big SATA drives.

I think ops folks will like this and there's an opportunity to use this feature 
to improve the UX of bootstrap (by using token ranges to improve feedback to 
ops).

 incremental bootstrap
 -

 Key: CASSANDRA-8494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8494
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jon Haddad
Priority: Minor
  Labels: density

 Current bootstrapping involves (to my knowledge) picking tokens and streaming 
 data before the node is available for requests.  This can be problematic with 
 fat nodes, since it may require 20TB of data to be streamed over before the 
 machine can be useful.  This can result in a massive window of time before 
 the machine can do anything useful.
 As a potential approach to mitigate the huge window of time before a node is 
 available, I suggest modifying the bootstrap process to only acquire a single 
 initial token before being marked UP.  This would likely be a configuration 
 parameter incremental_bootstrap or something similar.
 After the node is bootstrapped with this one token, it could go into UP 
 state, and could then acquire additional tokens (one or a handful at a time), 
 which would be streamed over while the node is active and serving requests.  
 The benefit here is that with the default 256 tokens a node could become an 
 active part of the cluster with less than 1% of its final data streamed over.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6246) EPaxos

2014-09-27 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14150847#comment-14150847
 ] 

Albert P Tobey commented on CASSANDRA-6246:
---

For backwards compatibility, if it's possible to run both protocols, make it a 
configuration in the yaml. Another rolling restart to disable hybrid/dual mode 
isn't so bad if it removes a lot of complexity from runtime. Would also make it 
easy for conservative users to stick with the old paxos.

 EPaxos
 --

 Key: CASSANDRA-6246
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Blake Eggleston
Priority: Minor

 One reason we haven't optimized our Paxos implementation with Multi-paxos is 
 that Multi-paxos requires leader election and hence, a period of 
 unavailability when the leader dies.
 EPaxos is a Paxos variant that requires (1) less messages than multi-paxos, 
 (2) is particularly useful across multiple datacenters, and (3) allows any 
 node to act as coordinator: 
 http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
 However, there is substantial additional complexity involved if we choose to 
 implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7136) Change default paths to ~ instead of /var

2014-06-04 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018288#comment-14018288
 ] 

Albert P Tobey commented on CASSANDRA-7136:
---

[~thobbs] probably not. For some reason I had it in my head that this was for 
3.0 so it was at the bottom of my queue.

 Change default paths to ~ instead of /var
 -

 Key: CASSANDRA-7136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7136
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 2.1.0


 Defaulting to /var makes it more difficult for both multi-user systems and 
 people unfamiliar with the command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7306) Support edge dcs with more flexible gossip

2014-05-27 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010205#comment-14010205
 ] 

Albert P Tobey commented on CASSANDRA-7306:
---

One real use case is branch locations with local clusters that get replicated 
to a central datacenter for analytics. The central cluster has no authority to 
open ports or create VPNs in the plants, but it can open ports on the inbound 
side. In this situation, the easiest thing to do is to open the inbound ports 
to the central cluster and use TLS. The spokes obviously cannot communicate 
with each other, but they can push data to the hub. This kind of scenario is 
common in retail and manufacturing. Basically, it's useful anywhere there is 
hub-and-spoke topology where bidirectional communication is 
impossible/intermittent.

Another common problem is NAT traversal where VPN is not available. If there is 
no requirement for bi-directional replication, it gets a lot easier to deal 
with NAT since the spoke/leaf clusters can connect outbound through NAT into a 
centralized cluster. Generating all the firewall rules for such a setup is a 
lot of work and prone to error. If only one side needs to modify firewall 
policy, it's a lot easier to get right and troubleshoot.

 Support edge dcs with more flexible gossip
 

 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
  Labels: ponies

 As Cassandra clusters get bigger and bigger, and their topology becomes more 
 complex, there is more and more need for a notion of hub and spoke 
 datacenters.
 One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
 is the assumption that all dcs need to talk to each other (and be connected 
 all the time).
 This ticket is a vague placeholder with the goals of achieving:
 1) better behavioral support for occasionally disconnected datacenters
 2) explicit support for custom dc to dc routing. A simple approach would be 
 an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7306) Support edge dcs with more flexible gossip

2014-05-27 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010210#comment-14010210
 ] 

Albert P Tobey commented on CASSANDRA-7306:
---

Another angle: small edge clusters for low latency writes in many regions that 
push to a central warehouse for analytics. Think GSLB -> HTTP -> Cassandra all 
over the world with regional clusters replicated to us-west-2 where the data is 
crunched with Spark or Hadoop. This is basically the MySQL read-replica flipped 
on its head with 0-read write-replicas going into a read-heavy warehouse. The 
central cluster could be 100's of nodes while edge clusters are in the 5-10 
range.

 Support edge dcs with more flexible gossip
 

 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
  Labels: ponies

 As Cassandra clusters get bigger and bigger, and their topology becomes more 
 complex, there is more and more need for a notion of hub and spoke 
 datacenters.
 One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
 is the assumption that all dcs need to talk to each other (and be connected 
 all the time).
 This ticket is a vague placeholder with the goals of achieving:
 1) better behavioral support for occasionally disconnected datacenters
 2) explicit support for custom dc to dc routing. A simple approach would be 
 an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7136) Change default paths to ~ instead of /var

2014-05-02 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988153#comment-13988153
 ] 

Albert P Tobey commented on CASSANDRA-7136:
---

Agreed on $CASSANDRA_HOME/data.

Having slept on it, I don't think the defaults in cassandra.yaml should change. 
It should always reflect sane defaults for *production* use. What we're talking 
about here is non-production, so it gets the short straw and will do the yaml 
mangling.

I'm leaning towards adding a separate launcher script as well, something like 
./run_ephemeral.sh or some other name with obvious meaning. Various 
installations and testing packages have come to expect consistent behavior from 
the current setup and there's no good reason to change those if we can simply 
add another script.

 Change default paths to ~ instead of /var
 -

 Key: CASSANDRA-7136
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7136
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Albert P Tobey
 Fix For: 2.1.0


 Defaulting to /var makes it more difficult for both multi-user systems and 
 people unfamiliar with the command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6487) Log WARN on large batch sizes

2013-12-13 Thread Albert P Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848099#comment-13848099
 ] 

Albert P Tobey commented on CASSANDRA-6487:
---

If it's not out of the way, it would help to include the keyspace and column 
family and maybe the session ID/info.

 Log WARN on large batch sizes
 -

 Key: CASSANDRA-6487
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487
 Project: Cassandra
  Issue Type: Improvement
Reporter: Patrick McFadin
Priority: Minor

 Large batches on a coordinator can cause a lot of node stress. I propose 
 adding a WARN log entry if batch sizes go beyond a configurable size. This 
 will give more visibility to operators on something that can happen on the 
 developer side. 
 New yaml setting with 5k default.
 # Log WARN on any batch size exceeding this value. 5k by default.
 # Caution should be taken on increasing the size of this threshold as it can 
 lead to node instability.
 batch_size_warn_threshold: 5k



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)