[ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877404#comment-14877404
 ] 

Benedict commented on CASSANDRA-7486:
-------------------------------------

bq. Is the picture equally bleak at RF=3?

[Regrettably so|http://cstar.datastax.com/graph?stats=a1eee43a-5f32-11e5-88b3-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=112.2&ymin=0&ymax=163663.5]

bq. Do the "2.2 GC" settings include anything other than the defaults from 
cassandra-env.sh? "ps -efw" output is sufficient.

I haven't double checked; I simply copied [~tjake]'s branch and rebased it onto 
the latest 3.0. It looks like it's just the 2.2 defaults.

bq. I'd be happy to take a look at the GC logs if they are available.

The thing is, as I say, the _GC_ burden is pretty consistently lower. However, 
the application performance is also worse, which indicates the problem isn't 
the collections themselves but the VM behavioural changes required to enable 
G1GC. So analyzing GC logs is unlikely to deliver much, and figuring out how to 
modify the application to reduce the burden here is unlikely to be a short task 
(if it is achievable at all).

bq. This is not in debate.

I'm afraid nothing is not in debate in this world :)

If you mean to say "CMS will *not* scale with _increasingly gigantic heap 
sizes_" then we would probably be in agreement. With smallish heaps, however, 
CMS works just fine - better, even. If the mid-to-long term goal of Cassandra 
is to have a constant heap burden, i.e. to decouple heap requirements from 
dataset size, then it doesn't follow that increasing hardware capabilities 
require G1GC. There are lots of reasons why this _should_ be our goal, and my 
understanding is that there is a general consensus on that, but that's a 
separate debate. 

Certainly we need to do more research, but I will prognosticate briefly: I 
suspect we will find that with very large heaps (16GB+) and with lots of 
headroom, G1GC begins to outperform CMS, especially wrt the most critical of 
metrics, the 99.9th percentile. However, I suspect we will find CMS continues 
to dominate in domains where it can maintain sufficiently low pause times. 

Since many users target the more modest heap sizes, we may find that it makes 
most sense to provide two default configurations, and have the user opt into 
our "default" G1GC settings if they intend to run with a very large heap. If, 
after extensive research, we find that we can confidently predict the configs 
where G1GC makes more sense, we should consider selecting it automatically in 
cassandra-env.
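
To make the automatic option concrete, here is a minimal sketch of what such a switch might look like in shell. It is illustrative only: the function name {{choose_gc_flags}}, the variables {{G1_HEAP_THRESHOLD_MB}} and {{MAX_HEAP_MB}}, and the 16384 MB cut-over are all assumptions (the threshold is simply my 16GB guess above, not a measured value), and real defaults would carry further tuning flags for each collector.

```shell
#!/bin/sh
# Hypothetical helper, not part of cassandra-env.sh.
# The threshold is an assumption taken from the "16GB+" guess in this comment;
# it has not been validated by benchmarks.
G1_HEAP_THRESHOLD_MB="${G1_HEAP_THRESHOLD_MB:-16384}"

# Print the collector flags for a given max heap size in MB.
choose_gc_flags() {
    heap_mb="$1"
    if [ "$heap_mb" -ge "$G1_HEAP_THRESHOLD_MB" ]; then
        # Very large heap: opt into G1.
        printf '%s\n' "-XX:+UseG1GC"
    else
        # Modest heap: stay on CMS with ParNew for the young generation.
        printf '%s\n' "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
    fi
}

# Example wiring (MAX_HEAP_MB is an assumed variable holding the heap size in MB):
# JVM_OPTS="$JVM_OPTS $(choose_gc_flags "$MAX_HEAP_MB")"
```

Keeping the threshold overridable through the environment would let users opt in or out without editing the script, which matches the "two defaults plus documentation" approach until the research is done.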

My suspicion is we won't manage to do this research in time for GA, but that 
doesn't stop us providing the parallel defaults and documentation to make it 
easy for users to enable it.

> Migrate to G1GC by default
> --------------------------
>
>                 Key: CASSANDRA-7486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Config
>            Reporter: Jonathan Ellis
>            Assignee: Albert P Tobey
>             Fix For: 3.0 alpha 1
>
>
> See 
> http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
>  and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
> Suspect this will help G1 even more than CMS.  (NB this is off by default but 
> needs to be part of the test.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
