[ https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877404#comment-14877404 ]
Benedict commented on CASSANDRA-7486:
-------------------------------------

bq. Is the picture equally bleak at RF=3?

[Regrettably so|http://cstar.datastax.com/graph?stats=a1eee43a-5f32-11e5-88b3-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=112.2&ymin=0&ymax=163663.5]

bq. Do the "2.2 GC" settings include anything other than the defaults from cassandra-env.sh? "ps -efw" output is sufficient.

I haven't double-checked; I simply copied [~tjake]'s branch and rebased to latest 3.0. It looks like it's just 2.2 defaults.

bq. I'd be happy to take a look at the GC logs if they are available.

The thing is, as I say, the _GC_ burden is pretty consistently lower. However, the application performance is also worse, which indicates the problem isn't the collections but the VM behavioural changes required to enable G1GC. So analyzing GC logs is unlikely to deliver much, and figuring out how to modify the application to reduce the burden here is unlikely to be a short task (if achievable).

bq. This is not in debate.

I'm afraid nothing is not in debate in this world :)

If you mean to say "CMS will *not* scale with _increasingly gigantic heap sizes_" then we would probably be in agreement; however, with smallish heaps CMS works just fine - better, even. If the mid-to-long term goal of Cassandra is to have a constant heap burden, i.e. to decouple heap requirements from dataset size, then it doesn't follow that increasing hardware capabilities requires G1GC. There are lots of reasons why this _should_ be our goal, and my understanding is there is a general consensus on that, but that's a separate debate.

Certainly we need to do more research, but I will prognosticate briefly: I suspect we will find that with very large heaps (16GB+) and with lots of headroom G1GC begins to outperform CMS, especially wrt the most critical of metrics, 99.9%ile.
However I suspect we will find CMS continues to dominate in domains where it can maintain sufficiently low pause times. Since many users target the more modest heap sizes, we may find that it makes most sense to provide two default configurations, and have the user opt into our "default" G1GC settings if they intend to run with a very large heap.

If, after extensive research, we find that we can confidently predict configs where it makes more sense, we should consider doing this automatically in cassandra-env. My suspicion is we won't manage to do this research in time for GA, but that doesn't stop us providing the parallel defaults and documentation to make it easy for users to enable it.

> Migrate to G1GC by default
> --------------------------
>
>                 Key: CASSANDRA-7486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Config
>            Reporter: Jonathan Ellis
>            Assignee: Albert P Tobey
>             Fix For: 3.0 alpha 1
>
>
> See http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
> and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.
> Suspect this will help G1 even more than CMS. (NB this is off by default but
> needs to be part of the test.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
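
As a rough illustration of the "two default configurations" idea in the comment above, a cassandra-env-style shell fragment might pick collector flags from the configured heap size. This is only a sketch under stated assumptions: the function name `choose_gc_opts`, the variable `MAX_HEAP_SIZE_MB`, the 16 GB cut-over (taken from the speculation in the comment, not from measurements), and the particular flag sets are all hypothetical and are not the project's actual defaults.

```shell
# Hypothetical helper: print JVM GC flags for a heap size given in MB.
# The 16384 MB cut-over mirrors the "16GB+" guess in the comment above;
# it is an assumption, not a benchmarked value.
choose_gc_opts() {
    heap_mb="$1"
    if [ "$heap_mb" -ge 16384 ]; then
        # Very large heap: opt into G1 (illustrative flag set only).
        echo "-XX:+UseG1GC -XX:MaxGCPauseMillis=500"
    else
        # Modest heap: stay on CMS, with ParNew for the young generation.
        echo "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
    fi
}

# Example use inside an env script; MAX_HEAP_SIZE_MB is a hypothetical
# variable that defaults to 8192 here purely for illustration.
JVM_OPTS="${JVM_OPTS:-} $(choose_gc_opts "${MAX_HEAP_SIZE_MB:-8192}")"
```

Keeping the decision in one small function would let the threshold be changed, or the automatic choice be dropped entirely, once the research mentioned above produces real numbers.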