[
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877404#comment-14877404
]
Benedict commented on CASSANDRA-7486:
-------------------------------------
bq. Is the picture equally bleak at RF=3?
[Regrettably
so|http://cstar.datastax.com/graph?stats=a1eee43a-5f32-11e5-88b3-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=112.2&ymin=0&ymax=163663.5]
bq. Do the "2.2 GC" settings include anything other than the defaults from
cassandra-env.sh? "ps -efw" output is sufficient.
I haven't double checked, I simply copied [~tjake]'s branch and rebased to
latest 3.0. It looks like it's just 2.2 defaults.
bq. I'd be happy to take a look at the GC logs if they are available.
The thing is, as I say, the _GC_ burden is pretty consistently lower. However,
application performance is also worse, which indicates the problem isn't the
collections themselves but the VM behavioural changes required to enable G1GC.
So analyzing GC logs is unlikely to deliver much, and figuring out how to modify
the application to reduce the burden here is unlikely to be a short task (if
achievable at all).
bq. This is not in debate.
I'm afraid nothing is not in debate in this world :)
If you mean to say "CMS will *not* scale with _increasingly gigantic heap
sizes_" then we would probably be in agreement. However, with smallish heaps CMS
works just fine - better, even. If the mid-to-long term goal of Cassandra is to
have a constant heap burden, i.e. decouple heap requirements from dataset, then
it doesn't follow that increasing hardware capabilities requires G1GC. There
are lots of reasons why this _should_ be our goal, and my understanding is
there is a general consensus on that, but that's a separate debate.
Certainly we need to do more research, but I will prognosticate briefly: I
suspect we will find that with very large heaps (16GB+) and with lots of
headroom G1GC begins to outperform CMS, especially wrt the most critical of
metrics, the 99.9th percentile. However, I suspect we will find CMS continues to
dominate in domains where it can maintain sufficiently low pause times.
Since many users target the more modest heap sizes, we may find that it makes
most sense to provide two default configurations, and have the user opt into
our "default" G1GC settings if they intend to run with a very large heap. If,
after extensive research, we find that we can confidently predict configs where
it makes more sense, we should consider doing this automatically in
cassandra-env.
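As a rough illustration of what automatic selection might look like, here is a minimal sketch. It is not taken from the actual cassandra-env.sh: the function name {{choose_gc_flags}} and the variable names are hypothetical, and the 16GB cut-over is only the guess floated above, not a measured threshold. The JVM flags themselves are the standard HotSpot switches for G1 and for ParNew + CMS.

```shell
# Hypothetical helper, not part of the real cassandra-env.sh.
# Prints the GC selection flags for a heap size given in megabytes.
choose_gc_flags() {
    heap_mb="$1"
    # Assumed cut-over point (16GB); a placeholder pending real benchmarks.
    g1_threshold_mb=16384
    if [ "$heap_mb" -ge "$g1_threshold_mb" ]; then
        # Very large heap: opt into G1.
        echo "-XX:+UseG1GC"
    else
        # Modest heap: stay on ParNew + CMS.
        echo "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
    fi
}
```

It could then be wired in with something like {{JVM_OPTS="$JVM_OPTS $(choose_gc_flags 8192)"}}. Until the research is done, the same two flag sets could simply ship as parallel, commented-out defaults for the user to choose between.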
My suspicion is we won't manage to do this research in time for GA, but that
doesn't stop us providing the parallel defaults and documentation to make it
easy for users to enable it.
> Migrate to G1GC by default
> --------------------------
>
> Key: CASSANDRA-7486
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
> Project: Cassandra
> Issue Type: New Feature
> Components: Config
> Reporter: Jonathan Ellis
> Assignee: Albert P Tobey
> Fix For: 3.0 alpha 1
>
>
> See
> http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-7486gc-migration-to-expectations-and-advanced-tuning
> and https://twitter.com/rbranson/status/482113561431265281
> May want to default 2.1 to G1.
> 2.1 is a different animal from 2.0 after moving most of memtables off heap.
> Suspect this will help G1 even more than CMS. (NB this is off by default but
> needs to be part of the test.)