[
https://issues.apache.org/jira/browse/CASSANDRA-8477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246584#comment-14246584
]
Benedict commented on CASSANDRA-8477:
-------------------------------------
Ok. You may simply have some data modelling issues. Each reader is constructing
a response with over 4M range tombstones (i.e. 4M CQL row deletes sharing the
same partition key). This is only a problem during reads or compaction: the
tombstones were flushed to disk separately, but they must all be instantiated
at once at read time. Each reader is consuming ~300MB of heap just to maintain
this list. The current state of c* does not really support such gigantic
partitions, and not much is likely to be done to improve this situation until
at least 3.1.
There may be some shorter-term wins for your case: we currently retain the
full history of range tombstones over the queried range, and keep them for the
entire life of the query, when in most cases we could prune ranges for which
we have already fully materialized the data. That would likely prevent your
heap buildup. [~jbellis] [~iamaleksey] [~slebresne]: not sure who is best
suited to deciding on and implementing this change?
It's also possible we could offer the option to coalesce range deletes with
different deletion timestamps into a single record carrying the newest
timestamp, but this would be semantically tricky. If you can manually coalesce
your deletes, that would also help mitigate the problem: if you issue N
deletes of the form "delete from x where a = xi" for x0 through xN, and you
keep no records between x0 and xN, you should be able to replace them with a
single "delete from x where a >= x0 and a <= xN", flattening those N
tombstones into a single record. If you did this for all 4M items, your
problem would disappear.
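In CQL terms, a minimal sketch of that manual coalescing (the table x and the columns pk and a are hypothetical; this assumes a is a clustering column within a single partition, and that no live rows in the collapsed range need to survive):

```sql
-- Hypothetical schema: pk is the partition key, a is a clustering column.
-- Instead of one tombstone per row:
--   DELETE FROM x WHERE pk = 'key' AND a = 0;
--   DELETE FROM x WHERE pk = 'key' AND a = 1;
--   ...
--   DELETE FROM x WHERE pk = 'key' AND a = 4000000;
-- issue a single range delete covering the whole span, producing one
-- range tombstone instead of millions:
DELETE FROM x WHERE pk = 'key' AND a >= 0 AND a <= 4000000;
```

Only do this when you are certain nothing written between the bounds should be kept, since the single range tombstone shadows everything it covers.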
That said, having the potential for 4M items in a single partition really isn't
something c* is currently optimised for, so it's probably something you want to
revisit anyway.
> CMS GC can not recycle objects
> ------------------------------
>
> Key: CASSANDRA-8477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8477
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: 2.1.1 or 2.1.2-SNAPSHOT(after CASSANDRA-8459 resolved)
> Reporter: Philo Yang
> Attachments: cassandra.yaml, histo.txt, jstack.txt, log1.txt,
> log2.txt, system.log, system.log.2014-12-15_1226
>
>
> I have a problem in my cluster: CMS full GC cannot reduce the size of the
> old gen. A few days ago I posted this problem to the mailing list; people
> thought it could be solved by tuning the GC settings, but that didn't work
> for me.
> Then I saw a similar bug, CASSANDRA-8447, but [~benedict] thought it was not
> related. From the jstack output at
> https://gist.github.com/yangzhe1991/755ea2a10520be1fe59a, [~benedict] found a
> bug and resolved it in CASSANDRA-8459. So I built the latest version on the
> 2.1 branch and ran that SNAPSHOT on the nodes with GC trouble.
> However, the GC issue is still there, so I think opening a new ticket and
> posting more information is a good idea. Thanks for helping me.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)