Hi all,

Thank you all for the info.

To answer the questions:
 - we have 2 DCs with 5 nodes in each; each node has 256G of memory,
24x1T drives and 2x Xeon CPUs. There are multiple Cassandra instances
running for different projects. The node itself is powerful enough.
 - there are 2 keyspaces, one with 3 replicas per DC, one with 1 replica per
DC (because of the amount of data and because it serves more or less as a
cache)
 - there are about 4k/s Request-response, 3k/s Read and 2k/s Mutation
requests - the numbers are the sum over all nodes
 - we use STCS (LCS would be quite IO heavy for this amount of data)
 - number of tombstones - how can I reliably find it out?
 - the biggest CF (3.6T per node) has 7000 sstables

Now, I understand that the best practice for Cassandra is to run "with
the minimum size of heap which is enough", which for this case we thought
is about 12G - there is always 8G consumed by the SSTable readers.
Also, I thought that a high number of tombstones creates pressure in the
new space (which can then cause pressure in old space as well), but this
is not what we are seeing. We see continuous GC activity in the old
generation only.

Also, I noticed that the biggest CF has a compression factor of 0.99, which
basically means that the data arrives already compressed. Do you think
that turning off the compression would help with memory consumption?
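
To get a feeling for how much metadata compression itself might need for
the biggest CF, here is a rough sketch. It assumes 64 KB chunks (the
default chunk_length_kb, not verified on our cluster) and one 8-byte
offset per chunk. The function name is made up for illustration, and the
sketch says nothing about whether this metadata lives on or off heap in
1.2.18.

```python
# Rough estimate of compression chunk-offset metadata for one table.
# Assumptions: 64 KB chunks (Cassandra's default chunk_length_kb) and one
# 8-byte offset per chunk; neither value was confirmed for this cluster.

TIB = 1024 ** 4
MIB = 1024 ** 2


def chunk_offset_bytes(data_bytes, chunk_length_bytes=64 * 1024, offset_size=8):
    """Bytes needed to keep one offset per compressed chunk."""
    chunks = -(-data_bytes // chunk_length_bytes)  # ceiling division
    return chunks * offset_size


# 3.6T per node for the biggest CF; with a compression factor of 0.99 the
# uncompressed size is nearly the same, so the on-disk size is used here.
table_bytes = int(3.6 * TIB)
estimate = chunk_offset_bytes(table_bytes)
print(f"{estimate / MIB:.0f} MiB of chunk offsets")
```

Under these assumptions the offsets come to a few hundred MB, so they
would not explain the 8G on their own.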

Also, I think that tuning CMSInitiatingOccupancyFraction (currently 75)
might help here, as it seems that 8G is what Cassandra needs for
bookkeeping this amount of data, and that this was slightly above the 75%
limit, which triggered CMS again and again.
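
A quick sanity check of these numbers, as a minimal sketch. The heap
sizes are the ones from this thread (12G heap, 2G new space); treating
the 8G of SSTable reader data as a permanently live set in the old
generation is my assumption, and the function names are just for
illustration.

```python
# Back-of-the-envelope check of when CMS starts an old-generation cycle.
# Heap sizes come from this thread; treating the 8G of SSTable reader
# data as permanently live old-generation data is an assumption.

GIB = 1024 ** 3


def cms_trigger_bytes(heap_bytes, new_gen_bytes, occupancy_fraction_pct):
    """Old-generation occupancy at which CMS initiates a collection."""
    old_gen_bytes = heap_bytes - new_gen_bytes
    return old_gen_bytes * occupancy_fraction_pct / 100


def min_fraction_for_live_set(heap_bytes, new_gen_bytes, live_bytes):
    """Percentage of the old generation occupied by the live set alone."""
    old_gen_bytes = heap_bytes - new_gen_bytes
    return 100 * live_bytes / old_gen_bytes


heap = 12 * GIB
new_gen = 2 * GIB
live = 8 * GIB

trigger = cms_trigger_bytes(heap, new_gen, 75)
needed = min_fraction_for_live_set(heap, new_gen, live)

print(f"CMS trigger at {trigger / GIB:.1f} GiB of old generation")
print(f"live set above trigger: {live > trigger}")
print(f"occupancy fraction would have to exceed {needed:.0f}%")
```

With a 10G old generation the 75% threshold sits at 7.5G, which is below
the 8G live set, so the threshold would have to be above 80% (or the heap
larger) for CMS to stop retriggering.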

I will definitely have a look at the presentation.

Regards
Jiri Horky

On 02/08/2015 10:32 PM, Mark Reddy wrote:
> Hey Jiri, 
>
> While I don't have any experience running 4TB nodes (yet), I would
> recommend taking a look at a presentation by Aaron Morton on large
> nodes: 
> http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
> to see if you can glean anything from that.
>
> I would note that at the start of his talk he mentions that in version
> 1.2 we can now talk about nodes around 1 - 3 TB in size, so if you are
> storing anything more than that you are getting into very specialised
> use cases.
>
> If you could provide us with some more information about your cluster
> setup (No. of CFs, read/write patterns, do you delete / update often,
> etc.) that may help in getting you to a better place.
>
>
> Regards,
> Mark
>
> On 8 February 2015 at 21:10, Kevin Burton <bur...@spinn3r.com
> <mailto:bur...@spinn3r.com>> wrote:
>
>     Do you have a lot of individual tables?  Or lots of small
>     compactions?
>
>     I think the general consensus is that (at least for Cassandra),
>     8GB heaps are ideal.  
>
>     If you have lots of small tables it’s a known anti-pattern (I
>     believe) because the Cassandra internals could do a better job on
>     handling the in memory metadata representation.
>
>     I think this has been improved in 2.0 and 2.1 though so the fact
>     that you’re on 1.2.18 could exacerbate the issue.  You might want
>     to consider an upgrade (though that has its own issues as well).
>
>     On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky <ho...@avast.com
>     <mailto:ho...@avast.com>> wrote:
>
>         Hi all,
>
>         we are seeing quite high GC pressure (in old space, with the CMS
>         GC algorithm) on a node with 4TB of data. It runs C* 1.2.18 with
>         12G of heap memory (2G for new space). The node runs fine for a
>         couple of days, then the GC activity starts to rise and reaches
>         about 15% of the C* activity, which causes dropped messages and
>         other problems.
>
>         Looking at a heap dump, about 8G is used by SSTableReader
>         classes in
>         org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>
>         Is this expected, and have we just reached the limit of how much
>         data a single Cassandra instance can handle, or is it possible
>         to tune it better?
>
>         Regards
>         Jiri Horky
>
>
>
>
>     -- 
>     Founder/CEO Spinn3r.com <http://Spinn3r.com>
>     Location: *San Francisco, CA*
>     blog:* *http://burtonator.wordpress.com
>     … or check out my Google+ profile
>     <https://plus.google.com/102718274791889610666/posts>
>     <http://spinn3r.com>
>
>
