Number of cores: 2 x 6 cores x 2 (HT) = 24 hardware threads per node.

I do agree with you that the hardware is certainly oversized for just one
Cassandra instance, but we got a very good price since we ordered several
tens of the same nodes for a different project. That's why we use them for
multiple Cassandra instances.

Jirka H.

On 02/12/2015 04:18 PM, Eric Stevens wrote:
> > each node has 256G of memory, 24x1T drives, 2x Xeon CPU
>
> I don't have first hand experience running Cassandra on such massive
> hardware, but it strikes me that these machines are dramatically
> oversized to be good candidates for Cassandra (though I wonder how
> many cores are in those CPUs; I'm guessing closer to 18 than 2 based
> on the other hardware).
>
> A larger cluster of smaller hardware would be a much better shape for
> Cassandra.  Or several clusters of smaller hardware since you're
> running multiple instances on this hardware - best practice is one
> instance per host, regardless of hardware size.
>
>     On Thu, Feb 12, 2015 at 12:36 AM, Jiri Horky <ho...@avast.com> wrote:
>
>     Hi Chris,
>
>     On 02/09/2015 04:22 PM, Chris Lohfink wrote:
>>      - number of tombstones - how can I reliably find it out?
>>     https://github.com/spotify/cassandra-opstools
>>     https://github.com/cloudian/support-tools
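>>
>>     (In newer Cassandra releases the bundled sstablemetadata tool also
>>     prints an "Estimated droppable tombstones" figure, e.g.
>>
>>         tools/bin/sstablemetadata /path/to/<ks>-<cf>-*-Data.db | grep -i tombstones
>>
>>     though the exact path and availability depend on the version you run.)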
>     thanks.
>>
>>     If you're not getting much compression, it may be worth trying to
>>     disable it; compression may contribute, but it's very unlikely to be
>>     the cause of the GC pressure itself.
>>
>>     7000 sstables but STCS? Sounds like compactions couldn't keep
>>     up.  Do you have a lot of pending compactions (nodetool)?  You may
>>     want to increase your compaction throughput (nodetool) to see if
>>     you can catch up a little; reads against that many sstables cause
>>     a lot of heap overhead.  You may even need to take more drastic
>>     measures if it can't catch back up.
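>>
>>     For example (illustrative values only; 64 MB/s is just a sample,
>>     and defaults vary by version):
>>
>>         nodetool compactionstats             # pending compaction tasks
>>         nodetool setcompactionthroughput 64  # raise the throttle (default 16 MB/s)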
>     I am sorry, I was wrong. We actually do use LCS (the switch was
>     done recently). There are almost no pending compactions. We have
>     increased the sstable size to 768M, so that should help as well.
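>
>     For reference, the change is roughly the following in CQL (the
>     keyspace and table names below are placeholders):
>
>         ALTER TABLE <keyspace>.<table>
>           WITH compaction = {'class': 'LeveledCompactionStrategy',
>                              'sstable_size_in_mb': 768};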
>
>>
>>     May also be good to check `nodetool cfstats` for very wide
>>     partitions.  
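>>
>>     For example (the field names differ a bit between versions):
>>
>>         nodetool cfstats | grep -E 'Column Family|Compacted row maximum size'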
>     There are basically none; this is fine.
>
>     It seems that the problem really comes from having so much data in
>     so many sstables, so the
>     org.apache.cassandra.io.compress.CompressedRandomAccessReader
>     instances consume more memory than 0.75*HEAP_SIZE, which triggers
>     the CMS over and over.
>
>     We have turned off compression and so far the situation seems
>     to be fine.
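>
>     (For anyone following along, this is a table-level change; in CQL it
>     looks roughly like the following, with placeholder names:
>
>         ALTER TABLE <keyspace>.<table>
>           WITH compression = {'sstable_compression': ''};
>
>     Existing sstables stay compressed until they are rewritten, e.g. by
>     compaction.)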
>
>     Cheers
>     Jirka H.
>
>
>>
>>     There's a good chance that if you're under load and have more than
>>     an 8 GB heap, your GCs could use tuning.  The bigger the nodes, the
>>     more manual tweaking it will take to get the most out of them;
>>     https://issues.apache.org/jira/browse/CASSANDRA-8150 also has some
>>     ideas.
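>>
>>     The usual knobs live in conf/cassandra-env.sh; a rough sketch (the
>>     12G/2G figures are the ones from this thread, the GC flags are the
>>     usual suspects, not specific recommendations):
>>
>>         MAX_HEAP_SIZE="12G"   # total heap
>>         HEAP_NEWSIZE="2G"     # young generation
>>         JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>>         JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps"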
>>
>>     Chris
>>
>>     On Mon, Feb 9, 2015 at 2:00 AM, Jiri Horky <ho...@avast.com> wrote:
>>
>>         Hi all,
>>
>>         thank you all for the info.
>>
>>         To answer the questions:
>>          - we have 2 DCs with 5 nodes in each; each node has 256G of
>>         memory, 24x1T drives, 2x Xeon CPU - there are multiple
>>         Cassandra instances running for different projects. The node
>>         itself is powerful enough.
>>          - there are 2 keyspaces, one with 3 replicas per DC and one
>>         with 1 replica per DC (because of the amount of data and
>>         because it serves more or less as a cache)
>>          - there are about 4k/s Request-response, 3k/s Read and 2k/s
>>         Mutation requests - the numbers are the sum over all nodes
>>          - we use STCS (LCS would be quite IO-heavy for this amount of
>>         data)
>>          - number of tombstones - how can I reliably find it out?
>>          - the biggest CF (3.6T per node) has 7000 sstables
>>
>>         Now, I understand that the best practice for Cassandra is to
>>         run "with the minimum heap size that is enough", which in this
>>         case we thought was about 12G - there is always 8G consumed by
>>         the SSTable readers. Also, I thought that a high number of
>>         tombstones creates pressure in the new space (which can then
>>         cause pressure in the old space as well), but this is not what
>>         we are seeing. We see continuous GC activity in the Old
>>         generation only.
>>
>>         Also, I noticed that the biggest CF has a compression ratio of
>>         0.99, which basically means that the data already arrives
>>         compressed. Do you think that turning off compression would
>>         help with memory consumption?
>>
>>         Also, I think that tuning CMSInitiatingOccupancyFraction=75
>>         might help here, as it seems that 8G is what Cassandra needs
>>         for bookkeeping this amount of data, and that this was slightly
>>         above the 75% limit, which triggered the CMS again and again.
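>>
>>         For the record, the relevant lines in conf/cassandra-env.sh
>>         would look something like this (the values are to be tuned, of
>>         course):
>>
>>             JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>>             JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"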
>>
>>         I will definitely have a look at the presentation.
>>
>>         Regards
>>         Jiri Horky
>>
>>
>>         On 02/08/2015 10:32 PM, Mark Reddy wrote:
>>>         Hey Jiri, 
>>>
>>>         While I don't have any experience running 4TB nodes (yet), I
>>>         would recommend taking a look at a presentation by Aaron
>>>         Morton on large nodes:
>>> http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
>>>         to see if you can glean anything from that.
>>>
>>>         I would note that at the start of his talk he mentions that
>>>         in version 1.2 we can now talk about nodes around 1 - 3 TB
>>>         in size, so if you are storing anything more than that you
>>>         are getting into very specialised use cases.
>>>
>>>         If you could provide us with some more information about
>>>         your cluster setup (No. of CFs, read/write patterns, do you
>>>         delete / update often, etc.) that may help in getting you to
>>>         a better place.
>>>
>>>
>>>         Regards,
>>>         Mark
>>>
>>>         On 8 February 2015 at 21:10, Kevin Burton
>>>         <bur...@spinn3r.com> wrote:
>>>
>>>             Do you have a lot of individual tables?  Or lots of
>>>             small compactions?
>>>
>>>             I think the general consensus is that (at least for
>>>             Cassandra), 8GB heaps are ideal.  
>>>
>>>             If you have lots of small tables it’s a known
>>>             anti-pattern (I believe), because the Cassandra internals
>>>             could do a better job of handling the in-memory metadata
>>>             representation.
>>>
>>>             I think this has been improved in 2.0 and 2.1 though, so
>>>             the fact that you’re on 1.2.18 could exacerbate the
>>>             issue.  You might want to consider an upgrade (though
>>>             that has its own issues as well).
>>>
>>>             On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky
>>>             <ho...@avast.com> wrote:
>>>
>>>                 Hi all,
>>>
>>>                 we are seeing quite high GC pressure (in the old
>>>                 space, from the CMS GC algorithm) on a node with 4TB
>>>                 of data. It runs C* 1.2.18 with 12G of heap memory
>>>                 (2G for the new space). The node runs fine for a
>>>                 couple of days until the GC activity starts to rise
>>>                 and reaches about 15% of the C* activity, which
>>>                 causes dropped messages and other problems.
>>>
>>>                 Taking a look at a heap dump, there is about 8G used
>>>                 by SSTableReader classes in
>>> org.apache.cassandra.io.compress.CompressedRandomAccessReader.
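>>>
>>>                 (For anyone who wants to repeat this kind of check, a
>>>                 dump can be taken with the stock JDK tools, e.g.
>>>
>>>                     jmap -histo:live <cassandra-pid> | head -30
>>>                     jmap -dump:live,format=b,file=c.hprof <cassandra-pid>
>>>
>>>                 and then inspected in a heap analyzer.)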
>>>
>>>                 Is this something expected - have we just reached the
>>>                 limit of how much data a single Cassandra instance
>>>                 can handle - or is it possible to tune it better?
>>>
>>>                 Regards
>>>                 Jiri Horky
>>>
>>>
>>>
>>>
>>>             -- 
>>>             Founder/CEO Spinn3r.com
>>>             Location: San Francisco, CA
>>>             blog: http://burtonator.wordpress.com
>>>             … or check out my Google+ profile
>>>             <https://plus.google.com/102718274791889610666/posts>
>>>
>>>
>>
>>
>
>
