On Saturday, March 4, 2017, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote:
> LCS does not rule out frequent updates - it just says that there will be
> more frequent compaction, which can potentially increase compaction
> activity (which again can be throttled as needed).
>
> But STCS will guarantee OOM when you have large datasets.
>
> Did you have a look at the offheap + onheap size of your JVM using
> "nodetool info"?
>
> *From:* Shravan C <chall...@outlook.com>
> *Date:* Friday, March 3, 2017 at 11:11 PM
> *To:* Joaquin Casares <joaq...@thelastpickle.com>, "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> We run C* at 32 GB and all servers have 96 GB RAM. We use STCS. LCS is not
> an option for us as we have frequent updates.
>
> Thanks,
> Shravan
> ------------------------------
> *From:* Thakrar, Jayesh <jthak...@conversantmedia.com>
> *Sent:* Friday, March 3, 2017 3:47:27 PM
> *To:* Joaquin Casares; user@cassandra.apache.org
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> Had been fighting a similar battle, but am now over the hump for the most part.
>
> Get info on the server config (e.g. memory, CPU, free memory (free -g), etc.)
> Run "nodetool info" on the nodes to get heap and off-heap sizes.
> Run "nodetool tablestats" or "nodetool tablestats <keyspace>.<tablename>"
> on the key large tables.
> Essentially the purpose is to see whether you really had a true OOM or whether
> your machine was running out of memory.
>
> Cassandra can use offheap memory very well - so "nodetool info" will give
> you both heap and offheap.
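[Editor's note] The "nodetool info" check suggested above can be automated. A minimal sketch, assuming the `Heap Memory (MB) : used / max` and `Off Heap Memory (MB)` line format that recent `nodetool info` output uses; the sample excerpt below is hypothetical, and real output varies by Cassandra version:

```python
# Minimal sketch: estimate a node's Cassandra memory footprint from
# "nodetool info" output, to check it against physical RAM (96 GB here).
# The sample text is a hypothetical excerpt, not real output.
import re

sample_info = """\
ID                     : 7d9f1a22-example
Heap Memory (MB)       : 18204.55 / 32768.00
Off Heap Memory (MB)   : 6144.25
"""

def memory_footprint_mb(info_text):
    """Return (heap_used, heap_max, off_heap) in MB parsed from nodetool info."""
    heap = re.search(r"Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)", info_text)
    off = re.search(r"Off Heap Memory \(MB\)\s*:\s*([\d.]+)", info_text)
    heap_used, heap_max = float(heap.group(1)), float(heap.group(2))
    off_heap = float(off.group(1)) if off else 0.0
    return heap_used, heap_max, off_heap

heap_used, heap_max, off_heap = memory_footprint_mb(sample_info)
# Worst case ~ max heap + current off-heap: if this approaches server RAM,
# the *machine* (not just the JVM heap) can run out of memory.
total_mb = heap_max + off_heap
print(f"heap {heap_used:.0f}/{heap_max:.0f} MB, off-heap {off_heap:.0f} MB, "
      f"max footprint ~{total_mb / 1024:.1f} GB")
```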
> Also, what is the compaction strategy of your tables?
>
> Personally, I have found STCS to be awful at large scale - when you have
> sstables that are 100+ GB in size.
> See https://issues.apache.org/jira/browse/CASSANDRA-10821?focusedCommentId=15389451&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15389451
>
> LCS seems better and should be the default (my opinion) unless you want DTCS.
>
> A good description of all three compaction strategies is here:
> http://docs.scylladb.com/kb/compaction/
>
> *From:* Joaquin Casares <joaq...@thelastpickle.com>
> *Date:* Friday, March 3, 2017 at 11:34 AM
> *To:* <user@cassandra.apache.org>
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> Hello Shravan,
>
> Typically asynchronous requests are recommended over batch statements,
> since batch statements cause more work on the coordinator node, while
> individual requests, when using a TokenAwarePolicy, will hit a specific
> coordinator, perform a local disk seek, and return the requested
> information.
>
> The only time that using batch statements is ideal is when writing to the
> same partition key, even if it's across multiple tables, when using the same
> hashing algorithm (like murmur3).
>
> Could you provide a bit of insight into what the batch statement was
> trying to accomplish and how many child statements were bundled up within
> that batch?
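[Editor's note] The token-awareness argument above can be illustrated with a toy model: every statement keyed on the same partition key hashes to the same token and therefore the same replica set, so the coordinator does no fan-out. The sketch below uses md5 as a stand-in for Cassandra's Murmur3Partitioner (not the real token function) and a hypothetical 4-node ring:

```python
# Sketch of why same-partition batches are cheap: all statements map to one
# token, hence one replica set. md5 here is a stand-in for murmur3; the ring
# and node names are hypothetical.
import hashlib

NODES = ["node1", "node2", "node3", "node4"]  # toy ring, not real topology

def token(partition_key: str) -> int:
    # Stand-in hash; real Cassandra uses murmur3_128 over the serialized key.
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")

def replica_for(partition_key: str) -> str:
    return NODES[token(partition_key) % len(NODES)]

# A batch touching one partition key, even across two tables, hits one replica:
same_partition_batch = [("events", "user42"), ("events_by_day", "user42")]
replicas = {replica_for(pk) for _table, pk in same_partition_batch}

# A multi-partition batch forces the coordinator to fan out across the ring:
multi_partition_batch = [("events", f"user{i}") for i in range(100)]
fanout = {replica_for(pk) for _table, pk in multi_partition_batch}

print(f"single-partition batch replicas: {len(replicas)}, "
      f"100-partition batch replicas: {len(fanout)}")
```

This is why the advice above favors individual async requests with a TokenAwarePolicy for multi-partition writes: each request goes straight to a replica instead of piling work onto one coordinator.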
> Cheers,
>
> Joaquin
>
> Joaquin Casares
> Consultant
> Austin, TX
>
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch <chall...@outlook.com> wrote:
>
> Hello,
>
> More than 30 Cassandra servers in the primary DC went down with the OOM
> exception below. What puzzles me is the scale at which it happened (in the
> same minute). I will share some more details below.
>
> System Log: http://pastebin.com/iPeYrWVR
> GC Log: http://pastebin.com/CzNNGs0r
>
> During the OOM I saw a lot of WARNings like the one below (these had been
> there for quite some time, maybe weeks):
> *WARN [SharedPool-Worker-81] 2017-03-01 19:55:41,209
> BatchStatement.java:252 - Batch of prepared statements for [keyspace.table]
> is of size 225455, exceeding specified threshold of 65536 by 159919.*
>
> *Environment:*
> We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC
> (more C* nodes, on SSD, and apps run here) and a secondary DC
> (geographically remote, more like a DR to the primary) on SAS drives.
> Cassandra config:
>
> Java 1.8.0_65
> Garbage Collector: G1GC
> memtable_allocation_type: offheap_objects
>
> Since this OOM I am seeing a huge hints pile-up on the majority of the
> nodes, and the pending hints keep going up. I have increased the
> HintedHandoff core threads to 6 but that did not help (I admit that I
> tried this on only one node).
>
> nodetool compactionstats -H
> pending tasks: 3
>    compaction type   keyspace   table   completed      total    unit   progress
>         Compaction     system   hints     28.5 GB   92.38 GB   bytes     30.85%
>
> Appreciate your inputs here.
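[Editor's note] The warning quoted above reports a 225455-byte batch against a 65536-byte warn threshold (`batch_size_warn_threshold_in_kb: 64`). A common client-side mitigation, sketched below under the hypothetical assumption of 500 statements of ~451 bytes each (matching the logged total), is to cap batch payload size and split oversized batches:

```python
# Sketch: split an oversized batch into sub-batches under the warn threshold
# from the log above. Statement sizes are hypothetical, chosen to roughly
# reproduce the 225455-byte batch in the warning.
WARN_THRESHOLD = 65536  # bytes, the threshold reported in the log

def split_batch(statement_sizes, limit=WARN_THRESHOLD):
    """Greedily pack statement payload sizes into chunks of at most `limit` bytes."""
    chunks, current, current_size = [], [], 0
    for size in statement_sizes:
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        chunks.append(current)
    return chunks

# Hypothetical batch: 500 statements of ~451 bytes each (~225500 bytes total).
sizes = [451] * 500
chunks = split_batch(sizes)
print(f"{sum(sizes)} bytes split into {len(chunks)} sub-batches, "
      f"largest {max(sum(c) for c in chunks)} bytes")
```

Per the advice earlier in the thread, if these 500 statements span different partition keys, sending them as individual async requests is usually better still than re-batching.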
> Thanks,
> Shravan

STCS does not guarantee OOM with large datasets. Cassandra had large datasets for years when STCS was the only thing that existed.

--
Sorry, this was sent from mobile. Will do less grammar and spell check than usual.
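[Editor's note] The size-tiered behavior debated in this thread can be sketched with a simplified model of STCS bucketing: sstables of similar size are grouped, and a bucket compacts once it has at least `min_threshold` (default 4) members. This is a simplification (real STCS also weighs hotness, tombstones, and the `bucket_low`/`bucket_high` options, default 0.5/1.5), but it shows why very large sstables linger:

```python
# Simplified STCS bucketing sketch: group sstables whose size is within
# 0.5x-1.5x of the bucket average; a bucket compacts at >= min_threshold
# members. Sizes below are hypothetical.
MIN_THRESHOLD = 4
BUCKET_LOW, BUCKET_HIGH = 0.5, 1.5

def bucket_sstables(sizes_mb):
    """Group sstable sizes into buckets of similar size (simplified STCS)."""
    buckets = []
    for size in sorted(sizes_mb):
        for b in buckets:
            avg = sum(b) / len(b)
            if BUCKET_LOW * avg <= size <= BUCKET_HIGH * avg:
                b.append(size)
                break
        else:
            buckets.append([size])
    return buckets

# Eight freshly flushed ~200 MB sstables plus two older ~100 GB giants:
sizes = [200, 210, 190, 205, 198, 202, 195, 207, 100_000, 102_000]
buckets = bucket_sstables(sizes)
compactable = [b for b in buckets if len(b) >= MIN_THRESHOLD]
# The small tier compacts promptly; the 100 GB tier waits until enough
# giants accumulate - the "100+ GB sstables" pain described above.
print(f"{len(buckets)} buckets, {len(compactable)} ready to compact")
```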