On Saturday, March 4, 2017, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote:
> LCS does not rule out frequent updates - it just says that there will be
> more frequent compaction, which can potentially increase compaction
> activity (which again can be throttled as needed).
>
> But STCS will guarantee OOM when you have large datasets.
>
> Did you have a look at the offheap + onheap size of your JVM using
> "nodetool info"?
>
> *From:* Shravan C <chall...@outlook.com>
> *Date:* Friday, March 3, 2017 at 11:11 PM
> *To:* Joaquin Casares <joaq...@thelastpickle.com>, "user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> We run C* at 32 GB and all servers have 96 GB RAM. We use STCS. LCS is not
> an option for us as we have frequent updates.
>
> Thanks,
> Shravan
> ------------------------------
> *From:* Thakrar, Jayesh <jthak...@conversantmedia.com>
> *Sent:* Friday, March 3, 2017 3:47:27 PM
> *To:* Joaquin Casares; user@cassandra.apache.org
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> Had been fighting a similar battle, but am now over the hump for the most part.
>
> Get info on the server config (e.g. memory, CPU, free memory (free -g), etc.)
> Run "nodetool info" on the nodes to get heap and off-heap sizes.
> Run "nodetool tablestats" or "nodetool tablestats <keyspace>.<tablename>"
> on the key large tables.
> Essentially the purpose is to see whether you really had a true OOM or whether
> your machine was running out of memory.
>
> Cassandra can use offheap memory very well - so "nodetool info" will give
> you both heap and offheap.
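[Editor's note] The "nodetool info" check suggested above can be automated. A minimal sketch, assuming the `Heap Memory (MB) : used / max` and `Off Heap Memory (MB)` line format that recent `nodetool info` output uses; the sample excerpt below is hypothetical, and real output varies by Cassandra version:

```python
# Minimal sketch: estimate a node's Cassandra memory footprint from
# "nodetool info" output, to check it against physical RAM (96 GB here).
# The sample text is a hypothetical excerpt, not real output.
import re

sample_info = """\
ID                     : 7d9f1a22-example
Heap Memory (MB)       : 18204.55 / 32768.00
Off Heap Memory (MB)   : 6144.25
"""

def memory_footprint_mb(info_text):
    """Return (heap_used, heap_max, off_heap) in MB parsed from nodetool info."""
    heap = re.search(r"Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)", info_text)
    off = re.search(r"Off Heap Memory \(MB\)\s*:\s*([\d.]+)", info_text)
    heap_used, heap_max = float(heap.group(1)), float(heap.group(2))
    off_heap = float(off.group(1)) if off else 0.0
    return heap_used, heap_max, off_heap

heap_used, heap_max, off_heap = memory_footprint_mb(sample_info)
# Worst case ~ max heap + current off-heap: if this approaches server RAM,
# the *machine* (not just the JVM heap) can run out of memory.
total_mb = heap_max + off_heap
print(f"heap {heap_used:.0f}/{heap_max:.0f} MB, off-heap {off_heap:.0f} MB, "
      f"max footprint ~{total_mb / 1024:.1f} GB")
```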
> Also, what is the compaction strategy of your tables?
>
> Personally, I have found STCS to be awful at large scale - when you have
> sstables that are 100+ GB in size.
> See https://issues.apache.org/jira/browse/CASSANDRA-10821?focusedCommentId=15389451&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15389451
>
> LCS seems better and should be the default (my opinion) unless you want DTCS.
>
> A good description of all three compaction strategies is here:
> http://docs.scylladb.com/kb/compaction/
>
> *From:* Joaquin Casares <joaq...@thelastpickle.com>
> *Date:* Friday, March 3, 2017 at 11:34 AM
> *To:* <user@cassandra.apache.org>
> *Subject:* Re: OOM on Apache Cassandra on 30 Plus node at the same time
>
> Hello Shravan,
>
> Typically asynchronous requests are recommended over batch statements,
> since batch statements cause more work on the coordinator node, while
> individual requests, when using a TokenAwarePolicy, will hit a specific
> coordinator, perform a local disk seek, and return the requested
> information.
>
> The only time that using batch statements is ideal is when writing to the
> same partition key, even if it's across multiple tables, when using the same
> hashing algorithm (like murmur3).
>
> Could you provide a bit of insight into what the batch statement was
> trying to accomplish and how many child statements were bundled up within
> that batch?
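[Editor's note] The token-awareness argument above can be illustrated with a toy model: every statement keyed on the same partition key hashes to the same token and therefore the same replica set, so the coordinator does no fan-out. The sketch below uses md5 as a stand-in for Cassandra's Murmur3Partitioner (not the real token function) and a hypothetical 4-node ring:

```python
# Sketch of why same-partition batches are cheap: all statements map to one
# token, hence one replica set. md5 here is a stand-in for murmur3; the ring
# and node names are hypothetical.
import hashlib

NODES = ["node1", "node2", "node3", "node4"]  # toy ring, not real topology

def token(partition_key: str) -> int:
    # Stand-in hash; real Cassandra uses murmur3_128 over the serialized key.
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")

def replica_for(partition_key: str) -> str:
    return NODES[token(partition_key) % len(NODES)]

# A batch touching one partition key, even across two tables, hits one replica:
same_partition_batch = [("events", "user42"), ("events_by_day", "user42")]
replicas = {replica_for(pk) for _table, pk in same_partition_batch}

# A multi-partition batch forces the coordinator to fan out across the ring:
multi_partition_batch = [("events", f"user{i}") for i in range(100)]
fanout = {replica_for(pk) for _table, pk in multi_partition_batch}

print(f"single-partition batch replicas: {len(replicas)}, "
      f"100-partition batch replicas: {len(fanout)}")
```

This is why the advice above favors individual async requests with a TokenAwarePolicy for multi-partition writes: each request goes straight to a replica instead of piling work onto one coordinator.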
> Cheers,
>
> Joaquin
>
> Joaquin Casares
> Consultant
> Austin, TX
>
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch <chall...@outlook.com> wrote:
>
> Hello,
>
> More than 30 Cassandra servers in the primary DC went down with the OOM
> exception below. What puzzles me is the scale at which it happened (in the
> same minute). I will share some more details below.
>
> System Log: http://pastebin.com/iPeYrWVR
> GC Log: http://pastebin.com/CzNNGs0r
>
> During the OOM I saw a lot of WARNings like the one below (these had been
> there for quite some time, maybe weeks):
> *WARN [SharedPool-Worker-81] 2017-03-01 19:55:41,209
> BatchStatement.java:252 - Batch of prepared statements for [keyspace.table]
> is of size 225455, exceeding specified threshold of 65536 by 159919.*
>
> *Environment:*
> We are using Apache Cassandra 2.1.9 on a multi-DC cluster: a primary DC
> (more C* nodes, on SSD, and apps run here) and a secondary DC
> (geographically remote, more like a DR to the primary) on SAS drives.
> Cassandra config:
>
> Java 1.8.0_65
> Garbage Collector: G1GC
> memtable_allocation_type: offheap_objects
>
> Since this OOM I am seeing a huge hints pile-up on the majority of the
> nodes, and the pending hints keep going up. I have increased the
> HintedHandoff core threads to 6 but that did not help (I admit that I
> tried this on only one node).
>
> nodetool compactionstats -H
> pending tasks: 3
>    compaction type   keyspace   table   completed      total    unit   progress
>         Compaction     system   hints     28.5 GB   92.38 GB   bytes     30.85%
>
> Appreciate your inputs here.
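[Editor's note] The warning quoted above reports a 225455-byte batch against a 65536-byte warn threshold (`batch_size_warn_threshold_in_kb: 64`). A common client-side mitigation, sketched below under the hypothetical assumption of 500 statements of ~451 bytes each (matching the logged total), is to cap batch payload size and split oversized batches:

```python
# Sketch: split an oversized batch into sub-batches under the warn threshold
# from the log above. Statement sizes are hypothetical, chosen to roughly
# reproduce the 225455-byte batch in the warning.
WARN_THRESHOLD = 65536  # bytes, the threshold reported in the log

def split_batch(statement_sizes, limit=WARN_THRESHOLD):
    """Greedily pack statement payload sizes into chunks of at most `limit` bytes."""
    chunks, current, current_size = [], [], 0
    for size in statement_sizes:
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        chunks.append(current)
    return chunks

# Hypothetical batch: 500 statements of ~451 bytes each (~225500 bytes total).
sizes = [451] * 500
chunks = split_batch(sizes)
print(f"{sum(sizes)} bytes split into {len(chunks)} sub-batches, "
      f"largest {max(sum(c) for c in chunks)} bytes")
```

Per the advice earlier in the thread, if these 500 statements span different partition keys, sending them as individual async requests is usually better still than re-batching.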
> Thanks,
> Shravan

STCS does not guarantee OOM with large datasets. Cassandra had large datasets for years when STCS was the only thing that existed.

--
Sorry, this was sent from mobile. Will do less grammar and spell check than usual.
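[Editor's note] The size-tiered behavior debated in this thread can be sketched with a simplified model of STCS bucketing: sstables of similar size are grouped, and a bucket compacts once it has at least `min_threshold` (default 4) members. This is a simplification (real STCS also weighs hotness, tombstones, and the `bucket_low`/`bucket_high` options, default 0.5/1.5), but it shows why very large sstables linger:

```python
# Simplified STCS bucketing sketch: group sstables whose size is within
# 0.5x-1.5x of the bucket average; a bucket compacts at >= min_threshold
# members. Sizes below are hypothetical.
MIN_THRESHOLD = 4
BUCKET_LOW, BUCKET_HIGH = 0.5, 1.5

def bucket_sstables(sizes_mb):
    """Group sstable sizes into buckets of similar size (simplified STCS)."""
    buckets = []
    for size in sorted(sizes_mb):
        for b in buckets:
            avg = sum(b) / len(b)
            if BUCKET_LOW * avg <= size <= BUCKET_HIGH * avg:
                b.append(size)
                break
        else:
            buckets.append([size])
    return buckets

# Eight freshly flushed ~200 MB sstables plus two older ~100 GB giants:
sizes = [200, 210, 190, 205, 198, 202, 195, 207, 100_000, 102_000]
buckets = bucket_sstables(sizes)
compactable = [b for b in buckets if len(b) >= MIN_THRESHOLD]
# The small tier compacts promptly; the 100 GB tier waits until enough
# giants accumulate - the "100+ GB sstables" pain described above.
print(f"{len(buckets)} buckets, {len(compactable)} ready to compact")
```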