From the jhat output, the top 10 entries under "Instance Count for All Classes (excluding platform)" are:
2088223 instances of class org.apache.cassandra.db.BufferCell
1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName
1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName
630000 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
503687 instances of class org.apache.cassandra.db.BufferDeletedCell
378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
101800 instances of class org.apache.cassandra.utils.concurrent.Ref
101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState
71123 instances of class org.apache.cassandra.db.BufferDecoratedKey

At the bottom of the page, it shows:
Total of 8739510 instances occupying 193607512 bytes.

JFYI.

Kunal

On 10 July 2015 at 23:49, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

> Thanks for the quick reply.
>
> 1. I don't know what thresholds I should look for. So, to save this
> back-and-forth, I'm attaching the cfstats output for the keyspace.
>
> There is one table - daily_challenges - which shows compacted partition
> max bytes as ~460M, and another one - daily_guest_logins - which shows
> compacted partition max bytes as ~36M.
>
> Can that be a problem?
> Here is the CQL schema for the daily_challenges column family:
>
> CREATE TABLE app_10001.daily_challenges (
>     segment_type text,
>     date timestamp,
>     user_id int,
>     sess_id text,
>     data text,
>     deleted boolean,
>     PRIMARY KEY (segment_type, date, user_id, sess_id)
> ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '4', 'class':
>         'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>         'max_threshold': '32'}
>     AND compression = {'sstable_compression':
>         'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
>
> CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);
>
> 2. I don't know - how do I check? As I mentioned, I just installed the
> dsc21 package from DataStax's debian repo (ver 2.1.7).
>
> Really appreciate your help.
>
> Thanks,
> Kunal
>
> On 10 July 2015 at 23:33, Sebastian Estevez <sebastian.este...@datastax.com> wrote:
>
>> 1. You want to look at # of sstables in cfhistograms, or in cfstats look at:
>> Compacted partition maximum bytes
>> Maximum live cells per slice
>>
>> 2. No, here's the env.sh from 3.0 which should work with some tweaks:
>> https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh
>>
>> You'll at least have to modify the jamm version to what's in yours.
>> I think it's 2.5.
>>
>> All the best,
>>
>> [image: datastax_logo.png] <http://www.datastax.com/>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> [image: linkedin.png] <https://www.linkedin.com/company/datastax> [image:
>> facebook.png] <https://www.facebook.com/datastax> [image: twitter.png]
>> <https://twitter.com/datastax> [image: g+.png]
>> <https://plus.google.com/+Datastax/about>
>> <http://feeds.feedburner.com/datastax>
>> <http://cassandrasummit-datastax.com/>
>>
>> DataStax is the fastest, most scalable distributed database technology,
>> delivering Apache Cassandra to the world’s most innovative enterprises.
>> DataStax is built to be agile, always-on, and predictably scalable to any
>> size. With more than 500 customers in 45 countries, DataStax is the
>> database technology and transactional backbone of choice for the world’s
>> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>>
>> On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>
>>> Thanks, Sebastian.
>>>
>>> A couple of questions (I'm really new to Cassandra):
>>> 1. How do I interpret the output of 'nodetool cfstats' to figure out the
>>> issues? Any documentation pointer on that would be helpful.
>>>
>>> 2. I'm primarily a python/c developer - so, totally clueless about the JVM
>>> environment. So, please bear with me, as I will need a lot of hand-holding.
>>> Should I just copy+paste the settings you gave and try to restart the
>>> failing cassandra server?
>>>
>>> Thanks,
>>> Kunal
>>>
>>> On 10 July 2015 at 22:35, Sebastian Estevez <sebastian.este...@datastax.com> wrote:
>>>
>>>> #1 You need more information.
>>>>
>>>> a) Take a look at your .hprof file (the memory heap from the OOM) with an
>>>> introspection tool like jhat, visualvm, or Java Flight Recorder, and see
>>>> what is using up your RAM.
>>>>
>>>> b) How big are your large rows (use nodetool cfstats on each node)? If
>>>> your data model is bad, you are going to have to re-design it no matter
>>>> what.
>>>>
>>>> #2 As a possible workaround, try using the G1GC collector with the
>>>> settings from c* 3.0 instead of CMS. I've seen lots of success with it
>>>> lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely
>>>> tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not*
>>>> set the newgen size; G1 sets it dynamically:
>>>>
>>>>> # min and max heap sizes should be set to the same value to avoid
>>>>> # stop-the-world GC pauses during resize, and so that we can lock the
>>>>> # heap in memory on startup to prevent any of it from being swapped
>>>>> # out.
>>>>> JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
>>>>> JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
>>>>>
>>>>> # Per-thread stack size.
>>>>> JVM_OPTS="$JVM_OPTS -Xss256k"
>>>>>
>>>>> # Use the Hotspot garbage-first collector.
>>>>> JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
>>>>>
>>>>> # Have the JVM do less remembered set work during STW, instead
>>>>> # preferring concurrent GC. Reduces p99.9 latency.
>>>>> JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
>>>>>
>>>>> # The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
>>>>> # Machines with > 10 cores may need additional threads.
>>>>> # Increase to <= full cores (do not count HT cores).
>>>>> #JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
>>>>> #JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"
>>>>>
>>>>> # Main G1GC tunable: lowering the pause target will lower throughput
>>>>> # and vice versa.
>>>>> # 200ms is the JVM default and lowest viable setting;
>>>>> # 1000ms increases throughput. Keep it smaller than the timeouts
>>>>> # in cassandra.yaml.
>>>>> JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
>>>>>
>>>>> # Do reference processing in parallel GC.
>>>>> JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
>>>>>
>>>>> # This may help eliminate STW.
>>>>> # The default in Hotspot 8u40 is 40%.
>>>>> #JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"
>>>>>
>>>>> # For workloads that do large allocations, increasing the region
>>>>> # size may make things more efficient. Otherwise, let the JVM
>>>>> # set this automatically.
>>>>> #JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=32m"
>>>>>
>>>>> # Make sure all memory is faulted and zeroed on startup.
>>>>> # This helps prevent soft faults in containers and makes
>>>>> # transparent hugepage allocation more effective.
>>>>> JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"
>>>>>
>>>>> # Biased locking does not benefit Cassandra.
>>>>> JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
>>>>>
>>>>> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
>>>>> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
>>>>>
>>>>> # Enable thread-local allocation blocks and allow the JVM to
>>>>> # automatically resize them at runtime.
>>>>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
>>>>>
>>>>> # http://www.evanjones.ca/jvm-mmap-pause.html
>>>>> JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"
>>>>
>>>> All the best,
>>>>
>>>> Sebastián Estévez
>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>>
>>>> On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>
>>>>> I upgraded my instance from 8GB to a 14GB one.
>>>>> Allocated 8GB to the JVM heap in cassandra-env.sh.
>>>>>
>>>>> And now, it crashes even faster with an OOM.
>>>>>
>>>>> Earlier, with a 4GB heap, I could get up to ~90% replication completion
>>>>> (as reported by nodetool netstats); now, with an 8GB heap, I cannot even get
>>>>> there. I've already restarted the cassandra service 4 times with the 8GB heap.
>>>>>
>>>>> No clue what's going on.. :(
>>>>>
>>>>> Kunal
>>>>>
>>>>> On 10 July 2015 at 17:45, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>>
>>>>>> You, and only you, are responsible for knowing your data and data model.
>>>>>>
>>>>>> If columns per row or rows per partition can be large, then an 8GB
>>>>>> system is probably too small. But the real issue is that you need to keep
>>>>>> your partition size from getting too large.
>>>>>>
>>>>>> Generally, an 8GB system is okay, but only for reasonably-sized
>>>>>> partitions, like under 10MB.
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>
>>>>>>> I'm new to Cassandra.
>>>>>>> How do I find those out? - mainly, the partition params that you
>>>>>>> asked for. The others, I think I can figure out.
>>>>>>>
>>>>>>> We don't have any large objects/blobs in the column values - it's
>>>>>>> all textual, date-time, numeric and uuid data.
>>>>>>>
>>>>>>> We use cassandra primarily to store segmentation data - with segment
>>>>>>> type as the partition key. That is again divided into two separate column
>>>>>>> families; but they have similar structure.
>>>>>>>
>>>>>>> Columns per row can be fairly large - each segment type as the row
>>>>>>> key, and associated user ids and timestamps as column values.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kunal
>>>>>>>
>>>>>>> On 10 July 2015 at 16:36, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>>>>
>>>>>>>> What does your data and data model look like - partition size, rows
>>>>>>>> per partition, number of columns per row, any large values/blobs in
>>>>>>>> column values?
>>>>>>>>
>>>>>>>> You could run fine on an 8GB system, but only if your rows and
>>>>>>>> partitions are reasonably small. Any large partitions could blow you away.
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Attaching the stack dump captured from the last OOM.
>>>>>>>>>
>>>>>>>>> Kunal
>>>>>>>>>
>>>>>>>>> On 10 July 2015 at 13:32, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Forgot to mention: the data size is not that big - it's barely
>>>>>>>>>> 10GB in all.
>>>>>>>>>>
>>>>>>>>>> Kunal
>>>>>>>>>>
>>>>>>>>>> On 10 July 2015 at 13:29, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have a 2-node setup on Azure (East US region) running Ubuntu
>>>>>>>>>>> Server 14.04 LTS.
>>>>>>>>>>> Both nodes have 8GB RAM.
>>>>>>>>>>>
>>>>>>>>>>> One of the nodes (the seed node) died with an OOM - so, I am trying
>>>>>>>>>>> to add a replacement node with the same configuration.
>>>>>>>>>>>
>>>>>>>>>>> The problem is this new node also keeps dying with an OOM - I've
>>>>>>>>>>> restarted the cassandra service like 8-10 times hoping that it would
>>>>>>>>>>> finish the replication. But it didn't help.
>>>>>>>>>>>
>>>>>>>>>>> The one node that is still up is happily chugging along.
>>>>>>>>>>> All nodes have similar configuration - with libjna installed.
>>>>>>>>>>>
>>>>>>>>>>> Cassandra is installed from DataStax's debian repo - pkg: dsc21,
>>>>>>>>>>> version 2.1.7.
>>>>>>>>>>> I started off with the default configuration - i.e. the default
>>>>>>>>>>> cassandra-env.sh - which calculates the heap size automatically
>>>>>>>>>>> (1/4 * RAM = 2GB).
>>>>>>>>>>>
>>>>>>>>>>> But that didn't help. So, I then tried to increase the heap to
>>>>>>>>>>> 4GB manually and restarted. It still keeps crashing.
>>>>>>>>>>>
>>>>>>>>>>> Any clue as to why it's happening?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Kunal
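A footnote on the "1/4 * RAM = 2GB" figure at the bottom of the thread: it falls out of the stock heap calculation in the 2.1-era cassandra-env.sh, which roughly takes the larger of half-of-RAM-capped-at-1GB and quarter-of-RAM-capped-at-8GB. A sketch of that formula (the 8192 MB value is the 8GB nodes from this thread; treat it as illustrative, not a substitute for reading your own cassandra-env.sh):

```shell
# Sketch of the default cassandra-env.sh heap sizing (2.1-era):
# MAX_HEAP_SIZE = max(min(1/2 * RAM, 1024MB), min(1/4 * RAM, 8192MB))
system_memory_mb=8192                      # the 8GB Azure nodes in this thread

half=$((system_memory_mb / 2))
if [ "$half" -gt 1024 ]; then half=1024; fi        # cap half-of-RAM at 1GB

quarter=$((system_memory_mb / 4))
if [ "$quarter" -gt 8192 ]; then quarter=8192; fi  # cap quarter-of-RAM at 8GB

max_heap_mb=$(( half > quarter ? half : quarter ))
echo "${max_heap_mb} MB"                   # 2048 MB, i.e. the 2GB Kunal saw
```

With 8GB of RAM the two candidates are 1024 MB and 2048 MB, so the default lands at a 2GB heap; Kunal's manual bumps to 4GB and 8GB override this calculation entirely.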