Re: cassandra OOM

2017-04-26 Thread Jean Carlo
> > > *From:* Gopal, Dhruva [mailto:dhruva.go...@aspect.com] > *Sent:* Tuesday, April 04, 2017 5:34 PM > *To:* user@cassandra.apache.org > *Subject:* Re: cassandra OOM > > > > Thanks, that’s interesting – so CMS is a better option for > stability/performance? We’ll try this out

Re: cassandra OOM

2017-04-25 Thread Carlos Rolo
sider CMS any more. > > > > > > Sean Durity > > > > *From:* Gopal, Dhruva [mailto:dhruva.go...@aspect.com] > *Sent:* Tuesday, April 04, 2017 5:34 PM > *To:* user@cassandra.apache.org > *Subject:* Re: cassandra OOM > > > > Thanks, that’s interesting – so C

RE: cassandra OOM

2017-04-25 Thread Durity, Sean R
We have seen much better stability (and MUCH fewer GC pauses) from G1 with a variety of heap sizes. I don’t even consider CMS any more. Sean Durity From: Gopal, Dhruva [mailto:dhruva.go...@aspect.com] Sent: Tuesday, April 04, 2017 5:34 PM To: user@cassandra.apache.org Subject: Re: cassandra OOM

Re: cassandra OOM

2017-04-04 Thread Gopal, Dhruva
at 10:31 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Re: cassandra OOM Hi, we've seen G1GC going OOM on production clusters (repeatedly) with a 16GB heap when the workload is intense, and given you're running on m4.2xl I wouldn't go over 16GB for the heap. I'

Re: cassandra OOM

2017-04-03 Thread Alexander Dejanovski
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads. > > # Setting both to the same value can reduce STW durations. > > #-XX:ConcGCThreads=16 > > > > ### GC logging options -- uncomment to enable > > > > #-XX:+PrintGCDetails > > #-XX:+PrintGCDateStamps > > #-X

Re: cassandra OOM

2017-04-03 Thread Gopal, Dhruva
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Monday, April 3, 2017 at 8:00 AM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Re: cassandra OOM Hi, could you share your GC settings ? G1 or CMS ? Heap size, etc... Than

Re: cassandra OOM

2017-04-03 Thread Alexander Dejanovski
Hi, could you share your GC settings ? G1 or CMS ? Heap size, etc... Thanks, On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva wrote: > Hi – > > We’ve had what looks like an OOM situation with Cassandra (we have a > dump file that got generated) in our staging

Re: Cassandra OOM on joining existing ring

2015-07-13 Thread Sebastian Estevez
Are you on the azure premium storage? http://www.datastax.com/2015/04/getting-started-with-azure-premium-storage-and-datastax-enterprise-dse Secondary indexes are built for convenience not performance. http://www.datastax.com/resources/data-modeling What's your compaction strategy? Your nodes

Re: Cassandra OOM on joining existing ring

2015-07-13 Thread Anuj Wadehra
We faced a similar issue where we had 60k sstables due to the coldness bug in 2.0.3. We solved it by following the Datastax recommendation for production at http://docs.datastax.com/en/cassandra/1.2/cassandra/install/installRecommendSettings.html : Step 1 : Add the following line to /etc/sysctl.conf :

Re: Cassandra OOM on joining existing ring

2015-07-12 Thread Kunal Gangakhedkar
Hi, Looks like that is my primary problem - the sstable count for the daily_challenges column family is 5k. Azure had a scheduled maintenance window on Sat. All the VMs got rebooted one by one - including the current cassandra one - and it's taking forever to bring cassandra back up online. Is

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Attaching the stack dump captured from the last OOM. Kunal On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
I'm new to cassandra. How do I find those out? - mainly, the partition params that you asked for. Others, I think I can figure out. We don't have any large objects/blobs in the column values - it's all textual, date-time, numeric and uuid data. We use cassandra to primarily store segmentation

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Jack Krupansky
You, and only you, are responsible for knowing your data and data model. If columns per row or rows per partition can be large, then an 8GB system is probably too small. But the real issue is that you need to keep your partition size from getting too large. Generally, an 8GB system is okay, but

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Jack Krupansky
What does your data and data model look like - partition size, rows per partition, number of columns per row, any large values/blobs in column values? You could run fine on an 8GB system, but only if your rows and partitions are reasonably small. Any large partitions could blow you away. -- Jack

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Thanks for the quick reply. 1. I don't know what thresholds I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one -

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
And here is my cassandra-env.sh https://gist.github.com/kunalg/2c092cb2450c62be9a20 Kunal On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
Thanks, Sebastian. Couple of questions (I'm really new to cassandra): 1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful. 2. I'm primarily a python/c developer - so, totally clueless about JVM environment. So,

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
1. You want to look at # of sstables in cfhistograms, or in cfstats look at: Compacted partition maximum bytes; Maximum live cells per slice. 2. No, here's the env.sh from 3.0 which should work with some tweaks:

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class org.apache.cassandra.db.BufferCell 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName 1885974 instances of class

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
I upgraded my instance from 8GB to a 14GB one. Allocated 8GB to jvm heap in cassandra-env.sh. And now, it crashes even faster with an OOM. Earlier, with 4GB heap, I could go up to ~90% replication completion (as reported by nodetool netstats); now, with 8GB heap, I cannot even get there. I've

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
#1 You need more information. a) Take a look at your .hprof file (memory heap from the OOM) with an introspection tool like jhat or visualvm or java flight recorder and see what is using up your RAM. b) How big are your large rows (use nodetool cfstats on each node). If your data model is bad,

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
#1 There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. 460 is high, I like to keep my partitions under 100mb when possible. I've seen worse though. The fix is to
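
The partition-size check discussed in this reply can be sketched in a few lines of Python. This is only an illustration, not something posted in the thread: the function name `oversized_partitions`, the sample text, and the exact byte counts are hypothetical, and the `Table:` / `Compacted partition maximum bytes:` line labels are assumed to match what `nodetool cfstats` prints on the poster's version. The 100 MB limit is the rule of thumb quoted above.

```python
import re

# Rule of thumb quoted in the thread: keep partitions under ~100 MB when possible.
PARTITION_LIMIT_BYTES = 100 * 1024 * 1024


def oversized_partitions(cfstats_text, limit=PARTITION_LIMIT_BYTES):
    """Return {table_name: max_partition_bytes} for tables above the limit.

    Assumes cfstats-style lines such as 'Table: name' (or 'Column Family: name')
    followed later by 'Compacted partition maximum bytes: N'.
    """
    flagged = {}
    table = None
    for raw in cfstats_text.splitlines():
        line = raw.strip()
        m = re.match(r"(?:Table|Column Family): (\S+)", line)
        if m:
            table = m.group(1)
            continue
        m = re.match(r"Compacted partition maximum bytes: (\d+)", line)
        if m and table is not None:
            size = int(m.group(1))
            if size > limit:
                flagged[table] = size
    return flagged


# Hypothetical excerpt; the numbers only approximate the ~460M and ~36M figures above.
sample = """
Table: daily_challenges
Compacted partition maximum bytes: 460000000
Table: daily_guest_logins
Compacted partition maximum bytes: 36000000
"""

print(oversized_partitions(sample))
```

With the sample above, only daily_challenges is flagged, which matches the reply's reading that 460M is high while 36M is acceptable.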

Re: Cassandra OOM, many deletedColumn

2013-03-13 Thread aaron morton
For JVM Heap it is 2G Try 4G and gc_grace = 1800 Realised that I did not provide a warning about the implication this has for nodetool repair. If you are doing deletes on the CF you need to run nodetool repair every gc_grace seconds. In this case I think your main problem was not enough
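
To make the warning above concrete, here is a small arithmetic sketch in Python. It is an illustration only: the helper name `repair_is_frequent_enough` and the weekly repair interval are assumptions, not values from the thread. The only figures taken from the message are gc_grace = 1800 and the rule that repair has to run every gc_grace seconds.

```python
from datetime import timedelta


def repair_is_frequent_enough(gc_grace_seconds, repair_interval):
    """True if repair runs at least once per gc_grace window."""
    return repair_interval <= timedelta(seconds=gc_grace_seconds)


gc_grace = 1800  # value suggested in the thread, in seconds

# 1800 seconds is a 30 minute window.
print(timedelta(seconds=gc_grace))

# A hypothetical weekly repair schedule does not fit inside that window.
print(repair_is_frequent_enough(gc_grace, timedelta(days=7)))
```

In other words, with gc_grace at 1800 a column family that receives deletes would need a repair roughly every half hour, which is why the setting deserves a warning.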

Re: Cassandra OOM, many deletedColumn

2013-03-12 Thread 金剑
Thanks for your reply. We will try both of your recommendations. The OS memory is 8G; the JVM heap is 2G. DeletedColumn used 1.4G, rooted in the readStage thread. Do you think we need to increase the size of the JVM heap? Configuration for the index columnFamily is create column family purge

Re: Cassandra OOM, many deletedColumn

2013-03-08 Thread aaron morton
You need to provide some details of the machine and the JVM configuration. But let's say you need to have 4GB to 8GB for the JVM heap. If you have many deleted columns I would say you have a *lot* of garbage in each row. Consider reducing the gc_grace seconds so the columns are purged more

Re: Cassandra OOM, many deletedColumn

2013-03-06 Thread Jason Wee
hmm.. did you manage to take a look using nodetool tpstats? That may give you a further indication.. Jason On Thu, Mar 7, 2013 at 1:56 PM, 金剑 jinjia...@gmail.com wrote: Hi, My version is 1.1.7 Our use case is : we have an index columnfamily to record how many resources are stored for a

Re: Cassandra OOM crash while mapping commitlog

2012-08-15 Thread Robin Verlangen
Everything still runs smoothly. It's really plausible that the 1.1.3 version resolved this bug. 2012/8/13 Robin Verlangen ro...@us2.nl 3 hours ago I finished the upgrade of our cluster. Currently it runs quite smoothly. I'll give an update within a week if this really solved our issues.

Re: Cassandra OOM crash while mapping commitlog

2012-08-13 Thread Robin Verlangen
@Tyler: We were already running most of our machines in 64bit JVM (Sun, not the OpenJDK). Those also crashed. @Holger: Good to hear that. I'll schedule an update for our Cassandra cluster. Thank you both for your time. 2012/8/13 Holger Hoffstaette holger.hoffstae...@googlemail.com On Sun, 12

Re: Cassandra OOM crash while mapping commitlog

2012-08-13 Thread Robin Verlangen
3 hours ago I finished the upgrade of our cluster. Currently it runs quite smoothly. I'll give an update within a week if this really solved our issues. Cheers! 2012/8/13 Robin Verlangen ro...@us2.nl @Tyler: We were already running most of our machines in 64bit JVM (Sun, not the OpenJDK).

Re: Cassandra OOM crash while mapping commitlog

2012-08-12 Thread Robin Verlangen
Hmm, is this issue caused by some 1.x version? It never happened to us before. On 11 Aug 2012 at 22:36, Tyler Hobbs ty...@datastax.com wrote: We've seen something similar when running on a 32bit JVM, so make sure you're using the latest 64bit Java 6 JVM. On Sat, Aug 11, 2012 at 11:59

Re: Cassandra OOM crash while mapping commitlog

2012-08-12 Thread Holger Hoffstaette
On Sun, 12 Aug 2012 13:36:42 +0200, Robin Verlangen wrote: Hmm, is this issue caused by some 1.x version? It never happened to us before. This bug was introduced in 1.1.0 and has been fixed in 1.1.3, where the closed/recycled segments are now closed and unmapped properly. The default sizes are also

Re: Cassandra OOM crash while mapping commitlog

2012-08-11 Thread Tyler Hobbs
We've seen something similar when running on a 32bit JVM, so make sure you're using the latest 64bit Java 6 JVM. On Sat, Aug 11, 2012 at 11:59 AM, Robin Verlangen ro...@us2.nl wrote: Hi there, I currently see Cassandra crash every couple of days. I run a 3 node cluster on version 1.1.2. Does

Re: Cassandra OOM - 1.0.2

2012-02-07 Thread aaron morton
Just to ask the stupid question, have you tried setting it really high ? Like 50 ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 7/02/2012, at 10:27 AM, Ajeet Grewal wrote: Here are the last few lines of strace (of one of the

Re: Cassandra OOM - 1.0.2

2012-02-07 Thread Ajeet Grewal
On Tue, Feb 7, 2012 at 10:45 AM, aaron morton aa...@thelastpickle.com wrote: Just to ask the stupid question, have you tried setting it really high ? Like 50 ? No I have not. I moved to mmap_index_only as a stopgap solution. Is it possible for there to be that many mmaps for about 300 db

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
On Sat, Feb 4, 2012 at 7:03 AM, Jonathan Ellis jbel...@gmail.com wrote: Sounds like you need to increase sysctl vm.max_map_count This did not work. I increased vm.max_map_count from 65536 to 131072. I am still getting the same error. ERROR [SSTableBatchOpen:4] 2012-02-06 11:43:50,463
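
One way to check whether the mapping limit is really what the process is hitting is to compare the limit with the number of mappings the JVM currently holds. The Python sketch below does that on Linux, where `/proc/sys/vm/max_map_count` holds the limit and each line of `/proc/<pid>/maps` is one mapping. The function names and the `proc_root` parameter are hypothetical helpers added for illustration; nothing like this was posted in the thread. Since doubling the limit did not help here, a large headroom would suggest looking elsewhere, for example at the size of the individual mmap calls.

```python
from pathlib import Path


def read_max_map_count(proc_root="/proc"):
    """Read the kernel's per-process mapping limit (vm.max_map_count)."""
    path = Path(proc_root) / "sys" / "vm" / "max_map_count"
    return int(path.read_text().strip())


def count_mappings(pid, proc_root="/proc"):
    """Count current memory mappings of a process: one line per mapping."""
    maps_file = Path(proc_root) / str(pid) / "maps"
    with maps_file.open() as fh:
        return sum(1 for _ in fh)


def map_headroom(pid, proc_root="/proc"):
    """Mappings still available before the process reaches the limit."""
    return read_max_map_count(proc_root) - count_mappings(pid, proc_root)


# Usage (the pid is whatever the Cassandra JVM runs as on the node):
#   print(map_headroom(12345))
```

The `proc_root` argument exists only so the functions can be pointed at a fake directory tree when trying them out away from a live node.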

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
On Mon, Feb 6, 2012 at 11:50 AM, Ajeet Grewal asgre...@gmail.com wrote: On Sat, Feb 4, 2012 at 7:03 AM, Jonathan Ellis jbel...@gmail.com wrote: Sounds like you need to increase sysctl vm.max_map_count This did not work. I increased vm.max_map_count from 65536 to 131072. I am still getting the

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
Here are the last few lines of strace (of one of the threads). There are a bunch of mmap system calls. Notice the last mmap call a couple of lines before the trace ends. Could the last mmap call fail? == BEGIN STRACE == mmap(NULL, 2147487599, PROT_READ, MAP_SHARED, 37, 0xbb000) = 0x7709b54000

Re: Cassandra OOM - 1.0.2

2012-02-04 Thread Jonathan Ellis
Sounds like you need to increase sysctl vm.max_map_count On Fri, Feb 3, 2012 at 7:27 PM, Ajeet Grewal asgre...@gmail.com wrote: Hey guys, I am getting an out of memory (mmap failed) error with Cassandra 1.0.2. The relevant log lines are pasted at http://pastebin.com/UM28ZC1g. Cassandra

Re: Cassandra OOM

2012-01-13 Thread Віталій Тимчишин
2012/1/4 Vitalii Tymchyshyn tiv...@gmail.com 04.01.12 14:25, Radim Kolar wrote: So, what are cassandra memory requirement? Is it 1% or 2% of disk data? It depends on number of rows you have. if you have lot of rows then primary memory eaters are index sampling data and bloom filters.

Re: Cassandra OOM

2012-01-04 Thread Vitalii Tymchyshyn
Hello. BTW: It would be great for cassandra to shutdown on Errors like OOM because now I am not sure if the problem described in previous email is the root cause or some of OOM error found in log made some writer stop. I am now looking at different OOMs in my cluster. Currently each node

Re: Cassandra OOM

2012-01-04 Thread Vitalii Tymchyshyn
04.01.12 14:25, Radim Kolar wrote: So, what are cassandra memory requirement? Is it 1% or 2% of disk data? It depends on number of rows you have. if you have lot of rows then primary memory eaters are index sampling data and bloom filters. I use index sampling 512 and bloom filters set

Re: Cassandra OOM

2012-01-03 Thread aaron morton
The DynamicSnitch can result in fewer read operations being sent to a node, but as long as a node is marked as UP mutations are sent to all replicas. Nodes will shed load when they pull messages off the queue that have expired past rpc_timeout, but they will not feed back flow control to the

Re: Cassandra OOM on repair.

2011-07-17 Thread Andrey Stepachev
Looks like problem in code: public IndexSummary(long expectedKeys) { long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval(); if (expectedEntries > Integer.MAX_VALUE) // TODO: that's a _lot_ of keys, or a very low interval throw
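
The arithmetic in the quoted constructor can be restated as a short Python sketch, assuming the comparison in the truncated Java line is a greater-than test. This is not the Cassandra source: the function name `expected_index_entries`, the exception type, and the example interval of 128 are illustrative assumptions. It only shows how the expected number of index summary entries follows from the key count and the index interval, and when it would exceed a Java `int`.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def expected_index_entries(expected_keys, index_interval):
    """Entries an index summary would hold: one per index_interval keys."""
    entries = expected_keys // index_interval  # integer division, as with Java longs
    if entries > INT_MAX:
        # That is a lot of keys, or a very low interval.
        raise ValueError("too many index entries for an int-sized summary")
    return entries


# Hypothetical example: one billion keys sampled every 128 keys.
print(expected_index_entries(1_000_000_000, 128))
```

A lower interval means more sampled entries held in memory per sstable, which fits the observation later in this thread that key indexes were consuming the heap.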

Re: Cassandra OOM on repair.

2011-07-17 Thread Jonathan Ellis
Can't think of any. On Sun, Jul 17, 2011 at 1:27 PM, Andrey Stepachev oct...@gmail.com wrote: Looks like problem in code:     public IndexSummary(long expectedKeys)     {         long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval();         if (expectedEntries

Re: Cassandra OOM on repair.

2011-07-15 Thread Andrey Stepachev
Looks like key indexes eat all memory: http://paste.kde.org/97213/ 2011/7/15 Andrey Stepachev oct...@gmail.com UPDATE: I found that a) with min 10G, cassandra survives; b) I have ~1000 sstables; c) CompactionManager uses PrecompactedRows instead of LazilyCompactedRow. So, I have a