Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-07 Thread Shravan C
In fact, I truncated the hints table to stabilize the cluster. Through the heap dumps I was able to identify the table against which there were numerous queries. Then I focused on the system_traces.sessions table around the time the OOM occurred. It turned out to be a full table scan on a large table which caused
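The two steps described above can be sketched as commands; this is a hedged sketch, assuming a 3.x-era cluster where `nodetool truncatehints` is available (on 2.x, hints lived in the `system.hints` table), and the `LIMIT` on the trace query is illustrative:

```shell
# Drop all stored hints on this node to stop hint replay pressure.
nodetool truncatehints

# Inspect traced queries around the OOM window; system_traces.sessions
# records one row per traced request, including the CQL text.
cqlsh -e "SELECT session_id, started_at, duration, request
          FROM system_traces.sessions LIMIT 50;"
```

Matching the `started_at` timestamps against the OOM time is how a runaway full-table scan like the one described can be pinned to a specific query.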

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-07 Thread Jeff Jirsa
On 2017-03-03 09:18 (-0800), Shravan Ch wrote:
> nodetool compactionstats -H
> pending tasks: 3
> compaction type   keyspace   table   completed   total   unit   progress
> Compaction        system

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-07 Thread Jeff Jirsa
On 2017-03-04 07:23 (-0800), "Thakrar, Jayesh" wrote:
> LCS does not rule out frequent updates - it just says that there will be more frequent compaction, which can potentially increase compaction activity (which again can be throttled as needed).
> But

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-06 Thread Eric Evans
On Fri, Mar 3, 2017 at 11:18 AM, Shravan Ch wrote:
> More than 30 plus Cassandra servers in the primary DC went down with the OOM exception below. What puzzles me is the scale at which it happened (at the same minute). I will share some more details below.
You'd be surprised;

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
If possible, I would suggest running that command on a periodic basis (cron or whatever). Also, you can run it on a single server and iterate through all the nodes in the cluster/DC. Would also recommend running "nodetool compactionstats". And looked at your concern about high value for hinted
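The periodic-collection idea above could look like the following sketch; the node list, log paths, and 5-minute interval are all assumptions, not from the thread:

```shell
#!/bin/sh
# poll_nodes.sh - hypothetical poller run from one box, iterating over
# every node in the DC and appending compaction and thread-pool stats.
for h in node1 node2 node3; do
  nodetool -h "$h" compactionstats -H >> "/var/log/cassandra/compaction-$h.log"
  nodetool -h "$h" tpstats            >> "/var/log/cassandra/tpstats-$h.log"
done

# Example crontab entry, every 5 minutes:
# */5 * * * * /usr/local/bin/poll_nodes.sh
```

A growing pending-task count or blocked thread pools in these logs, ahead of an incident, is exactly the early signal the advice above is after.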

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Priyanka
Sent from my iPhone
> On Mar 3, 2017, at 12:18 PM, Shravan Ch wrote:
> Hello,
> More than 30 plus Cassandra servers in the primary DC went down with the OOM exception below. What puzzles me is the scale at which it happened (at the same minute). I will share some more

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Shravan C
I was looking at nodetool info across all nodes. Consistently, JVM heap used is ~12 GB and off-heap is ~4-5 GB.
From: Thakrar, Jayesh
Sent: Saturday, March 4, 2017 9:23:01 AM
To: Shravan C; Joaquin Casares; user@cassandra.apache.org

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Edward Capriolo
On Saturday, March 4, 2017, Thakrar, Jayesh wrote:
> LCS does not rule out frequent updates - it just says that there will be more frequent compaction, which can potentially increase compaction activity (which again can be throttled as needed).
> But STCS will

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
LCS does not rule out frequent updates - it just says that there will be more frequent compaction, which can potentially increase compaction activity (which again can be throttled as needed). But STCS will guarantee OOM when you have large datasets. Did you have a look at the offheap + onheap
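The two levers mentioned above (switching to LCS, and throttling the resulting compaction activity) are both standard operations; a hedged sketch, where the throughput value and the keyspace/table name are illustrative:

```shell
# Throttle compaction throughput at runtime (value in MB/s; 0 disables
# the limit entirely). 16 here is just an example.
nodetool setcompactionthroughput 16

# Moving a table to LCS is a per-table schema change:
cqlsh -e "ALTER TABLE myks.tracking
          WITH compaction = {'class': 'LeveledCompactionStrategy'};"
```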

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Shravan C
We run C* with a 32 GB heap and all servers have 96 GB RAM. We use STCS. LCS is not an option for us as we have frequent updates.
Thanks, Shravan
From: Thakrar, Jayesh
Sent: Friday, March 3, 2017 3:47:27 PM
To: Joaquin Casares;

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Thakrar, Jayesh
Had been fighting a similar battle, but am now over the hump for the most part.
Get info on the server config (e.g. memory, CPU, free memory (free -g), etc.)
Run "nodetool info" on the nodes to get heap and off-heap sizes
Run "nodetool tablestats" or "nodetool tablestats ." on the key large tables

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Shravan C
Hi Joaquin, We have inserts going into a tracking table. The tracking table is a simple table [PRIMARY KEY (comid, status_timestamp)] with a few tracking attributes, sorted by status_timestamp. From a volume perspective it is not a whole lot. Thanks, Shravan
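A hedged sketch of what the table described above might look like; only the primary key is stated in the thread, so the keyspace name, the extra column, and the clustering order are invented for illustration:

```shell
cqlsh -e "
CREATE TABLE IF NOT EXISTS myks.tracking (
  comid            text,
  status_timestamp timestamp,
  status           text,                -- stand-in for the tracking attributes
  PRIMARY KEY (comid, status_timestamp) -- partition key: comid; rows sorted by timestamp
) WITH CLUSTERING ORDER BY (status_timestamp DESC);"
```

With this shape, each `comid` partition grows by one row per status update, which is consistent with the "not a whole lot of volume" observation unless a few partitions get disproportionately hot.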

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Joaquin Casares
Hello Shravan, Typically asynchronous requests are recommended over batch statements since batch statements will cause more work on the coordinator node while individual requests, when using a TokenAwarePolicy, will hit a specific coordinator, perform a local disk seek, and return the requested
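The coordinator-load point above can be made concrete with a contrast; table and values are hypothetical, and the driver-side half (async execution with TokenAwarePolicy) is described in the comment rather than shown, since it lives in application code:

```shell
# Anti-pattern per the advice above: a multi-partition BATCH makes one
# coordinator responsible for fanning writes out to every partition's replicas.
cqlsh -e "
BEGIN UNLOGGED BATCH
  INSERT INTO myks.tracking (comid, status_timestamp, status)
    VALUES ('a', toTimestamp(now()), 'NEW');
  INSERT INTO myks.tracking (comid, status_timestamp, status)
    VALUES ('b', toTimestamp(now()), 'NEW');
APPLY BATCH;"

# Preferred: issue each INSERT as its own statement (asynchronously, from a
# driver configured with TokenAwarePolicy), so each write is routed directly
# to a replica for its own partition.
```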