i am simulating load with two virtual machines (on separate boxes from the servers), each running an app that spawns 12 threads: 6 doing reads and 6 doing writes. so i have a total of 12 read threads and 12 write threads. each thread waits 10ms between operations. the write threads write a 2k block of data, and the read threads read keys that have already been written, so every read should return data. right now i'm seeing about 800 ops/sec total throughput across all clients/servers. if i take the 10ms delay out it goes faster, of course, but it seems to burden cassandra too much.
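
for reference, the client app on each VM looks roughly like the sketch below. this is a simplified stand-in (class names are made up and the actual thrift calls are stubbed out as doRead()/doWrite()), just to show the thread/timing structure:

    import java.util.Random;

    public class LoadSim {
        static final int THREADS_PER_KIND = 6;  // 6 readers + 6 writers per VM
        static final int VALUE_SIZE = 2048;     // 2k block per write
        static final int PAUSE_MS = 10;         // delay between each thread's ops

        public static void main(String[] args) {
            for (int i = 0; i < THREADS_PER_KIND; i++) {
                new Thread(new Worker(true)).start();   // writer thread
                new Thread(new Worker(false)).start();  // reader thread
            }
        }

        static class Worker implements Runnable {
            final boolean writer;
            final Random rand = new Random();

            Worker(boolean writer) { this.writer = writer; }

            public void run() {
                byte[] value = new byte[VALUE_SIZE];
                while (true) {
                    try {
                        // key selection simplified here; the real app only reads
                        // keys it has already written, so every read returns data
                        String key = Long.toString(rand.nextLong());
                        if (writer) doWrite(key, value);  // thrift insert, W=1
                        else        doRead(key);          // thrift get, R=1
                        Thread.sleep(PAUSE_MS);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }

            void doWrite(String key, byte[] value) { /* client.insert(...) */ }
            void doRead(String key)                { /* client.get(...) */ }
        }
    }

back of the envelope: 24 client threads, each doing one op per (10ms pause + round trip), top out at roughly 24 / (0.010s + latency) ops/sec. at 800 ops/sec total that works out to about 30ms per loop iteration, i.e. roughly 20ms of round trip on top of the 10ms pause.
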
we are trying to prove that cassandra can run and sustain load. we are planning a 10TB system that needs to handle about 10k ops/sec. for my tests i have two server machines, each with 16G RAM, a 600G 10k SCSI drive, and two 2-core CPUs (4 cores per machine). i'm starting the JVM with -Xmx6G. the network is 100Mbit. (this is not how the cluster would look in prod, but it's all the hardware i have until the first of 2010.)

the cluster contains ~126,281,657 data elements using about 298G on one node's disk. i don't have the commitlog on a separate drive yet.

during normal operation, i see the following:

- memory stays fairly low for the size of the data, low enough that i didn't monitor it, but i believe it was less than 3G.
- "global" read latency, as reported by StorageProxy, creeps up slightly.
- "round trip time on the wire", as reported by my client, creeps up at a steeper slope than the "global" read latency, so there is a discrepancy somewhere in the stats.
- i have added another JMX data point to cassandra to measure the overall time spent in cassandra, but i've got to get the servers started again to see what it reports ;) (a rough sketch of what i mean is at the bottom of this mail, after the quoted message.)

using node 1 and node 2, and simulating a crash of node 1 with kill -9:

- node 1 was OOM'ing when trying to restart after a crash, but this seems fixed. it is staying cool and quiet.
- node 2 is now OOM'ing during the restart of node 1. its memory grows steadily, and the last thing i see in its log is "Starting up server gossip" before the OOM.

what bothers me the most is not that i'm getting an OOM, but that i can't predict when i'll get it. the fact that restarting a failed node requires more than double the "normal operating" RAM is a bit of a worry.

not sure what else to tell you at the moment. lemme know what i can provide so we can figure this out.

thx!
________________________________________
From: Jonathan Ellis [[email protected]]
Sent: Friday, December 18, 2009 3:49 PM
To: [email protected]
Subject: Re: another OOM

It sounds like you're simply throwing too much load at Cassandra.
Adding more machines can help.

Look at http://wiki.apache.org/cassandra/Operations for how to track
metrics that will tell you how much is "too much."

Telling us more about your workload would be useful in sanity checking
that hypothesis. :)

-Jonathan

On Fri, Dec 18, 2009 at 4:34 PM, Brian Burruss <[email protected]> wrote:
> this time i simulated node 1 crashing, waited a few minutes, then restarted
> it. after a while node 2 OOM'ed.
>
> same 2 node cluster with RF=2, W=1, R=1. i up'ed the RAM to 6G this time.
>
> cluster contains ~126,281,657 data elements containing about 298G on one
> node's disk
>
> thx!
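
p.s. here's a rough sketch of the kind of JMX data point i mean. this isn't the actual patch, just the shape of it: the names are made up, the real hook is simply a timer wrapped around the request path, and the interface and class would each live in their own public top-level file.

    import java.lang.management.ManagementFactory;
    import java.util.concurrent.atomic.AtomicLong;
    import javax.management.ObjectName;

    public interface RequestTimingMBean {
        long getOperationCount();
        double getAverageMicrosInCassandra();
    }

    public class RequestTiming implements RequestTimingMBean {
        private final AtomicLong totalNanos = new AtomicLong();
        private final AtomicLong count = new AtomicLong();

        // called with System.nanoTime() deltas from inside the server's request path
        public void record(long elapsedNanos) {
            totalNanos.addAndGet(elapsedNanos);
            count.incrementAndGet();
        }

        public long getOperationCount() {
            return count.get();
        }

        // average time spent inside cassandra, in microseconds
        public double getAverageMicrosInCassandra() {
            long c = count.get();
            return c == 0 ? 0.0 : (totalNanos.get() / (double) c) / 1000.0;
        }

        // register so it shows up in jconsole next to the other cassandra beans
        public static RequestTiming register() throws Exception {
            RequestTiming t = new RequestTiming();
            ManagementFactory.getPlatformMBeanServer().registerMBean(
                t, new ObjectName("org.apache.cassandra.db:type=RequestTiming"));
            return t;
        }
    }
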
