We are trying to learn what we can about the performance of Cassandra. I hope to have some results to share publicly in the next couple of weeks.
The 0.4 version seems to have handled the insert load better, but is having trouble with a 50/50 read/write workload. One server again has a busy core with the other 7 cores (and the other servers) idle or near idle. Any ideas?

The problem seems to come when we dial up the request rate made by the client; after a certain point, the achievable throughput slows way down, even lower than what we could have achieved with a lower request rate. (Incidentally, we are reading and writing 10 KB records; does the large data size have any impact?) And using top -H, it looks like it is one of the Java threads that is consistently busy. Maybe it is GC again.

I was hoping to chat with some of you Cassandra folks when we visited FB last week...perhaps we can grab coffee sometime and chat about these issues...

Thanks!

brian
________________________________________
From: Sandeep Tata [[email protected]]
Sent: Wednesday, August 19, 2009 1:29 PM
To: [email protected]
Subject: Re: Anybody experience one Cassandra server locking up?

Brian,

Are you guys planning to run workloads at Yahoo to compare Cassandra and PNUTS? We'd be curious to see what you learn with the 0.4/trunk code.

Sandeep

On Wed, Aug 19, 2009 at 10:20 AM, Brian Frank Cooper<[email protected]> wrote:
> Probably you are right; after Jun's response I looked in the log and saw an
> out of memory exception. I'll try the 0.4 beta...
>
> Thanks!
>
> brian
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:[email protected]]
> Sent: Wednesday, August 19, 2009 9:12 AM
> To: [email protected]
> Subject: Re: Anybody experience one Cassandra server locking up?
>
> sounds like you are exhausting the memory on that instance and it is
> going into "GC swap" trying to free enough to continue. this is very
> easy to do on 0.3 -- try upgrading to the 0.4 beta if you are using
> 0.3.
>
> On Tue, Aug 18, 2009 at 3:36 PM, Brian Frank
> Cooper<[email protected]> wrote:
>> Hi folks,
>>
>> I have been loading a 6-server Cassandra cluster with 1KB records. After a
>> few million inserts, the insert rate drops dramatically. After
>> investigation, one of the Cassandra servers seems to be in a bad state,
>> using 100% of one core on an 8-core machine, and 0% on the other cores.
>> Inserts to this box have completely stopped, and the inserts to the other
>> boxes have slowed way down (more than a factor of 10 slower.) A "kill" or
>> "kill -3" to the bad java process does nothing; I have to use "kill -9" to
>> stop it. Has anybody experienced anything like this?
>>
>> Additional info:
>>
>> The servers are 8 core, 8GB servers. I am running 64 bit java 1.6, and here
>> are the JVM options:
>>
>> # Arguments to pass to the JVM
>> JVM_OPTS=" \
>>         -ea \
>>         -Xdebug \
>>         -Xrunjdwp:transport=dt_socket,server=y,address=8888,suspend=n \
>>         -Xms128M \
>>         -Xmx6G \
>>         -XX:SurvivorRatio=8 \
>>         -XX:TargetSurvivorRatio=90 \
>>         -XX:+AggressiveOpts \
>>         -XX:+UseParNewGC \
>>         -XX:+UseConcMarkSweepGC \
>>         -XX:CMSInitiatingOccupancyFraction=1 \
>>         -XX:+CMSParallelRemarkEnabled \
>>         -XX:+HeapDumpOnOutOfMemoryError \
>>         -Dcom.sun.management.jmxremote.port=8080 \
>>         -Dcom.sun.management.jmxremote.ssl=false \
>>         -Dcom.sun.management.jmxremote.authenticate=false"
>>
>> (standard options from the Cassandra distribution, except for the 6GB of
>> heap space.)
>>
>> Replication factor is 1 (this is just a test, not a production setup) and
>> memtable size is set to 1GB.
>>
>> Thanks.
>>
>> brian
>
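
One way to check whether the busy thread seen in top -H is a GC thread is to match its thread ID against a JVM thread dump. On Linux, top -H reports the thread ID in decimal, while HotSpot thread dumps (for example from jstack) show the same ID in hex as nid=0x.... The following is only a minimal sketch under that assumption; the function names, the sample dump, and the thread IDs in it are made up for illustration and are not taken from the cluster discussed in this thread.

```python
import re


def find_thread_by_tid(thread_dump, tid):
    """Return the dump header line whose nid matches a decimal thread ID, or None."""
    # top -H prints the thread ID in decimal; the dump prints it as nid=0x<hex>.
    wanted = format(tid, "x")
    for line in thread_dump.splitlines():
        match = re.search(r"\bnid=0x([0-9a-fA-F]+)\b", line)
        if match and match.group(1).lower() == wanted:
            return line.strip()
    return None


def thread_name(header):
    """Extract the quoted thread name from a dump header line."""
    match = re.match(r'"([^"]*)"', header)
    return match.group(1) if match else ""


# Hypothetical, hand-written dump excerpt (not real output from this cluster).
SAMPLE_DUMP = '''\
"main" prio=10 tid=0x0000000040115000 nid=0x4e21 waiting on condition [0x0000000041a2b000]
"Concurrent Mark-Sweep GC Thread" prio=10 tid=0x00000000401a3000 nid=0x4e25 runnable
"Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x0000000040120800 nid=0x4e22 runnable
'''
```

If the busy thread ID from top -H were, say, 20005, then thread_name(find_thread_by_tid(SAMPLE_DUMP, 20005)) would name the matching thread in the sample dump. A name that mentions GC would support the GC theory; an application thread name would point elsewhere.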
