RE: How know node is fully up?
when i see this message it seems like it is still not responding to read requests, so IMO it isn't fully up and operational. i assume it is still handing off or doing some other sync operation. i think a JMX parameter should be exposed that contains the state of a node.

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Tuesday, December 22, 2009 8:50 PM
To: cassandra-user@incubator.apache.org
Subject: Re: How know node is fully up?

It's up when it logs "Cassandra starting up..." and starts listening for thrift connections.

On Tue, Dec 22, 2009 at 10:16 PM, Brian Burruss bburr...@real.com wrote:

I never heard from anyone about this. I think it is important for bringing nodes out of service during upgrades so no data loss occurs. Also when introducing a new node you need to know when it is fully populated.

Brian Burruss bburr...@real.com wrote:

How can i tell that a node is completely up and taking reads and writes?
- at startup?
- after new bootstrap?
- after a node has been unavailable for some time and rejoins the cluster?

i see the "INFO [main] [CassandraDaemon.java:141] Cassandra starting up..." message in the log, but it seems to have happened way too fast after i simulated a crash. using tpstats i don't see any ROW-READ-STAGE completed, but lots of ROW-MUTATION-STAGE completed, which seems to be correct for a node that is still sync'ing with the cluster after being unavailable. .. but how do i know ;) thx!
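The "JMX parameter" Brian asks for is easy to sketch with the platform MBeanServer. This is a hypothetical bean for illustration only - the names NodeState and OperationMode are invented here and are not attributes Cassandra 0.5 actually exposes:

```java
import javax.management.MBeanServer;
import javax.management.ObjectName;
import java.lang.management.ManagementFactory;

// Hypothetical sketch of a JMX attribute exposing a node's operational
// state, so monitoring clients can poll it instead of grepping the log.
// Bean and attribute names are invented for illustration.
public class NodeStateExample {
    public interface NodeStateMBean {
        String getOperationMode();   // e.g. STARTING, HANDING_OFF, NORMAL
    }

    public static class NodeState implements NodeStateMBean {
        private volatile String mode = "STARTING";
        public String getOperationMode() { return mode; }
        public void setMode(String m) { mode = m; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.cassandra:type=NodeState");
        NodeState state = new NodeState();
        server.registerMBean(state, name);

        state.setMode("NORMAL");
        // a monitoring client would read the attribute over a JMX connection
        String mode = (String) server.getAttribute(name, "OperationMode");
        System.out.println(mode);
    }
}
```

A monitoring script could then poll such an attribute over a JMX connector rather than watching for the "Cassandra starting up..." log line.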
RE: MultiThread Client problem with thrift
i don't close the connection to a server unless i get exceptions. and when i close the connection i try a new server in the cluster, just to keep the connections spread across the cluster. should i be closing them? if the connection is closed by client or server i'll just reconnect.

From: Ran Tavory [ran...@gmail.com]
Sent: Tuesday, December 22, 2009 9:50 AM
To: cassandra-user@incubator.apache.org
Subject: Re: MultiThread Client problem with thrift

I don't know how keeping the connections open affects at scale. I suppose if you have a 10 to 1 ratio of cassandra clients to cassandra servers (probably a typical ratio) then you may be using too much server resources.

On Tue, Dec 22, 2009 at 4:46 PM, matthew hawthorne mhawtho...@gmail.com wrote:

On Tue, Dec 22, 2009 at 9:10 AM, Ran Tavory ran...@gmail.com wrote:

Not an expert in this field, but I think what you want is to use a connection pool and NOT close the connections - reuse them. Only idle connections are released after, say, 1 sec. Also, with a connection pool it's easy to throttle the application: you can tell the pool to block once all 50 connections (or however many you allow) are in use.

I did something very similar to this. A difference in my approach is that I did not release idle connections after a specific time period; instead I performed a liveness check on each connection after obtaining it from the pool, like this:

// get client connection from pool
Cassandra.Client client = ...

try {
    client.getInputProtocol().getTransport().flush();
} catch (TTransportException e) {
    // connection is invalid, obtain new connection
}

It seemed to work during my testing. not sure if the thrift specifics are 100% correct (meaning I'm not sure if the catch block will work for all situations involving stale or expired connections). -matt

On Tue, Dec 22, 2009 at 4:01 PM, Richard Grossman richie...@gmail.com wrote:

So I can't use it.
But I've made my own connection pool. It doesn't fix anything, because the problem is lower than Java: the socket is closed and Java considers it closed, but the system keeps the socket in the TIME_WAIT state, so the port used is actually still in use. So my question is: has anyone managed to open many connections and get rid of the TIME_WAIT sockets? No matter in which language - PHP or Python etc... Thanks

On Tue, Dec 22, 2009 at 2:55 PM, Ran Tavory ran...@gmail.com wrote:

I don't have a 0.5.0-beta2 version, no. It's not too difficult to add it, but I haven't done so myself, I'm using 0.4.2.

On Tue, Dec 22, 2009 at 2:42 PM, Richard Grossman richie...@gmail.com wrote:

Yes of course, but have you updated to cassandra 0.5.0-beta2?

On Tue, Dec 22, 2009 at 2:30 PM, Ran Tavory ran...@gmail.com wrote:

Would connection pooling work for you? This Java client http://code.google.com/p/cassandra-java-client/ has connection pooling. I haven't put the client under stress yet so I can't testify, but this may be a good solution for you.

On Tue, Dec 22, 2009 at 2:22 PM, Richard Grossman richie...@gmail.com wrote:

I agree it would solve my problem, but it could create a bigger one. The problem is I can't manage to prevent opening a lot of connections.

On Tue, Dec 22, 2009 at 1:51 PM, Jaakko rosvopaalli...@gmail.com wrote:

Hi, I don't know the particulars of the java implementation, but if it works the same way as the Unix native socket API, then I would not recommend setting linger to zero. The SO_LINGER option with a zero value will cause the TCP connection to be aborted immediately as soon as the socket is closed. That is: (1) remaining data in the send buffer will be discarded, (2) there is no proper disconnect handshake, and (3) the receiving end will get a TCP reset.
Sure this will avoid the TIME_WAIT state, but TIME_WAIT is our friend: it is there to prevent packets from an old connection being delivered to a new incarnation of the connection. Instead of avoiding the state, the application should be changed so that TIME_WAIT will not be a problem. How many open files do you see when the exception happens? Might be that you're out of file descriptors. -Jaakko

On Tue, Dec 22, 2009 at 8:17 PM, Richard Grossman richie...@gmail.com wrote:

Hi, to all who are interested: I've found a solution that seems not recommended but is working. When opening a socket, set this:

tSocket.getSocket().setReuseAddress(true);
tSocket.getSocket().setSoLinger(true, 0);

It prevents having a lot of connections in the TIME_WAIT state, but it's not recommended.
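For reference, the two calls in Richard's workaround are plain java.net.Socket options (Thrift's TSocket just wraps one), and their effect can be inspected before the socket ever connects. A minimal, self-contained sketch:

```java
import java.net.Socket;

// Demonstrates the two socket options from Richard's workaround on a plain
// java.net.Socket. setSoLinger(true, 0) makes close() send an RST instead
// of completing the FIN handshake, which skips TIME_WAIT but discards any
// unsent data -- exactly the trade-off Jaakko warns about.
public class LingerExample {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket();      // not yet connected
        socket.setReuseAddress(true);      // allow rebinding a local port in TIME_WAIT
        socket.setSoLinger(true, 0);       // linger enabled, timeout 0 => abortive close

        System.out.println("reuseAddress=" + socket.getReuseAddress());
        // getSoLinger() returns the linger value when enabled, -1 when disabled
        System.out.println("soLinger=" + socket.getSoLinger());
        socket.close();
    }
}
```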
Re: How know node is fully up?
I never heard from anyone about this. I think it is important for bringing nodes out of service during upgrades so no data loss occurs. Also when introducing a new node you need to know when it is fully populated.

Brian Burruss bburr...@real.com wrote:

How can i tell that a node is completely up and taking reads and writes?
- at startup?
- after new bootstrap?
- after a node has been unavailable for some time and rejoins the cluster?

i see the "INFO [main] [CassandraDaemon.java:141] Cassandra starting up..." message in the log, but it seems to have happened way too fast after i simulated a crash. using tpstats i don't see any ROW-READ-STAGE completed, but lots of ROW-MUTATION-STAGE completed, which seems to be correct for a node that is still sync'ing with the cluster after being unavailable. .. but how do i know ;) thx!
How know node is fully up?
How can i tell that a node is completely up and taking reads and writes?
- at startup?
- after new bootstrap?
- after a node has been unavailable for some time and rejoins the cluster?

i see the "INFO [main] [CassandraDaemon.java:141] Cassandra starting up..." message in the log, but it seems to have happened way too fast after i simulated a crash. using tpstats i don't see any ROW-READ-STAGE completed, but lots of ROW-MUTATION-STAGE completed, which seems to be correct for a node that is still sync'ing with the cluster after being unavailable. .. but how do i know ;) thx!
another OOM
this time i simulated node 1 crashing, waited a few minutes, then restarted it. after a while node 2 OOM'ed. same 2 node cluster with RF=2, W=1, R=1. i up'ed the RAM to 6G this time.

cluster contains ~126,281,657 data elements containing about 298G on one node's disk

thx!

[attachment: system-errors.log]
RE: hard disk size
it does help after a compaction, but it seems to compact quite a bit under load, so the disk is always very active, keeping read performance lower than desirable. after i get the cluster restarted because of the OOM i can give some numbers.

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Friday, December 18, 2009 2:04 PM
To: cassandra-user@incubator.apache.org
Subject: Re: hard disk size

Does nodeprobe compact help w/ read latency?

On Fri, Dec 18, 2009 at 12:38 PM, Brian Burruss bburr...@real.com wrote:

what are other folks' database size (on disk) per node? or what are you expecting to grow to per node? trying to size out a cluster and wondering what is a practical limit for cassandra. the read latency is growing as the dataset gets larger, and it seems like if i got to a terabyte on one server the latency would possibly be too high. thoughts?
RE: another OOM
i am simulating load by using two virtual machines (on separate boxes from the servers), each running an app that spawns 12 threads: 6 threads doing reads and 6 threads doing writes. so i have a total of 12 read threads and 12 write threads. between each thread's operation it waits 10ms. the write threads are writing a 2k block of data, and the read threads are reading what is written, so every read should return data. right now i'm seeing about 800 ops/sec total throughput for all clients/servers. if i take the 10ms delay out, of course it will go faster, but it seems to burden cassandra too much.

we are trying to prove that cassandra can run and sustain load. we are planning a 10TB system that needs to handle about 10k ops/sec.

for my tests i have two machines for servers, each with 16G RAM, a 600G 10k SCSI drive, and 2x 2-core CPUs (total 4 cores per machine). starting the JVM with -Xmx6G. the network is 100Mbits. (this is not how the cluster would look in prod, but it's all the hardware i have until first of 2010.) cluster contains ~126,281,657 data elements using about 298G on one node's disk. i don't have the commitlog on a separate drive yet.

during normal operation, i see the following:
- memory is staying fairly low for the size of data, low enough that i didn't monitor it, but i believe it was less than 3G.
- global read latency creeps up slightly as reported by StorageProxy.
- round trip time on the wire as reported by my client creeps up at a steeper slope than global read latency. so there is a discrepancy somewhere with the stats - i have added another JMX data point to cassandra to measure the overall time spent in cassandra, but i have to get the servers started again to see what it reports ;)

using node 1 and node 2, simulating a crash of node 1 using kill -9:
- node 1 was OOM'ing when trying to restart after a crash, but this seems fixed. it is staying cool and quiet.
- node 2 is now OOM'ing during restart of node 1. memory steadily grows. last thing i see in the log is "Starting up server gossip" until OOM.

what bothers me the most is not that i'm getting an OOM, but that i can't predict when i'll get it. the fact that restarting a failed node requires more than double the normal operating RAM is a bit of a worry. not sure what else to tell you at the moment. lemme know what i can provide so we can figure this out. thx!

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Friday, December 18, 2009 3:49 PM
To: cassandra-user@incubator.apache.org
Subject: Re: another OOM

It sounds like you're simply throwing too much load at Cassandra. Adding more machines can help. Look at http://wiki.apache.org/cassandra/Operations for how to track metrics that will tell you how much is too much. Telling us more about your workload would be useful in sanity checking that hypothesis. :) -Jonathan

On Fri, Dec 18, 2009 at 4:34 PM, Brian Burruss bburr...@real.com wrote:

this time i simulated node 1 crashing, waited a few minutes, then restarted it. after a while node 2 OOM'ed. same 2 node cluster with RF=2, W=1, R=1. i up'ed the RAM to 6G this time. cluster contains ~126,281,657 data elements containing about 298G on one node's disk. thx!
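The client structure described above (paired read/write threads with a 10ms pause per operation, 2k blocks keyed by random integers) can be sketched in miniature. This uses a ConcurrentHashMap as a stand-in for the Cassandra/Thrift client, so it demonstrates only the harness shape, not real throughput:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Miniature version of the load test described above: paired reader/writer
// threads with a 10ms pause between operations, writing 2k blocks keyed by
// random integers. A ConcurrentHashMap stands in for the Cassandra client.
public class LoadHarness {
    public static void main(String[] args) throws Exception {
        Map<Integer, byte[]> store = new ConcurrentHashMap<>();
        AtomicLong ops = new AtomicLong();
        int threadsPerRole = 2;            // Brian used 6 readers + 6 writers per client VM
        long runMillis = 300;              // short run just for demonstration

        ExecutorService pool = Executors.newFixedThreadPool(threadsPerRole * 2);
        long deadline = System.currentTimeMillis() + runMillis;
        for (int i = 0; i < threadsPerRole; i++) {
            pool.submit(() -> {            // writer thread
                byte[] block = new byte[2048];
                while (System.currentTimeMillis() < deadline) {
                    store.put(ThreadLocalRandom.current().nextInt(1000), block);
                    ops.incrementAndGet();
                    try { Thread.sleep(10); } catch (InterruptedException e) { return; }
                }
            });
            pool.submit(() -> {            // reader thread
                while (System.currentTimeMillis() < deadline) {
                    store.get(ThreadLocalRandom.current().nextInt(1000));
                    ops.incrementAndGet();
                    try { Thread.sleep(10); } catch (InterruptedException e) { return; }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("total ops: " + ops.get());
    }
}
```

With the 10ms pause each thread tops out near 100 ops/sec, which is why removing the delay multiplies the offered load so sharply.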
RE: OOM Exception
sorry, thought i included everything ;) however, i am using beta2.

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Wednesday, December 16, 2009 3:18 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

What version are you using? 0.5 beta2 fixes the using-more-memory-on-startup problem.

On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss bburr...@real.com wrote:

i'll put my question first:
- how can i determine how much RAM is required by cassandra? (for normal operation and restarting server)

*** i've attached my storage-conf.xml

i've gotten several more OOM exceptions since i mentioned it a week or so ago. i started from a fresh database a couple days ago and have been adding 2k blocks of data keyed off a random integer at the rate of about 400/sec. i have a 2 node cluster, RF=2, consistency for read/write is ONE. there are ~70,420,082 2k blocks of data in the database.

i used the default memory setup of -Xmx1G when i started a couple days ago. as the database grew to ~180G (reported by the unix du command) both servers OOM'ed at about the same time, within 10 minutes of each other. well, needless to say, my cluster is dead. so i upped the memory to 3G and the servers tried to come back up, but one died again with OOM. before cleaning the disk and starting over a couple days ago, i played the game of "jack up the RAM", but eventually i didn't want to up it anymore when i got to 5G.

the parameter SSTable.INDEX_INTERVAL was discussed a few days ago; changing it would reduce the number of keys cached in memory, so i could modify that at the cost of read performance, but doing the math, 3G should be plenty of room. it seems like startup requires more RAM than just normal running, so this of course concerns me.

i have the hprof files from when the server initially crashed and when it crashed trying to restart if anyone wants them.
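As a rough version of "doing the math" above: Cassandra keeps one sampled key per SSTable.INDEX_INTERVAL in memory. A back-of-the-envelope sketch, where the 96-byte per-entry cost is an assumption (the real figure depends on key length and JVM object overhead), not a measured value:

```java
// Back-of-the-envelope estimate of the in-memory index sample for the
// dataset described above. The per-entry cost is an assumed number
// (key bytes + file position + object overhead), not a measured one.
public class IndexMemoryEstimate {
    public static void main(String[] args) {
        long keys = 70_420_082L;     // ~70M 2k blocks in the database
        int indexInterval = 128;     // SSTable.INDEX_INTERVAL default of that era
        long bytesPerEntry = 96;     // assumed overhead per sampled key

        long sampled = keys / indexInterval;
        long bytes = sampled * bytesPerEntry;
        System.out.println("sampled keys: " + sampled);
        System.out.printf("approx index memory: %.1f MB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Even with generous per-entry assumptions this lands in the tens of megabytes, far below 3G, which is consistent with Brian's point that steady-state index sampling alone shouldn't explain the OOMs.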
RE: OOM Exception
attached ... the log starts when i restarted the server. notice that not too far into it is when the other node went down because of OOM and i restarted it as well.

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Wednesday, December 16, 2009 4:53 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

sorry, i meant the system.log the 2nd time (clear it out before replaying so it's not confused w/ other info, pls)

On Wed, Dec 16, 2009 at 5:39 PM, Brian Burruss bburr...@real.com wrote:

is this what you want? they are big - i'd rather not spam everyone with them. if you need them or the hprof files i can tar them and send them to you. thx!

[bburr...@gen-app02 cassandra]$ ls -l ~/cassandra/btoddb/commitlog/
total 597228
-rw-rw-r-- 1 bburruss bburruss 134219796 Dec 16 13:52 CommitLog-1260995895123.log
-rw-rw-r-- 1 bburruss bburruss 134218547 Dec 16 13:52 CommitLog-1260997811317.log
-rw-rw-r-- 1 bburruss bburruss 134218331 Dec 16 13:52 CommitLog-1260998497744.log
-rw-rw-r-- 1 bburruss bburruss 134219677 Dec 16 13:53 CommitLog-1261000330587.log
-rw-rw-r-- 1 bburruss bburruss  74055680 Dec 16 14:49 CommitLog-1261000439079.log
[bburr...@gen-app02 cassandra]$

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Wednesday, December 16, 2009 3:29 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

How large are the log files being replayed? Can you attach the log from a replay attempt?

On Wed, Dec 16, 2009 at 5:21 PM, Brian Burruss bburr...@real.com wrote:

sorry, thought i included everything ;) however, i am using beta2.

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Wednesday, December 16, 2009 3:18 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

What version are you using? 0.5 beta2 fixes the using-more-memory-on-startup problem.

On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss bburr...@real.com wrote:

i'll put my question first:
- how can i determine how much RAM is required by cassandra? (for normal operation and restarting server)

*** i've attached my storage-conf.xml

i've gotten several more OOM exceptions since i mentioned it a week or so ago. i started from a fresh database a couple days ago and have been adding 2k blocks of data keyed off a random integer at the rate of about 400/sec. i have a 2 node cluster, RF=2, consistency for read/write is ONE. there are ~70,420,082 2k blocks of data in the database. i used the default memory setup of -Xmx1G when i started a couple days ago. as the database grew to ~180G (reported by the unix du command) both servers OOM'ed at about the same time, within 10 minutes of each other. well, needless to say, my cluster is dead. so i upped the memory to 3G and the servers tried to come back up, but one died again with OOM. before cleaning the disk and starting over a couple days ago, i played the game of "jack up the RAM", but eventually i didn't want to up it anymore when i got to 5G. the parameter SSTable.INDEX_INTERVAL was discussed a few days ago; changing it would reduce the number of keys cached in memory, so i could modify that at the cost of read performance, but doing the math, 3G should be plenty of room. it seems like startup requires more RAM than just normal running, so this of course concerns me. i have the hprof files from when the server initially crashed and when it crashed trying to restart if anyone wants them.

[attachment: system.log.tar.gz]
create only - no update
can the cassandra client (java specifically) specify that a particular put should be create only, do not update? if the value already exists in the database, i want the put to fail. for instance, two users want the exact same username, so they both do a get to determine if the username already exists; it doesn't, so they both create. the last one to create wins, correct? thx
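The race described here is real: both clients can do the "does it exist" get, see nothing, and then write, and the write with the later timestamp wins. The thrift API at this point has no conditional put, so the check-then-act has to be made atomic somewhere outside Cassandra. A sketch of the racy pattern versus the atomic primitive you'd want, using ConcurrentMap.putIfAbsent as an in-process analogy:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrates why "get to check, then put" is racy, and what a create-only
// put would look like. ConcurrentMap.putIfAbsent is the in-process analogue
// of the conditional insert the thrift API lacks; against Cassandra itself
// this check-then-act must be serialized somewhere external.
public class CreateOnlyExample {
    public static void main(String[] args) {
        ConcurrentMap<String, String> users = new ConcurrentHashMap<>();

        // racy pattern: two clients can both observe "absent" before either writes
        if (!users.containsKey("alice")) {
            users.put("alice", "user-1");   // a concurrent client may overwrite this
        }

        // atomic pattern: returns null only for the single caller that created it
        String prev = users.putIfAbsent("alice", "user-2");
        System.out.println("create won: " + (prev == null));
        System.out.println("value: " + users.get("alice"));
    }
}
```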
RE: read latency creeping up
thx, i'm actually the B. Todd Burruss in that thread .. we changed our email system and well now, i'm just Brian .. long story. anyway, in this case it isn't compaction pendings, as i can kill the clients and immediately restart and the latency is back to a reasonable number. i'm still investigating. thx!

From: Eric Evans [eev...@rackspace.com]
Sent: Monday, December 14, 2009 8:23 AM
To: cassandra-user@incubator.apache.org
Subject: RE: read latency creeping up

On Sun, 2009-12-13 at 13:18 -0800, Brian Burruss wrote:

if this isn't a known issue, lemme do some more investigating. my test client becomes more random with reads as time progresses, so possibly this is what causes the latency issue. however, all that being said, the performance really becomes bad after a while.

Have a look at the following thread: http://thread.gmane.org/gmane.comp.db.cassandra.user/1402

-- Eric Evans eev...@rackspace.com
Re: read latency creeping up
Well, not sure how that would affect the latency as reported by the Cassandra server using nodeprobe cfstats.

Jonathan Ellis jbel...@gmail.com wrote:

possibly the clients are running into memory pressure?

On Mon, Dec 14, 2009 at 4:27 PM, Brian Burruss bburr...@real.com wrote:

thx, i'm actually the B. Todd Burruss in that thread .. we changed our email system and well now, i'm just Brian .. long story. anyway, in this case it isn't compaction pendings, as i can kill the clients and immediately restart and the latency is back to a reasonable number. i'm still investigating. thx!

From: Eric Evans [eev...@rackspace.com]
Sent: Monday, December 14, 2009 8:23 AM
To: cassandra-user@incubator.apache.org
Subject: RE: read latency creeping up

On Sun, 2009-12-13 at 13:18 -0800, Brian Burruss wrote:

if this isn't a known issue, lemme do some more investigating. my test client becomes more random with reads as time progresses, so possibly this is what causes the latency issue. however, all that being said, the performance really becomes bad after a while.

Have a look at the following thread: http://thread.gmane.org/gmane.comp.db.cassandra.user/1402

-- Eric Evans eev...@rackspace.com
RE: read latency creeping up
i agree. i don't know anything about thrift, and i don't know how it keeps connections open or manages resources from a client or server perspective, but this situation suggests that maybe killing the clients is forcing the server to free something. how's that sound :)

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Monday, December 14, 2009 3:12 PM
To: cassandra-user@incubator.apache.org
Subject: Re: read latency creeping up

hmm, me neither. but, I can't think how restarting the client would, either :)

On Mon, Dec 14, 2009 at 4:59 PM, Brian Burruss bburr...@real.com wrote:

Well, not sure how that would affect the latency as reported by the Cassandra server using nodeprobe cfstats.

Jonathan Ellis jbel...@gmail.com wrote:

possibly the clients are running into memory pressure?

On Mon, Dec 14, 2009 at 4:27 PM, Brian Burruss bburr...@real.com wrote:

thx, i'm actually the B. Todd Burruss in that thread .. we changed our email system and well now, i'm just Brian .. long story. anyway, in this case it isn't compaction pendings, as i can kill the clients and immediately restart and the latency is back to a reasonable number. i'm still investigating. thx!

From: Eric Evans [eev...@rackspace.com]
Sent: Monday, December 14, 2009 8:23 AM
To: cassandra-user@incubator.apache.org
Subject: RE: read latency creeping up

On Sun, 2009-12-13 at 13:18 -0800, Brian Burruss wrote:

if this isn't a known issue, lemme do some more investigating. my test client becomes more random with reads as time progresses, so possibly this is what causes the latency issue. however, all that being said, the performance really becomes bad after a while.

Have a look at the following thread: http://thread.gmane.org/gmane.comp.db.cassandra.user/1402

-- Eric Evans eev...@rackspace.com
Re: read latency creeping up
I plan to investigate that next. There isn't a straightforward disconnect that I could easily see.

Ian Holsman i...@holsman.net wrote:

can you make it so that the client restarts the connection every 30m or so? It could be an issue in thrift or something with long-lived connections.

On Dec 15, 2009, at 10:16 AM, Brian Burruss wrote:

i agree. i don't know anything about thrift, and i don't know how it keeps connections open or manages resources from a client or server perspective, but this situation suggests that maybe killing the clients is forcing the server to free something. how's that sound :)

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Monday, December 14, 2009 3:12 PM
To: cassandra-user@incubator.apache.org
Subject: Re: read latency creeping up

hmm, me neither. but, I can't think how restarting the client would, either :)

On Mon, Dec 14, 2009 at 4:59 PM, Brian Burruss bburr...@real.com wrote:

Well, not sure how that would affect the latency as reported by the Cassandra server using nodeprobe cfstats.

Jonathan Ellis jbel...@gmail.com wrote:

possibly the clients are running into memory pressure?

On Mon, Dec 14, 2009 at 4:27 PM, Brian Burruss bburr...@real.com wrote:

thx, i'm actually the B. Todd Burruss in that thread .. we changed our email system and well now, i'm just Brian .. long story. anyway, in this case it isn't compaction pendings, as i can kill the clients and immediately restart and the latency is back to a reasonable number. i'm still investigating. thx!

From: Eric Evans [eev...@rackspace.com]
Sent: Monday, December 14, 2009 8:23 AM
To: cassandra-user@incubator.apache.org
Subject: RE: read latency creeping up

On Sun, 2009-12-13 at 13:18 -0800, Brian Burruss wrote:

if this isn't a known issue, lemme do some more investigating. my test client becomes more random with reads as time progresses, so possibly this is what causes the latency issue. however, all that being said, the performance really becomes bad after a while.

Have a look at the following thread: http://thread.gmane.org/gmane.comp.db.cassandra.user/1402

-- Eric Evans eev...@rackspace.com

-- Ian Holsman i...@holsman.net
RE: OOM Exception
another OOM exception. the only thing interesting about my testing is that there are 2 servers, RF=2, W=1, R=1 ... there is 248G of data on each server. I have -Xmx3G assigned to each server.

2009-12-12 22:04:37,436 ERROR [pool-1-thread-309] [Cassandra.java:734] Internal error processing get
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.service.StorageProxy.weakReadLocal(StorageProxy.java:523)
        at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:373)
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
        at org.apache.cassandra.service.CassandraServer.multigetColumns(CassandraServer.java:265)
        at org.apache.cassandra.service.CassandraServer.multigetInternal(CassandraServer.java:320)
        at org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:253)
        at org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:724)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:712)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

From: Brian Burruss
Sent: Saturday, December 12, 2009 7:45 AM
To: cassandra-user@incubator.apache.org
Subject: OOM Exception

this happened after cassandra was running for a couple of days. I have -Xmx3G on the JVM. is there any other info you need so this makes sense? thx!

2009-12-11 21:38:37,216 ERROR [HINTED-HANDOFF-POOL:1] [DebuggableThreadPoolExecutor.java:157] Error in ThreadPoolExecutor
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:151)
        at org.apache.cassandra.io.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:144)
        at org.apache.cassandra.io.SSTableWriter.<init>(SSTableWriter.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:911)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:855)
        at org.apache.cassandra.db.ColumnFamilyStore.doMajorCompactionInternal(ColumnFamilyStore.java:698)
        at org.apache.cassandra.db.ColumnFamilyStore.doMajorCompaction(ColumnFamilyStore.java:670)
        at org.apache.cassandra.db.HintedHandOffManager.deliverAllHints(HintedHandOffManager.java:190)
        at org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:75)
        at org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:249)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
RE: OOM Exception
i don't have a problem increasing memory, just need a rough formula for how much to expect to use. anyone recommend a good heap walker? i can then tell you what is taking the memory. thx!

Node 1
--
Pool Name                  Active  Pending  Completed
FILEUTILS-DELETE-POOL           0        0        158
MESSAGING-SERVICE-POOL          0        0    7244758
STREAM-STAGE                    0        0          0
RESPONSE-STAGE                  0        0          0
ROW-READ-STAGE                  1        0    1320344
LB-OPERATIONS                   0        0          0
COMMITLOG                       1        0   19969914
GMFD                            0        0      47153
MESSAGE-DESERIALIZER-POOL       0        0   21229742
LB-TARGET                       0        0          0
CONSISTENCY-MANAGER             0        0          0
ROW-MUTATION-STAGE              0        0   20599671
MESSAGE-STREAMING-POOL          0        0          0
LOAD-BALANCER-STAGE             0        0          0
FLUSH-SORTER-POOL               0        0         39
MEMTABLE-POST-FLUSHER           0        0         39
COMPACTION-POOL                 0        0         69
FLUSH-WRITER-POOL               0        0         39
HINTED-HANDOFF-POOL             0        0         35

Node 2
--
Pool Name                  Active  Pending  Completed
FILEUTILS-DELETE-POOL           0        0          4
MESSAGING-SERVICE-POOL          1        0    1762737
STREAM-STAGE                    0        0          0
RESPONSE-STAGE                  0        0   21255031
ROW-READ-STAGE                  2        2    1321911
LB-OPERATIONS                   0        0          0
COMMITLOG                       1        0   19949350
GMFD                            0        0      43647
MESSAGE-DESERIALIZER-POOL       0        3   21254776
LB-TARGET                       0        0          0
CONSISTENCY-MANAGER             0        0    1321501
ROW-MUTATION-STAGE              5        6   20487288
MESSAGE-STREAMING-POOL          0        0          0
LOAD-BALANCER-STAGE             0        0          0
FLUSH-SORTER-POOL               0        0         41
MEMTABLE-POST-FLUSHER           0        0         41
COMPACTION-POOL                 1        4         48
FLUSH-WRITER-POOL               0        0         41
HINTED-HANDOFF-POOL             1        9          0

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Sunday, December 13, 2009 12:47 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

On Sun, Dec 13, 2009 at 2:20 PM, gabriele renzi rff@gmail.com wrote:

Is there a reason for having this as a hardcoded value instead of a configuration value?

Just that nobody has needed it to be configurable, yet.
RE: read latency creeping up
if this isn't a known issue, lemme do some more investigating. my test client becomes more random with reads as time progresses, so possibly this is what causes the latency issue. however, all that being said, the performance really becomes bad after a while.

From: Brian Burruss
Sent: Sunday, December 13, 2009 1:14 PM
To: cassandra-user@incubator.apache.org
Subject: read latency creeping up

i've noticed the longer i let my test clients run, the higher the read latency becomes. if i kill the clients, the latency drops back down to a reasonable value. write latency isn't affected. here are two cfstats listings from the same machine, the first before and the second after i killed the clients. i do not touch the servers. this seems odd. if the clients are doing something bad, it seems that the server's latency wouldn't be affected. is this a thrift issue?

before
-
Keyspace: uds
        Read Count: 2003
        Read Latency: 59.807 ms.
        Write Count: 65145
        Write Latency: 0.047 ms.
        Pending Tasks: 0
                Column Family: bucket
                Memtable Columns Count: 168751
                Memtable Data Size: 351695390
                Memtable Switch Count: 94
                Read Count: 2003
                Read Latency: 59.807 ms.
                Write Count: 65176
                Write Latency: 0.047 ms.
                Pending Tasks: 0

after
---
Keyspace: uds
        Read Count: 53249
        Read Latency: 3.653 ms.
        Write Count: 52888
        Write Latency: 0.035 ms.
        Pending Tasks: 0
                Column Family: bucket
                Memtable Columns Count: 383297
                Memtable Data Size: 798864312
                Memtable Switch Count: 94
                Read Count: 53271
                Read Latency: 3.649 ms.
                Write Count: 52937
                Write Latency: 0.035 ms.
                Pending Tasks: 0
OOM Exception
this happened after cassandra was running for a couple of days. I have -Xmx3G on the JVM. is there any other info you need so this makes sense? thx!

2009-12-11 21:38:37,216 ERROR [HINTED-HANDOFF-POOL:1] [DebuggableThreadPoolExecutor.java:157] Error in ThreadPoolExecutor
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:151)
        at org.apache.cassandra.io.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:144)
        at org.apache.cassandra.io.SSTableWriter.<init>(SSTableWriter.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:911)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:855)
        at org.apache.cassandra.db.ColumnFamilyStore.doMajorCompactionInternal(ColumnFamilyStore.java:698)
        at org.apache.cassandra.db.ColumnFamilyStore.doMajorCompaction(ColumnFamilyStore.java:670)
        at org.apache.cassandra.db.HintedHandOffManager.deliverAllHints(HintedHandOffManager.java:190)
        at org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:75)
        at org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:249)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
RE: exception during startup
i applied both patches and restarted cassandra. all seems well. memory stayed lower, but i'm not sure if this is because cassandra didn't have as many commit logs to replay or not, as the data has changed since my initial exception. thx

From: Jonathan Ellis [jbel...@gmail.com]
Sent: Monday, December 07, 2009 6:34 PM
To: cassandra-user@incubator.apache.org
Subject: Re: exception during startup

Patch attached to the jira issue. Please give it a try if you are running trunk or the 0.5 beta. If this is data you care about, you should make a copy of the commitlog files, just in case. :)

On Mon, Dec 7, 2009 at 8:11 PM, Jonathan Ellis jbel...@gmail.com wrote:

on log replay, cassandra cheats and puts all the changes into memory before writing to disk, bypassing the normal "is this memtable full yet" checks. this is an optimization, but IMO it's misguided because it can lead to OOM on replay when you wouldn't OOM for the same set of changes during normal operation. I've created https://issues.apache.org/jira/browse/CASSANDRA-609 to fix this, but the fix may be a little involved, so if you can temporarily give Cassandra more memory to finish the replay, that is the easiest workaround, and you can set it back the way it was once replay completes successfully.

On Mon, Dec 7, 2009 at 6:59 PM, Brian Burruss bburr...@real.com wrote:

wanted to pass this along ... i have 2G RAM allocated to cassandra. should it need more? what are the factors that determine the amount of memory required? thx!

2009-12-07 16:56:30,787 ERROR [main] [CassandraDaemon.java:184] Exception encountered during startup.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.db.CommitLog.recover(CommitLog.java:317)
        at org.apache.cassandra.db.RecoveryManager.doRecovery(RecoveryManager.java:65)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:90)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:166)
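The failure mode Jonathan describes - replay accumulating every change in memory with no memtable-full check - can be sketched abstractly. This is an illustration of the bounded-replay idea behind CASSANDRA-609, not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of bounded commit-log replay: instead of accumulating every
// replayed mutation in memory (the behavior that can OOM on startup), flush
// whenever the in-memory batch crosses a threshold, just like the
// memtable-full check that applies during normal writes.
public class BoundedReplayExample {
    static final int FLUSH_THRESHOLD_BYTES = 4096;   // tiny, for demonstration

    public static void main(String[] args) {
        List<byte[]> memtable = new ArrayList<>();
        int memtableBytes = 0, flushes = 0, maxObserved = 0;

        for (int i = 0; i < 100; i++) {              // 100 replayed mutations
            byte[] mutation = new byte[512];
            memtable.add(mutation);
            memtableBytes += mutation.length;
            maxObserved = Math.max(maxObserved, memtableBytes);

            if (memtableBytes >= FLUSH_THRESHOLD_BYTES) {  // the check replay skips
                memtable.clear();                    // stand-in for flushing to an SSTable
                memtableBytes = 0;
                flushes++;
            }
        }
        System.out.println("flushes: " + flushes);
        System.out.println("max memtable bytes: " + maxObserved);
    }
}
```

With the check in place, memory use during replay stays bounded by the threshold regardless of how many commit logs there are to replay; without it, the peak grows with the total replayed data, which matches the symptom above.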
exception during startup
wanted to pass this along ... i have 2G RAM allocated to cassandra. should it need more? what are the factors that determine the amount of memory required? thx!

2009-12-07 16:56:30,787 ERROR [main] [CassandraDaemon.java:184] Exception encountered during startup.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.db.CommitLog.recover(CommitLog.java:317)
        at org.apache.cassandra.db.RecoveryManager.doRecovery(RecoveryManager.java:65)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:90)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:166)