Glad to hear "that" bug is fixed ;) Can the configuration params like memtable size be changed between server starts without clearing the data?
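
For anyone following the sizing question, here is a rough back-of-envelope sketch of the numbers quoted in the thread below. It is only arithmetic, not a model of how Cassandra actually uses its heap. The block count, the 3G heap and the 1.5GB memtable setting come from the thread. The 2048-byte block size, the index sampling interval of 128 and the 100 bytes per sampled key are assumptions made purely for illustration, and the function names are hypothetical.

```python
# Back-of-envelope arithmetic using figures quoted in the thread.
# Values marked ASSUMED do not come from the thread.

BLOCKS = 70_420_082        # "~70,420,082 2k blocks of data"
BLOCK_BYTES = 2 * 1024     # ASSUMED: "2k" means 2048 bytes
HEAP_MB = 3 * 1024         # heap after it was raised to 3G
MEMTABLE_MB = 1536         # the 1.5GB MemtableSizeInMB setting
INDEX_INTERVAL = 128       # ASSUMED: one key sampled per 128 keys
BYTES_PER_SAMPLE = 100     # ASSUMED: hypothetical in-memory cost per sampled key


def raw_data_gib(blocks: int, block_bytes: int) -> float:
    """Raw payload size in GiB, ignoring keys, indexes and other overhead."""
    return blocks * block_bytes / 2**30


def index_sample_mib(keys: int, interval: int, bytes_per_sample: int) -> float:
    """Rough memory for sampled index keys, in MiB, under the assumptions above."""
    return (keys // interval) * bytes_per_sample / 2**20


def memtable_share(memtable_mb: float, heap_mb: float) -> float:
    """Fraction of the heap that a single memtable threshold represents."""
    return memtable_mb / heap_mb


if __name__ == "__main__":
    print(f"raw data: {raw_data_gib(BLOCKS, BLOCK_BYTES):.1f} GiB")
    print(f"sampled index keys: {index_sample_mib(BLOCKS, INDEX_INTERVAL, BYTES_PER_SAMPLE):.1f} MiB")
    print(f"memtable share of heap now: {memtable_share(MEMTABLE_MB, HEAP_MB):.0%}")
    print(f"memtable share if halved: {memtable_share(MEMTABLE_MB / 2, HEAP_MB):.0%}")
```

Under these assumptions the sampled index keys are small next to a 3G heap, while a 1.5GB memtable threshold alone accounts for half of it. That fits the suggestion below to halve MemtableSizeInMB, but it does not measure what actually filled the heap.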

Jonathan Ellis <[email protected]> wrote:

You're OOMing after log replay finishes there. So I can still maintain that beta2 fixed the "replay uses more memory" bug :)

It looks like you're running out of memory when the other node restarts, and it needs to read the hinted rows into memory to send them over. I suggest halving your MemtableSizeInMB, 1.5GB is pretty large.

On Wed, Dec 16, 2009 at 7:01 PM, Brian Burruss <[email protected]> wrote:
> attached ... the log starts when i restarted server. notice that not too far
> into it is when the other node went down because of OOM and i restarted it as
> well.
>
> ________________________________________
> From: Jonathan Ellis [[email protected]]
> Sent: Wednesday, December 16, 2009 4:53 PM
> To: [email protected]
> Subject: Re: OOM Exception
>
> sorry, i meant the system.log the 2nd time (clear it out before
> replaying so it's not confused w/ other info, pls)
>
> On Wed, Dec 16, 2009 at 5:39 PM, Brian Burruss <[email protected]> wrote:
>> is this what you want? they are big - i'd rather not spam everyone with
>> them. if you need them or the hprof files i can tar them and send them to
>> you.
>>
>> thx!
>>
>>
>> [bburr...@gen-app02 cassandra]$ ls -l ~/cassandra/btoddb/commitlog/
>> total 597228
>> -rw-rw-r-- 1 bburruss bburruss 134219796 Dec 16 13:52 CommitLog-1260995895123.log
>> -rw-rw-r-- 1 bburruss bburruss 134218547 Dec 16 13:52 CommitLog-1260997811317.log
>> -rw-rw-r-- 1 bburruss bburruss 134218331 Dec 16 13:52 CommitLog-1260998497744.log
>> -rw-rw-r-- 1 bburruss bburruss 134219677 Dec 16 13:53 CommitLog-1261000330587.log
>> -rw-rw-r-- 1 bburruss bburruss 74055680 Dec 16 14:49 CommitLog-1261000439079.log
>> [bburr...@gen-app02 cassandra]$
>>
>> ________________________________________
>> From: Jonathan Ellis [[email protected]]
>> Sent: Wednesday, December 16, 2009 3:29 PM
>> To: [email protected]
>> Subject: Re: OOM Exception
>>
>> How large are the log files being replayed?
>>
>> Can you attach the log from a replay attempt?
>>
>> On Wed, Dec 16, 2009 at 5:21 PM, Brian Burruss <[email protected]> wrote:
>>> sorry, thought i included everything ;)
>>>
>>> however, i am using beta2
>>>
>>> ________________________________________
>>> From: Jonathan Ellis [[email protected]]
>>> Sent: Wednesday, December 16, 2009 3:18 PM
>>> To: [email protected]
>>> Subject: Re: OOM Exception
>>>
>>> What version are you using? 0.5 beta2 fixes the
>>> using-more-memory-on-startup problem.
>>>
>>> On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss <[email protected]> wrote:
>>>> i'll put my question first:
>>>>
>>>> - how can i determine how much RAM is required by cassandra? (for normal
>>>> operation and restarting server)
>>>>
>>>> *** i've attached my storage-conf.xml
>>>>
>>>> i've gotten several more OOM exceptions since i mentioned it a week or so
>>>> ago. i started from a fresh database a couple days ago and have been
>>>> adding 2k blocks of data keyed off a random integer at the rate of about
>>>> 400/sec. i have a 2 node cluster, RF=2, Consistency for read/write is
>>>> ONE. there are ~70,420,082 2k blocks of data in the database.
>>>>
>>>> i used the default memory setup of Xmx1G when i started a couple days ago.
>>>> as the database grew to ~180G (reported by unix du command) both servers
>>>> OOM'ed at about the same time, within 10 minutes of each other. well
>>>> needless to say, my cluster is dead. so i upped the memory to 3G and the
>>>> servers tried to come back up, but one died again with OOM.
>>>>
>>>> Before cleaning the disk and starting over a couple days ago, i played the
>>>> game of "jack up the RAM", but eventually i didn't want to up it anymore
>>>> when i got to 5G. the parameter, SSTable.INDEX_INTERVAL, was discussed a
>>>> few days ago that would change the number of "keys" cached in memory, so i
>>>> could modify that at the cost of read performance, but doing the math, 3G
>>>> should be plenty of room.
>>>>
>>>> it seems like startup requires more RAM than just normal running.
>>>>
>>>> so this of course concerns me.
>>>>
>>>> i have the hprof files from when the server initially crashed and when it
>>>> crashed trying to restart if anyone wants them
>>>>
>>>
>>
>
