RE: Unable to start one Cassandra node: OutOfMemoryError
Glad to help :P

From: Mikhail Strebkov [mailto:streb...@gmail.com]
Sent: 10 December 2015 22:35
To: user@cassandra.apache.org
Subject: Re: Unable to start one Cassandra node: OutOfMemoryError

Steve, thanks a ton! Removing compactions_in_progress helped! Now the node is running again.

p.s. Sorry for referring to you by the last name in my last email, I got confused.

This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain information that is confidential. If you have received this message in error, please do not read, copy or forward this message. Please notify the sender immediately, delete it from your system and destroy any copies. You may not further disclose or distribute this email or its attachments.
Re: Unable to start one Cassandra node: OutOfMemoryError
Can a core Cassandra committer verify whether removing the compactions_in_progress folder is indeed the desired and recommended solution to this problem, or whether it might in fact be a bug that this workaround is needed at all?

Thanks!

-- Jack Krupansky

On Thu, Dec 10, 2015 at 5:34 PM, Mikhail Strebkov <streb...@gmail.com> wrote:

> Steve, thanks a ton! Removing compactions_in_progress helped! Now the node
> is running again.
>
> p.s. Sorry for referring to you by the last name in my last email, I got
> confused.
Re: Unable to start one Cassandra node: OutOfMemoryError
On Tue, Dec 15, 2015 at 4:41 PM, Jack Krupansky wrote:

> Can a core Cassandra committer verify whether removing the
> compactions_in_progress folder is indeed the desired and recommended
> solution to this problem, or whether it might in fact be a bug that this
> workaround is needed at all?
>
> Thanks!

I can't speak to the question directly, but compactions_in_progress is removed upstream:

https://issues.apache.org/jira/browse/CASSANDRA-7066

=Rob
Re: Unable to start one Cassandra node: OutOfMemoryError
Jeff, CMS GC didn't help. Thinking about it, I don't see how it can help if there are 8 GB of objects strongly reachable from the GC roots.

Walsh, thanks for your suggestion. I checked the log and there are some compactions_in_progress, but the total size of those is ~300 MiB as far as I understand.

Here is the log of the last unsuccessful start:
https://gist.github.com/kluyg/7b9955d34def947f5e0a

On Thu, Dec 10, 2015 at 2:09 AM, Walsh, Stephen <stephen.wa...@aspect.com> wrote:

> 8GB is the max recommended for heap size, and that’s if you have 32GB or
> more available.
>
> We use 6GB on our 16GB machines and it’s very stable.
>
> The out of memory could be coming from Cassandra reloading
> compactions_in_progress into memory; you can check this from the log files
> if need be.
>
> You can safely delete this folder inside the data directory.
>
> This can happen if you didn’t stop Cassandra with a drain command and wait
> for the compactions to finish.
>
> The last time we hit it was due to testing HA, when we force-killed an
> entire cluster.
>
> Steve
Re: Unable to start one Cassandra node: OutOfMemoryError
Steve, thanks a ton! Removing compactions_in_progress helped! Now the node is running again.

p.s. Sorry for referring to you by the last name in my last email, I got confused.

On Thu, Dec 10, 2015 at 2:09 AM, Walsh, Stephen <stephen.wa...@aspect.com> wrote:

> 8GB is the max recommended for heap size, and that’s if you have 32GB or
> more available.
>
> We use 6GB on our 16GB machines and it’s very stable.
>
> The out of memory could be coming from Cassandra reloading
> compactions_in_progress into memory; you can check this from the log files
> if need be.
>
> You can safely delete this folder inside the data directory.
>
> This can happen if you didn’t stop Cassandra with a drain command and wait
> for the compactions to finish.
>
> The last time we hit it was due to testing HA, when we force-killed an
> entire cluster.
>
> Steve
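For anyone wanting to inspect the folder before touching it, here is a minimal sketch of the workaround discussed above. It is not from the thread: the layout `<data_dir>/system/compactions_in_progress-<id>` is an assumption about a typical 2.1 install (the thread only says "this folder inside the data directory"), the function names are hypothetical, and moving the folder aside instead of deleting it is a cautious substitution of mine. Run anything like this only while the node is stopped.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def find_compaction_dirs(data_dir: Path) -> List[Path]:
    """List directories under <data_dir>/system named compactions_in_progress*.

    The directory layout is an assumption, not something the thread states.
    """
    system_dir = data_dir / "system"
    if not system_dir.is_dir():
        return []
    return sorted(
        p for p in system_dir.iterdir()
        if p.is_dir() and p.name.startswith("compactions_in_progress")
    )


def move_aside(dirs: List[Path], backup_root: Path, dry_run: bool = True) -> List[Path]:
    """Move each directory into backup_root; with dry_run=True only report targets."""
    targets = []
    for d in dirs:
        target = backup_root / d.name
        if not dry_run:
            backup_root.mkdir(parents=True, exist_ok=True)
            shutil.move(str(d), str(target))
        targets.append(target)
    return targets


if __name__ == "__main__":
    # Demo on a throwaway tree, so nothing real is touched.
    demo_root = Path(tempfile.mkdtemp())
    (demo_root / "system" / "compactions_in_progress-demo").mkdir(parents=True)
    found = find_compaction_dirs(demo_root)
    for target in move_aside(found, demo_root / "backup", dry_run=True):
        print("would move to", target)
```

The dry run is the default so that the first invocation only prints what would be moved.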
Re: Unable to start one Cassandra node: OutOfMemoryError
Dealt with that recently, and the only solution that made it work was to increase heap sizes.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Thu, Dec 10, 2015 at 10:14 PM, Mikhail Strebkov <streb...@gmail.com> wrote:

> Jeff, CMS GC didn't help. Thinking about it, I don't see how it can help
> if there are 8 GB of objects strongly reachable from the GC roots.
>
> Walsh, thanks for your suggestion. I checked the log and there are some
> compactions_in_progress, but the total size of those is ~300 MiB as far as
> I understand.
>
> Here is the log of the last unsuccessful start:
> https://gist.github.com/kluyg/7b9955d34def947f5e0a
RE: Unable to start one Cassandra node: OutOfMemoryError
8GB is the max recommended for heap size, and that’s if you have 32GB or more available.

We use 6GB on our 16GB machines and it’s very stable.

The out of memory could be coming from Cassandra reloading compactions_in_progress into memory; you can check this from the log files if need be.

You can safely delete this folder inside the data directory.

This can happen if you didn’t stop Cassandra with a drain command and wait for the compactions to finish.

The last time we hit it was due to testing HA, when we force-killed an entire cluster.

Steve

From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: 10 December 2015 02:49
To: user@cassandra.apache.org
Subject: Re: Unable to start one Cassandra node: OutOfMemoryError

8G is probably too small for a G1 heap. Raise your heap or try CMS instead.

71% of your heap is collections – may be a weird data model quirk, but try CMS first and see if that behaves better.
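The "8GB max, and only with 32GB or more" rule of thumb matches the default heap heuristic as I recall it from the stock cassandra-env.sh of the 2.x line: max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8GB)). The sketch below just restates that formula; treat it as an assumption to check against your own cassandra-env.sh, and note that the function name is made up for illustration.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Heuristic: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)).

    Recalled from the stock cassandra-env.sh (assumption; verify locally).
    """
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


if __name__ == "__main__":
    for ram_gb in (8, 16, 32, 64):
        heap_mb = default_max_heap_mb(ram_gb * 1024)
        print(f"{ram_gb} GB RAM -> {heap_mb} MB default max heap")
```

Under this heuristic a 16GB machine would default to a 4GB heap, so the 6GB mentioned above is a manual override rather than the computed default.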
Unable to start one Cassandra node: OutOfMemoryError
Hi everyone,

While upgrading our 5-machine cluster from DSE version 4.7.1 (Cassandra 2.1.8) to DSE version 4.8.2 (Cassandra 2.1.11), one of the nodes fails to start with an OutOfMemoryError.

We're using HotSpot 64-Bit Server VM/1.8.0_45 and the G1 garbage collector with an 8 GiB heap.

Average node size is 300 GiB.

I looked at the heap dump with the YourKit profiler (www.yourkit.com). It was quite hard since the dump is so big, and I can't get much out of it: http://i.imgur.com/fIRImma.png

As far as I understand the report, there are 1,332,812 instances of org.apache.cassandra.db.Row which retain 8 GiB. I don't understand why all of them are still strongly reachable.

Please help me to debug this. I don't even know where to start.

I feel very uncomfortable with 1 node running 4.8.2, 1 node down and 3 nodes running 4.7.1 at the same time.

Thanks,
Mikhail
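A quick back-of-the-envelope check on the profiler numbers, assuming the reported "8 GiB retained" is exact (it is only what the report shows, so the result is approximate):

```python
GIB = 1024 ** 3

row_instances = 1_332_812      # Row instances reported by the profiler
retained_bytes = 8 * GIB       # retained size as reported, taken at face value

avg_bytes_per_row = retained_bytes / row_instances
print(f"{avg_bytes_per_row:.0f} bytes (~{avg_bytes_per_row / 1024:.1f} KiB) per Row on average")
```

That works out to roughly 6.3 KiB per Row, so no single Row is huge; it is the number of live Row objects that fills the whole 8 GiB heap.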
Re: Unable to start one Cassandra node: OutOfMemoryError
8G is probably too small for a G1 heap. Raise your heap or try CMS instead.

71% of your heap is collections – may be a weird data model quirk, but try CMS first and see if that behaves better.

From: Mikhail Strebkov
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, December 9, 2015 at 5:26 PM
To: "user@cassandra.apache.org"
Subject: Unable to start one Cassandra node: OutOfMemoryError

Hi everyone,

While upgrading our 5-machine cluster from DSE version 4.7.1 (Cassandra 2.1.8) to DSE version 4.8.2 (Cassandra 2.1.11), one of the nodes fails to start with an OutOfMemoryError.

We're using HotSpot 64-Bit Server VM/1.8.0_45 and the G1 garbage collector with an 8 GiB heap.

Average node size is 300 GiB.

I looked at the heap dump with the YourKit profiler (www.yourkit.com). It was quite hard since the dump is so big, and I can't get much out of it: http://i.imgur.com/fIRImma.png

As far as I understand the report, there are 1,332,812 instances of org.apache.cassandra.db.Row which retain 8 GiB. I don't understand why all of them are still strongly reachable.

Please help me to debug this. I don't even know where to start.

I feel very uncomfortable with 1 node running 4.8.2, 1 node down and 3 nodes running 4.7.1 at the same time.

Thanks,
Mikhail