We have swap disabled. Can death by paging still happen?
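(Even with swap disabled, something similar can still happen: mmapped SSTable pages can be evicted under memory pressure and must be major-faulted back in from disk, stalling whichever thread touches them. A rough way to check on Linux - the pgrep pattern below is an assumption, adjust it to however your process is launched, and it assumes a single Cassandra process:

    swapon -s        # prints nothing if swap is truly off
    grep VmLck /proc/$(pgrep -f CassandraDaemon)/status   # memory locked via JNA mlockall, if any
    sar -B 1         # watch majflt/s while a pause is in progress

A sustained burst of major faults coinciding with a pause is a strong sign of exactly this kind of death by paging.)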
On Thu, Feb 27, 2014 at 11:32 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

That sounds a lot like death by paging.

On 27 February 2014 16:29, Frank Ng <fnt...@gmail.com> wrote:

I just caught that a node was down based on running nodetool status on a different node. I tried to ssh into the downed node at that time and it was very slow logging on. Looking at the gc.log file, there was a ParNew that only took 0.09 secs. Yet the overall application threads stop time is 315 seconds (over 5 minutes). Our cluster is handling a lot of read requests.

If there were network hiccups, would that cause a delay in the Cassandra process when it tries to get to a safepoint? I assume Cassandra has threads doing lots of network activity that may take a long time to reach a safepoint.

thanks,
Frank

On Fri, Feb 21, 2014 at 4:24 AM, Joel Samuelsson <samuelsson.j...@gmail.com> wrote:

What happens if a ParNew is triggered while CMS is running? Will it wait for the CMS to finish? If so, that would be the explanation of our long ParNew above.

Regards,
Joel

2014-02-20 16:29 GMT+01:00 Joel Samuelsson <samuelsson.j...@gmail.com>:

Hi Frank,

We got a (quite) long GC pause today on 2.0.5:

INFO [ScheduledTasks:1] 2014-02-20 13:51:14,528 GCInspector.java (line 116) GC for ParNew: 1627 ms for 1 collections, 425562984 used; max is 4253024256
INFO [ScheduledTasks:1] 2014-02-20 13:51:14,542 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 3703 ms for 2 collections, 434394920 used; max is 4253024256

Unfortunately it's a production cluster, so I have no additional GC logging enabled. This may be an indication that upgrading is not the (complete) solution.

Regards,
Joel

2014-02-17 13:41 GMT+01:00 Benedict Elliott Smith <belliottsm...@datastax.com>:

Hi Ondrej,

It's possible you were hit by the problems in this thread before, but it looks like you may have other issues as well. It may be that on G1 you have one issue and on CMS another, but 27s is extreme even for G1, so that seems unlikely. If you're hitting these pause times in CMS and you get some more output from the safepoint tracing, please do contribute it, as I would love to get to the bottom of that. However, is it possible you're experiencing paging activity? Have you made certain the VM memory is locked, and preferably that paging is entirely disabled? The bloom filters and other memory won't be locked, although that shouldn't cause pauses during GC.

Note that mmapped file accesses and other native work shouldn't in any way inhibit GC activity or other safepoint pause times, unless there's a bug in the VM. These threads will simply enter a safepoint as they return to the VM execution context, and are considered safe for the duration they are outside.
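(For anyone wanting to capture that safepoint tracing: a minimal sketch for cassandra-env.sh, assuming HotSpot 7 flag names - verify against your JVM version:

    # log every stop-the-world window, not just GC pauses
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
    # print per-safepoint statistics, for every safepoint
    JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
    JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"

The safepoint statistics break each pause into spin/block/sync/cleanup/vmop phases, which shows whether the time went to waiting for threads to reach the safepoint rather than to the GC work itself.)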
On 17 February 2014 12:30, Ondřej Černoš <cern...@gmail.com> wrote:

Hi,

we tried to switch to G1 because we observed this behaviour on CMS too (and a 27-second pause in G1 is a strong argument against using it). Pauses with CMS were not easily traceable - the JVM stopped even without a stop-the-world pause scheduled (defragmentation, remarking). We thought the go-to-safepoint waiting time might have been involved (we saw threads waiting for safepoint resolution) - especially because access to mmapped files is not preemptible, afaik - but that doesn't explain tens of seconds of waiting; even slow IO should read our sstables into memory in much less time. We switched to G1 out of desperation - and to try different code paths - not because we thought it was a great idea. So I think we were hit by the problem discussed in this thread; the G1 report just wasn't very clear, sorry.

regards,
ondrej

On Mon, Feb 17, 2014 at 11:45 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

Ondrej,

It seems like your issue is much less difficult to diagnose: your collection times are long. At least, the pause you printed the time for is all attributable to the G1 pause.

Note that G1 has not generally performed well with Cassandra in our testing. There are a number of changes going in soon that may change that, but for the time being it is advisable to stick with CMS. With tuning you can no doubt bring your pauses down considerably.
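(As a baseline for that tuning: the stock CMS settings shipped in cassandra-env.sh in the 1.2 line are a reasonable starting point to iterate from, with heap and new-gen sizing set above them - roughly:

    JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
    JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
    # start CMS early and deterministically, so it finishes before the heap fills
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

If concurrent mode failures show up in the GC log, lowering CMSInitiatingOccupancyFraction is the usual first move.)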
On 17 February 2014 10:17, Ondřej Černoš <cern...@gmail.com> wrote:

Hi all,

we are seeing the same kind of long pauses in Cassandra. We tried to switch from CMS to G1 without positive results. The stress test is read heavy: 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see latency spikes at the 99.99th percentile and above, caused by threads being stopped in the JVM.

The GC in G1 looks like this:

{Heap before GC invocations=4073 (full 1):
 garbage-first heap   total 8388608K, used 3602914K [0x00000005f5c00000, 0x00000007f5c00000, 0x00000007f5c00000)
  region size 4096K, 142 young (581632K), 11 survivors (45056K)
 compacting perm gen  total 28672K, used 27428K [0x00000007f5c00000, 0x00000007f7800000, 0x0000000800000000)
   the space 28672K, 95% used [0x00000007f5c00000, 0x00000007f76c9108, 0x00000007f76c9200, 0x00000007f7800000)
No shared spaces configured.
2014-02-17T04:44:16.385+0100: 222346.218: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 37748736 bytes, new threshold 15 (max 15)
- age   1:   17213632 bytes,   17213632 total
- age   2:   19391208 bytes,   36604840 total
, 0.1664300 secs]
   [Parallel Time: 163.9 ms, GC Workers: 2]
      [GC Worker Start (ms): Min: 222346218.3, Avg: 222346218.3, Max: 222346218.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.9, Max: 7.7, Diff: 1.7, Sum: 13.7]
      [Update RS (ms): Min: 20.4, Avg: 21.3, Max: 22.1, Diff: 1.7, Sum: 42.6]
         [Processed Buffers: Min: 49, Avg: 60.0, Max: 71, Diff: 22, Sum: 120]
      [Scan RS (ms): Min: 23.2, Avg: 23.2, Max: 23.3, Diff: 0.1, Sum: 46.5]
      [Object Copy (ms): Min: 112.3, Avg: 112.3, Max: 112.4, Diff: 0.1, Sum: 224.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8, Diff: 0.0, Sum: 327.6]
      [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max: 222346382.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.4 ms]
   [Eden: 524.0M(524.0M)->0.0B(476.0M) Survivors: 44.0M->68.0M Heap: 3518.5M(8192.0M)->3018.5M(8192.0M)]
Heap after GC invocations=4074 (full 1):
 garbage-first heap   total 8388608K, used 3090914K [0x00000005f5c00000, 0x00000007f5c00000, 0x00000007f5c00000)
  region size 4096K, 17 young (69632K), 17 survivors (69632K)
 compacting perm gen  total 28672K, used 27428K [0x00000007f5c00000, 0x00000007f7800000, 0x0000000800000000)
   the space 28672K, 95% used [0x00000007f5c00000, 0x00000007f76c9108, 0x00000007f76c9200, 0x00000007f7800000)
No shared spaces configured.
}
 [Times: user=0.35 sys=0.00, real=27.58 secs]
222346.219: G1IncCollectionPause [ 111 0 0 ] [ 0 0 0 0 27586 ] 0

And the total time for which application threads were stopped is 27.58 seconds (note user=0.35 against real=27.58, i.e. the GC threads spent almost none of that wall-clock time on CPU).

CMS behaves in a similar manner. We thought it would be GC waiting for mmapped files being read from disk (the thread cannot reach a safepoint during this operation), but that doesn't explain the huge time.

We'll try jhiccup to see if it provides any additional information. The test was done on a mixed AWS/OpenStack environment, OpenJDK 1.7.0_45, Cassandra 1.2.11. Upgrading to 2.0.x is not an option for us.

regards,

ondrej cernos
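(jHiccup is usually attached as a Java agent; a minimal sketch for cassandra-env.sh, with the jar path as a placeholder and option syntax to be checked against the jHiccup docs for your version:

    # record process-wide stalls independently of the GC log;
    # sampling interval and log location can be passed as agent arguments
    JVM_OPTS="$JVM_OPTS -javaagent:/path/to/jHiccup.jar"

If jHiccup records the same multi-second stalls, the whole JVM is stopping - GC, safepoint, or paging - rather than individual requests merely running slow.)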
On Fri, Feb 14, 2014 at 8:53 PM, Frank Ng <fnt...@gmail.com> wrote:

Sorry, I have not had a chance to file a JIRA ticket. We have not been able to resolve the issue. But since Joel mentioned that upgrading to Cassandra 2.0.x solved it for them, we may need to upgrade. We are currently on Java 1.7 and Cassandra 1.2.8.

On Thu, Feb 13, 2014 at 12:40 PM, Keith Wright <kwri...@nanigans.com> wrote:

You're running 2.0.* in production? May I ask what C* version and OS? Any hardware details would be appreciated as well. Thx!

From: Joel Samuelsson <samuelsson.j...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, February 13, 2014 at 11:39 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Intermittent long application pauses on nodes

We have had similar issues, and upgrading C* to 2.0.x and Java to 1.7 seems to have helped.

2014-02-13 Keith Wright <kwri...@nanigans.com>:

Frank, did you ever file a ticket for this issue or find the root cause? I believe we are seeing the same issues when attempting to bootstrap.

Thanks

From: Robert Coli <rc...@eventbrite.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 3, 2014 at 6:10 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Intermittent long application pauses on nodes

On Mon, Feb 3, 2014 at 8:52 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

It's possible that this is a JVM issue, but if so there may be some remedial action we can take anyway. There are some more flags we should add, but we can discuss that once you open a ticket. If you could include the strange JMX error as well, that might be helpful.

It would be appreciated if you could inform this thread of the JIRA ticket number, for the benefit of the community and google searchers. :)

=Rob