Well, paging still happens due to mmapped file I/O; however, whilst this
could easily cause a slow login, it would struggle to cause a 315s GC pause.
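
If you want to rule paging in or out quickly, a rough check is to watch major
page faults on the Cassandra process while a pause is happening. Something
like the following should do it (the PID is a placeholder, and exact ps/sar
options can vary a little by distribution):

    # cumulative minor/major fault counts for the Cassandra process
    ps -o pid,min_flt,maj_flt -p <cassandra-pid>

    # system-wide paging rates, sampled every 5 seconds (needs the sysstat package)
    sar -B 5

A majflt count that jumps during the pause would point at paging; a flat one
makes it unlikely.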

A slow network should never cause this either, though: the network threads
are simply caught by any safepoint on returning to the VM, so they don't
delay GC.
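
To see where the 315s actually goes, the safepoint tracing mentioned earlier
in this thread is just a couple of standard HotSpot flags (JDK 7) appended to
JVM_OPTS in cassandra-env.sh - something like:

    # print every stop-the-world pause, not just GC pauses
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
    # per-safepoint breakdown, printed after every safepoint
    JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
    JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"

The resulting lines (like the G1IncCollectionPause line Ondřej pasted below)
break the stop into bracketed time columns - spin/block/sync/cleanup/vmop, if
I remember the order right - so a long time-to-safepoint shows up separately
from the collection itself.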


On 27 February 2014 17:01, Frank Ng <fnt...@gmail.com> wrote:

> We have swap disabled.  Can death by paging still happen?
>
>
> On Thu, Feb 27, 2014 at 11:32 AM, Benedict Elliott Smith <
> belliottsm...@datastax.com> wrote:
>
>> That sounds a lot like death by paging.
>>
>>
>> On 27 February 2014 16:29, Frank Ng <fnt...@gmail.com> wrote:
>>
>>> I just caught that a node was down based on running nodetool status on a
>>> different node.  I tried to ssh into the downed node at that time and it
>>> was very slow to log in.  Looking at the gc.log file, there was a ParNew
>>> that only took 0.09 secs, yet the overall application threads stop time
>>> was 315 seconds (over 5 minutes).  Our cluster is handling a lot of read
>>> requests.
>>>
>>> If there were network hiccups, would that cause a delay in the Cassandra
>>> process when it tries to get to a safepoint?  I assume Cassandra has
>>> threads running with lots of network activity and maybe taking a long time
>>> to reach a safepoint.
>>>
>>> thanks,
>>> Frank
>>>
>>>
>>> On Fri, Feb 21, 2014 at 4:24 AM, Joel Samuelsson <
>>> samuelsson.j...@gmail.com> wrote:
>>>
>>>> What happens if a ParNew is triggered while CMS is running? Will it
>>>> wait for the CMS to finish? If so, that would be the explanation for our
>>>> long ParNew above.
>>>>
>>>> Regards,
>>>> Joel
>>>>
>>>>
>>>> 2014-02-20 16:29 GMT+01:00 Joel Samuelsson <samuelsson.j...@gmail.com>:
>>>>
>>>> Hi Frank,
>>>>>
>>>>> We got a (quite) long GC pause today on 2.0.5:
>>>>>  INFO [ScheduledTasks:1] 2014-02-20 13:51:14,528 GCInspector.java (line 116)
>>>>> GC for ParNew: 1627 ms for 1 collections, 425562984 used; max is 4253024256
>>>>>  INFO [ScheduledTasks:1] 2014-02-20 13:51:14,542 GCInspector.java (line 116)
>>>>> GC for ConcurrentMarkSweep: 3703 ms for 2 collections, 434394920 used; max is 4253024256
>>>>>
>>>>> Unfortunately it's a production cluster so I have no additional
>>>>> GC-logging enabled. This may be an indication that upgrading is not the
>>>>> (complete) solution.
>>>>>
>>>>> Regards,
>>>>> Joel
>>>>>
>>>>>
>>>>> 2014-02-17 13:41 GMT+01:00 Benedict Elliott Smith <
>>>>> belliottsm...@datastax.com>:
>>>>>
>>>>> Hi Ondrej,
>>>>>>
>>>>>> It's possible you were hit by the problems in this thread before, but it
>>>>>> looks as though you may have other issues as well. Of course it may be
>>>>>> that on G1 you have one issue and on CMS another, but 27s is extreme even
>>>>>> for G1, so that seems unlikely. If you're hitting these pause times on CMS
>>>>>> and you get more output from the safepoint tracing, please do contribute
>>>>>> it, as I would love to get to the bottom of that. However, is it possible
>>>>>> you're experiencing paging activity? Have you made certain the VM memory
>>>>>> is locked (and preferably that paging is entirely disabled, as the bloom
>>>>>> filters and other memory won't be locked, although that shouldn't cause
>>>>>> pauses during GC)?
>>>>>>
>>>>>> Note that mmapped file accesses and other native work shouldn't in any
>>>>>> way inhibit GC activity or other safepoint pause times, unless there's a
>>>>>> bug in the VM. These threads simply enter a safepoint as they return to
>>>>>> the VM execution context, and are considered safe for the duration they
>>>>>> are outside it.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 17 February 2014 12:30, Ondřej Černoš <cern...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> we tried to switch to G1 because we observed this behaviour on CMS too
>>>>>>> (a 27-second pause in G1 is quite a strong hint not to use it). Pauses
>>>>>>> with CMS were not easily traceable - the JVM stopped even without a
>>>>>>> stop-the-world pause scheduled (defragmentation, remarking). We thought
>>>>>>> the time-to-safepoint wait might have been involved (we saw waiting for
>>>>>>> safepoint resolution) - especially because access to mmapped files is not
>>>>>>> preemptible, afaik - but that doesn't explain tens of seconds of waiting;
>>>>>>> even slow IO should read our sstables into memory in much less time. We
>>>>>>> switched to G1 out of desperation - and to try different code paths - not
>>>>>>> that we thought it was a great idea. So I think we were hit by the problem
>>>>>>> discussed in this thread, just the G1 report wasn't very clear, sorry.
>>>>>>>
>>>>>>> regards,
>>>>>>> ondrej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 17, 2014 at 11:45 AM, Benedict Elliott Smith <
>>>>>>> belliottsm...@datastax.com> wrote:
>>>>>>>
>>>>>>>> Ondrej,
>>>>>>>>
>>>>>>>> It seems like your issue is much less difficult to diagnose: your
>>>>>>>> collection times are long. At least, the pause you printed the time 
>>>>>>>> for is
>>>>>>>> all attributable to the G1 pause.
>>>>>>>>
>>>>>>>> Note that G1 has not generally performed well with Cassandra in our
>>>>>>>> testing. There are a number of changes going in soon that may change 
>>>>>>>> that,
>>>>>>>> but for the time being it is advisable to stick with CMS. With tuning 
>>>>>>>> you
>>>>>>>> can no doubt bring your pauses down considerably.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 17 February 2014 10:17, Ondřej Černoš <cern...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> we are seeing the same kind of long pauses in Cassandra. We tried to
>>>>>>>>> switch from CMS to G1 without positive results. The stress test is read
>>>>>>>>> heavy: 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see
>>>>>>>>> spikes in latency at the 99.99th percentile and higher, caused by
>>>>>>>>> threads being stopped in the JVM.
>>>>>>>>>
>>>>>>>>> The GC in G1 looks like this:
>>>>>>>>>
>>>>>>>>> {Heap before GC invocations=4073 (full 1):
>>>>>>>>> garbage-first heap   total 8388608K, used 3602914K
>>>>>>>>> [0x00000005f5c00000, 0x00000007f5c00000, 0x00000007f5c00000)
>>>>>>>>>  region size 4096K, 142 young (581632K), 11 survivors (45056K)
>>>>>>>>> compacting perm gen  total 28672K, used 27428K
>>>>>>>>> [0x00000007f5c00000, 0x00000007f7800000, 0x0000000800000000)
>>>>>>>>>   the space 28672K,  95% used [0x00000007f5c00000,
>>>>>>>>> 0x00000007f76c9108, 0x00000007f76c9200, 0x00000007f7800000)
>>>>>>>>> No shared spaces configured.
>>>>>>>>> 2014-02-17T04:44:16.385+0100: 222346.218: [GC pause (G1
>>>>>>>>> Evacuation Pause) (young)
>>>>>>>>> Desired survivor size 37748736 bytes, new threshold 15 (max 15)
>>>>>>>>> - age   1:   17213632 bytes,   17213632 total
>>>>>>>>> - age   2:   19391208 bytes,   36604840 total
>>>>>>>>> , 0.1664300 secs]
>>>>>>>>>   [Parallel Time: 163.9 ms, GC Workers: 2]
>>>>>>>>>      [GC Worker Start (ms): Min: 222346218.3, Avg: 222346218.3,
>>>>>>>>> Max: 222346218.3, Diff: 0.0]
>>>>>>>>>      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.9, Max: 7.7, Diff:
>>>>>>>>> 1.7, Sum: 13.7]
>>>>>>>>>      [Update RS (ms): Min: 20.4, Avg: 21.3, Max: 22.1, Diff: 1.7,
>>>>>>>>> Sum: 42.6]
>>>>>>>>>         [Processed Buffers: Min: 49, Avg: 60.0, Max: 71, Diff: 22,
>>>>>>>>> Sum: 120]
>>>>>>>>>      [Scan RS (ms): Min: 23.2, Avg: 23.2, Max: 23.3, Diff: 0.1,
>>>>>>>>> Sum: 46.5]
>>>>>>>>>      [Object Copy (ms): Min: 112.3, Avg: 112.3, Max: 112.4, Diff:
>>>>>>>>> 0.1, Sum: 224.6]
>>>>>>>>>      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0,
>>>>>>>>> Sum: 0.1]
>>>>>>>>>      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
>>>>>>>>> 0.0, Sum: 0.1]
>>>>>>>>>      [GC Worker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8,
>>>>>>>>> Diff: 0.0, Sum: 327.6]
>>>>>>>>>      [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max:
>>>>>>>>> 222346382.1, Diff: 0.0]
>>>>>>>>>   [Code Root Fixup: 0.0 ms]
>>>>>>>>>   [Clear CT: 0.4 ms]
>>>>>>>>>   [Other: 2.1 ms]
>>>>>>>>>      [Choose CSet: 0.0 ms]
>>>>>>>>>      [Ref Proc: 1.1 ms]
>>>>>>>>>      [Ref Enq: 0.0 ms]
>>>>>>>>>      [Free CSet: 0.4 ms]
>>>>>>>>>   [Eden: 524.0M(524.0M)->0.0B(476.0M) Survivors: 44.0M->68.0M
>>>>>>>>> Heap: 3518.5M(8192.0M)->3018.5M(8192.0M)]
>>>>>>>>> Heap after GC invocations=4074 (full 1):
>>>>>>>>> garbage-first heap   total 8388608K, used 3090914K 
>>>>>>>>> [0x00000005f5c00000,
>>>>>>>>> 0x00000007f5c00000, 0x00000007f5c00000)
>>>>>>>>>  region size 4096K, 17 young (69632K), 17 survivors (69632K)
>>>>>>>>> compacting perm gen  total 28672K, used 27428K [0x00000007f5c00000,
>>>>>>>>> 0x00000007f7800000, 0x0000000800000000)
>>>>>>>>>   the space 28672K,  95% used [0x00000007f5c00000, 0x00000007f76c9108,
>>>>>>>>> 0x00000007f76c9200, 0x00000007f7800000)
>>>>>>>>> No shared spaces configured.
>>>>>>>>> }
>>>>>>>>> [Times: user=0.35 sys=0.00, real=27.58 secs]
>>>>>>>>> 222346.219: G1IncCollectionPause             [     111          0
>>>>>>>>>              0    ]      [     0     0     0     0 27586    ]  0
>>>>>>>>>
>>>>>>>>> And the total time for which application threads were stopped is
>>>>>>>>> 27.58 seconds.
>>>>>>>>>
>>>>>>>>> CMS behaves in a similar manner. We thought it would be the GC
>>>>>>>>> waiting for mmapped files to be read from disk (a thread cannot reach a
>>>>>>>>> safepoint during this operation), but that doesn't explain the huge time.
>>>>>>>>>
>>>>>>>>> We'll try jHiccup to see if it provides any additional information.
>>>>>>>>> The test was done on a mixed AWS/OpenStack environment, OpenJDK
>>>>>>>>> 1.7.0_45, Cassandra 1.2.11. Upgrading to 2.0.x is not an option for us.
>>>>>>>>>
>>>>>>>>> regards,
>>>>>>>>>
>>>>>>>>> ondrej cernos
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 14, 2014 at 8:53 PM, Frank Ng <fnt...@gmail.com>wrote:
>>>>>>>>>
>>>>>>>>>> Sorry, I have not had a chance to file a JIRA ticket.  We have not
>>>>>>>>>> been able to resolve the issue.  But since Joel mentioned that
>>>>>>>>>> upgrading to Cassandra 2.0.x solved it for them, we may need to
>>>>>>>>>> upgrade.  We are currently on Java 1.7 and Cassandra 1.2.8.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 13, 2014 at 12:40 PM, Keith Wright <
>>>>>>>>>> kwri...@nanigans.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> You’re running 2.0.* in production?  May I ask what C* version
>>>>>>>>>>> and OS?  Any hardware details would be appreciated as well.  Thx!
>>>>>>>>>>>
>>>>>>>>>>> From: Joel Samuelsson <samuelsson.j...@gmail.com>
>>>>>>>>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org
>>>>>>>>>>> >
>>>>>>>>>>> Date: Thursday, February 13, 2014 at 11:39 AM
>>>>>>>>>>>
>>>>>>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>>>>>>> Subject: Re: Intermittent long application pauses on nodes
>>>>>>>>>>>
>>>>>>>>>>> We have had similar issues and upgrading C* to 2.0.x and Java to
>>>>>>>>>>> 1.7 seems to have helped our issues.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2014-02-13 Keith Wright <kwri...@nanigans.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Frank did you ever file a ticket for this issue or find the
>>>>>>>>>>>> root cause?  I believe we are seeing the same issues when 
>>>>>>>>>>>> attempting to
>>>>>>>>>>>> bootstrap.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>> From: Robert Coli <rc...@eventbrite.com>
>>>>>>>>>>>> Reply-To: "user@cassandra.apache.org" <
>>>>>>>>>>>> user@cassandra.apache.org>
>>>>>>>>>>>> Date: Monday, February 3, 2014 at 6:10 PM
>>>>>>>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>>>>>>>> Subject: Re: Intermittent long application pauses on nodes
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 3, 2014 at 8:52 AM, Benedict Elliott Smith <
>>>>>>>>>>>> belliottsm...@datastax.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's possible that this is a JVM issue, but if so there may be
>>>>>>>>>>>>> some remedial action we can take anyway. There are some more 
>>>>>>>>>>>>> flags we
>>>>>>>>>>>>> should add, but we can discuss that once you open a ticket. If 
>>>>>>>>>>>>> you could
>>>>>>>>>>>>> include the strange JMX error as well, that might be helpful.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It would be appreciated if you could inform this thread of the
>>>>>>>>>>>> JIRA ticket number, for the benefit of the community and google 
>>>>>>>>>>>> searchers.
>>>>>>>>>>>> :)
>>>>>>>>>>>>
>>>>>>>>>>>> =Rob
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
