Hi,

Can you send the *whole* log?

Kind regards,
Kirk Pepperdine

> On Dec 22, 2017, at 7:28 AM, Abhishek Agarwal <[email protected]> wrote:
> 
> Thank you for the pointers. We have GC logs, but they didn't help much.
> 
> A few corrections:
> Old gen size is ~20 GB
> Young gen size is ~1.5 GB
> 
> The skew between the young gen and old gen sizes has been fixed. Here is a 
> snippet of the GC logs from when the pause times increased to ~200 seconds. 
> By the way, the service is running on AWS. 
> 
> 
> 59549.736: [GC (Allocation Failure) 59549.736: [ParNew: 
> 1688278K->56721K(1887488K), 0.5518315 secs] 4474842K->3433110K(21480704K), 
> 0.5526896 secs] [Times: user=1.92 sys=0.25, real=0.55 secs]
> 59581.163: [GC (Allocation Failure) 59581.164: [ParNew: 
> 1734545K->43688K(1887488K), 0.4640033 secs] 5110934K->3941007K(21480704K), 
> 0.4650333 secs] [Times: user=1.59 sys=0.26, real=0.46 secs]
> 59617.565: [GC (Allocation Failure) 59617.565: [ParNew: 
> 1721512K->40500K(1887488K), 0.1154788 secs] 5618831K->3937818K(21480704K), 
> 0.1166312 secs] [Times: user=0.44 sys=0.00, real=0.12 secs]
> 59645.910: [GC (Allocation Failure) 59645.911: [ParNew: 
> 1718324K->46789K(1887488K), 0.1918905 secs] 5615642K->3944132K(21480704K), 
> 0.1930935 secs] [Times: user=0.77 sys=0.00, real=0.20 secs]
> 59692.159: [GC (Allocation Failure) 59692.159: [ParNew: 
> 1724613K->45533K(1887488K), 5.4467875 secs] 5621956K->3944000K(21480704K), 
> 5.4480386 secs] [Times: user=21.86 sys=0.00, real=5.45 secs]
> 59716.348: [GC (Allocation Failure) 59716.348: [ParNew: 
> 1723357K->49256K(1887488K), 8.2982698 secs] 5621824K->3948162K(21480704K), 
> 8.2994285 secs] [Times: user=33.47 sys=0.00, real=8.30 secs]
> 59751.501: [GC (Allocation Failure) 59751.502: [ParNew: 
> 1727080K->47156K(1887488K), 9.4546047 secs] 5625986K->3946152K(21480704K), 
> 9.4558187 secs] [Times: user=37.88 sys=0.00, real=9.45 secs]
> 59779.293: [GC (Allocation Failure) 59779.293: [ParNew: 
> 1724980K->31847K(1887488K), 98.3531267 secs] 5623976K->3949810K(21480704K), 
> 98.3542271 secs] [Times: user=393.93 sys=0.00, real=98.35 secs]
> 59933.098: [GC (Allocation Failure) 59933.099: [ParNew: 
> 1709671K->24851K(1887488K), 155.8798281 secs] 5627634K->3947185K(21480704K), 
> 155.8810828 secs] [Times: user=629.60 sys=0.00, real=155.88 secs]
> 60138.148: [GC (Allocation Failure) 60138.148: [ParNew: 
> 1702675K->22690K(1887488K), 162.4118957 secs] 5625009K->3945214K(21480704K), 
> 162.4132521 secs] [Times: user=659.75 sys=0.00, real=162.42 secs]
> 60352.343: [GC (Allocation Failure) 60352.343: [ParNew: 
> 1700514K->26504K(1887488K), 166.2566004 secs] 5623038K->3949247K(21480704K), 
> 166.2577295 secs] [Times: user=672.08 sys=0.00, real=166.25 secs]
> 60576.273: [GC (Allocation Failure) 60576.274: [ParNew: 
> 1704328K->49624K(1887488K), 168.2311204 secs] 5627071K->3972465K(21480704K), 
> 168.2322492 secs] [Times: user=660.16 sys=0.00, real=168.23 secs]
> 60791.458: [GC (Allocation Failure) 60791.459: [ParNew: 
> 1727448K->33138K(1887488K), 174.0129541 secs] 5650289K->3956199K(21480704K), 
> 174.0139678 secs] [Times: user=680.44 sys=0.00, real=174.02 secs]
> 61022.223: [GC (Allocation Failure) 61022.223: [ParNew: 
> 1710962K->29340K(1887488K), 176.6459303 secs] 5634023K->3952483K(21480704K), 
> 176.6474561 secs] [Times: user=690.10 sys=0.00, real=176.64 secs]
> 61246.967: [GC (Allocation Failure) 61246.967: [ParNew: 
> 1707164K->31914K(1887488K), 182.4714040 secs] 5630307K->3955486K(21480704K), 
> 182.4724297 secs] [Times: user=721.68 sys=0.00, real=182.47 secs]
> 61483.620: [GC (Allocation Failure) 61483.621: [ParNew: 
> 1709738K->27879K(1887488K), 187.2219554 secs] 5633310K->3951691K(21480704K), 
> 187.2230896 secs] [Times: user=783.68 sys=0.00, real=187.23 secs]
> 61723.432: [GC (Allocation Failure) 61723.433: [ParNew: 
> 1705703K->22446K(1887488K), 192.4606406 secs] 5629515K->3946498K(21480704K), 
> 192.4618564 secs] [Times: user=721.75 sys=0.00, real=192.46 secs]
> 61965.560: [GC (Allocation Failure) 61965.561: [ParNew: 
> 1700270K->30881K(1887488K), 197.1221819 secs] 5624322K->3955113K(21480704K), 
> 197.1233286 secs] [Times: user=806.53 sys=0.00, real=197.13 secs]
> 62211.173: [GC (Allocation Failure) 62211.174: [ParNew: 
> 1708705K->22526K(1887488K), 199.7864929 secs] 5632937K->3946917K(21480704K), 
> 199.7875671 secs] [Times: user=811.19 sys=0.00, real=199.78 secs]
> 62464.936: [GC (Allocation Failure) 62464.937: [ParNew: 
> 1700350K->47417K(1887488K), 203.1958095 secs] 5624741K->3972025K(21480704K), 
> 203.1969591 secs] [Times: user=789.37 sys=0.00, real=203.20 secs]
> 62682.808: [GC (Allocation Failure) 62682.808: [ParNew: 
> 1724605K->37951K(1887488K), 206.8942846 secs] 5649212K->3962723K(21480704K), 
> 206.8948168 secs] [Times: user=821.45 sys=0.00, real=206.90 secs]
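> 
> (A quick sanity check on the snippet above: the user/real ratio stays roughly 
> constant at about 4x while the real pause time grows from ~0.5 s to ~200 s, 
> and the amount surviving each ParNew stays small, so it looks like the 
> collection threads themselves are getting slower rather than the collections 
> getting bigger. Assuming each record sits on a single line in the actual log 
> file, a rough way to pull out that trend is:
> 
>   awk '/ParNew/ { print $1, $(NF-3), $(NF-1) }' gc.log
> 
> which prints the timestamp plus the user= and real= figures for each young 
> collection.)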
> 
> 
> 
> On Fri, Dec 22, 2017 at 3:12 AM, Kirk Pepperdine <[email protected]> wrote:
> Hi,
> 
> If GC logging is giving you grief, you’ve got bigger problems, because GC 
> logging writes a very small amount of data. If it is an issue, there will be a 
> hint of it in the GC logs. At any rate, the problem is with GC, and turning 
> off GC logging leaves you blind to any possible solutions.
> 
> Kind regards,
> Kirk Pepperdine
> 
>> On Dec 21, 2017, at 9:40 PM, Josh Humphries <[email protected]> wrote:
>> 
>> Like Erik said, check that the host isn't swapping. An old-gen GC can end up 
>> scanning parts of the heap that the OS may have paged out if it encountered 
>> memory pressure.
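>> 
>> (For what it's worth, a couple of standard ways to watch for swap activity 
>> while the long pauses are happening, using the usual procps/sysstat tools:
>> 
>>   vmstat 1     # si/so columns show swap-in/swap-out per second
>>   sar -W 1     # pswpin/s and pswpout/s over time
>> 
>> If those stay at zero during a long pause, guest-level swapping is unlikely 
>> to be the cause, though they only see what the guest kernel sees, not 
>> hypervisor-level paging.)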
>> 
>> Another issue I've seen cause this sort of thing is having GC logging turned 
>> on while running into I/O problems. GC logging is not async. I've seen GC 
>> logging itself, to a drive that had a SCSI firmware issue, cause silly-long 
>> GC times. You can possibly corroborate this by checking the kernel logs for 
>> other signs of disk/driver issues.
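>> 
>> (A cheap way to rule the log writes in or out, assuming a JDK 8 HotSpot JVM 
>> with a tmpfs mount at /dev/shm; the path and sizes below are only examples. 
>> Point the GC log at tmpfs and see whether the long pauses persist:
>> 
>>   -Xloggc:/dev/shm/gc.log -XX:+UseGCLogFileRotation \
>>   -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M
>> 
>> and scan the kernel log for disk/driver complaints, e.g.
>> 
>>   dmesg -T | grep -iE 'error|i/o|scsi'
>> )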
>> 
>> 
>> ----
>> Josh Humphries
>> [email protected] <mailto:[email protected]>
>> 
>> On Thu, Dec 21, 2017 at 3:36 PM, Erik Svensson <[email protected]> wrote:
>> If it’s on a virtual host, then I’ve seen the same behaviour when the 
>> underlying hardware started swapping. I’ve also seen the same thing when the 
>> other hosts on the hardware had high CPU demands (i.e. my host was CPU 
>> starved). But, as Martin said: GC logging, always.
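>> 
>> (On a cloud guest, one observable hint of that is CPU steal time: the "st" 
>> column in vmstat 1, or %steal in sar -u 1, counts time the hypervisor did 
>> not give a runnable vCPU any cycles. Sustained non-zero steal during the 
>> long pauses would point at exactly this kind of starvation.)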
>> 
>> Cheers,
>> 
>> Erik Svensson
>> D
>> 
>>> On 21 Dec 2017, at 18:25, Abhishek Agarwal <[email protected]> wrote:
>>> 
>>> Hello everyone,
>>> Recently, one of our JVMs went into a weird state where minor GCs started 
>>> kicking in almost every minute, and what is most surprising is that a 
>>> single minor GC is taking as much as 100 seconds. The heap size is around 
>>> 12 GB and the young gen is about 30% of that. As soon as GC starts, CPU 
>>> usage understandably increases as well. I cannot figure out how one minor 
>>> GC could take this much time. The machine does not have swapping enabled. 
>>> I also ran sar and confirmed there was no swapping during the issue. We 
>>> are using parallel GC for the young gen and CMS for the old gen. How can I 
>>> go about troubleshooting this further? Any pointers will be appreciated. 
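>>> 
>>> (For reference, and judging from the ParNew entries in the log quoted 
>>> elsewhere in this thread, that collector pairing on a JDK 8 HotSpot JVM 
>>> typically corresponds to flags along these lines; the log path is only 
>>> illustrative:
>>> 
>>>   -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
>>>   -Xloggc:/var/log/app/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
>>>   -XX:+PrintGCApplicationStoppedTime
>>> 
>>> Keeping that logging on, as recommended elsewhere in the thread, is the 
>>> first step in narrowing the pauses down.)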
>> 
>> 
> 
> -- 
> Regards,
> Abhishek Agarwal
> 

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
