Thank you for the pointers. We have GC logs, but they didn't help much.

A few corrections:
Old gen size is ~20 GB
Young gen size is ~1.5 GB

The skew between young gen and old gen size has been fixed. Here is a
snippet of the GC logs from when the pause time increased to ~200 seconds.
By the way, the service is running on AWS.


59549.736: [GC (Allocation Failure) 59549.736: [ParNew:
1688278K->56721K(1887488K), 0.5518315 secs] 4474842K->3433110K(21480704K),
0.5526896 secs] [Times: user=1.92 sys=0.25, real=0.55 secs]

59581.163: [GC (Allocation Failure) 59581.164: [ParNew:
1734545K->43688K(1887488K), 0.4640033 secs] 5110934K->3941007K(21480704K),
0.4650333 secs] [Times: user=1.59 sys=0.26, real=0.46 secs]

59617.565: [GC (Allocation Failure) 59617.565: [ParNew:
1721512K->40500K(1887488K), 0.1154788 secs] 5618831K->3937818K(21480704K),
0.1166312 secs] [Times: user=0.44 sys=0.00, real=0.12 secs]

59645.910: [GC (Allocation Failure) 59645.911: [ParNew:
1718324K->46789K(1887488K), 0.1918905 secs] 5615642K->3944132K(21480704K),
0.1930935 secs] [Times: user=0.77 sys=0.00, real=0.20 secs]

59692.159: [GC (Allocation Failure) 59692.159: [ParNew:
1724613K->45533K(1887488K), 5.4467875 secs] 5621956K->3944000K(21480704K),
5.4480386 secs] [Times: user=21.86 sys=0.00, real=5.45 secs]

59716.348: [GC (Allocation Failure) 59716.348: [ParNew:
1723357K->49256K(1887488K), 8.2982698 secs] 5621824K->3948162K(21480704K),
8.2994285 secs] [Times: user=33.47 sys=0.00, real=8.30 secs]

59751.501: [GC (Allocation Failure) 59751.502: [ParNew:
1727080K->47156K(1887488K), 9.4546047 secs] 5625986K->3946152K(21480704K),
9.4558187 secs] [Times: user=37.88 sys=0.00, real=9.45 secs]

59779.293: [GC (Allocation Failure) 59779.293: [ParNew:
1724980K->31847K(1887488K), 98.3531267 secs] 5623976K->3949810K(21480704K),
98.3542271 secs] [Times: user=393.93 sys=0.00, real=98.35 secs]

59933.098: [GC (Allocation Failure) 59933.099: [ParNew:
1709671K->24851K(1887488K), 155.8798281 secs]
5627634K->3947185K(21480704K), 155.8810828 secs] [Times: user=629.60
sys=0.00, real=155.88 secs]

60138.148: [GC (Allocation Failure) 60138.148: [ParNew:
1702675K->22690K(1887488K), 162.4118957 secs]
5625009K->3945214K(21480704K), 162.4132521 secs] [Times: user=659.75
sys=0.00, real=162.42 secs]

60352.343: [GC (Allocation Failure) 60352.343: [ParNew:
1700514K->26504K(1887488K), 166.2566004 secs]
5623038K->3949247K(21480704K), 166.2577295 secs] [Times: user=672.08
sys=0.00, real=166.25 secs]

60576.273: [GC (Allocation Failure) 60576.274: [ParNew:
1704328K->49624K(1887488K), 168.2311204 secs]
5627071K->3972465K(21480704K), 168.2322492 secs] [Times: user=660.16
sys=0.00, real=168.23 secs]

60791.458: [GC (Allocation Failure) 60791.459: [ParNew:
1727448K->33138K(1887488K), 174.0129541 secs]
5650289K->3956199K(21480704K), 174.0139678 secs] [Times: user=680.44
sys=0.00, real=174.02 secs]

61022.223: [GC (Allocation Failure) 61022.223: [ParNew:
1710962K->29340K(1887488K), 176.6459303 secs]
5634023K->3952483K(21480704K), 176.6474561 secs] [Times: user=690.10
sys=0.00, real=176.64 secs]

61246.967: [GC (Allocation Failure) 61246.967: [ParNew:
1707164K->31914K(1887488K), 182.4714040 secs]
5630307K->3955486K(21480704K), 182.4724297 secs] [Times: user=721.68
sys=0.00, real=182.47 secs]

61483.620: [GC (Allocation Failure) 61483.621: [ParNew:
1709738K->27879K(1887488K), 187.2219554 secs]
5633310K->3951691K(21480704K), 187.2230896 secs] [Times: user=783.68
sys=0.00, real=187.23 secs]

61723.432: [GC (Allocation Failure) 61723.433: [ParNew:
1705703K->22446K(1887488K), 192.4606406 secs]
5629515K->3946498K(21480704K), 192.4618564 secs] [Times: user=721.75
sys=0.00, real=192.46 secs]

61965.560: [GC (Allocation Failure) 61965.561: [ParNew:
1700270K->30881K(1887488K), 197.1221819 secs]
5624322K->3955113K(21480704K), 197.1233286 secs] [Times: user=806.53
sys=0.00, real=197.13 secs]

62211.173: [GC (Allocation Failure) 62211.174: [ParNew:
1708705K->22526K(1887488K), 199.7864929 secs]
5632937K->3946917K(21480704K), 199.7875671 secs] [Times: user=811.19
sys=0.00, real=199.78 secs]

62464.936: [GC (Allocation Failure) 62464.937: [ParNew:
1700350K->47417K(1887488K), 203.1958095 secs]
5624741K->3972025K(21480704K), 203.1969591 secs] [Times: user=789.37
sys=0.00, real=203.20 secs]

62682.808: [GC (Allocation Failure) 62682.808: [ParNew:
1724605K->37951K(1887488K), 206.8942846 secs]
5649212K->3962723K(21480704K), 206.8948168 secs] [Times: user=821.45
sys=0.00, real=206.90 secs]
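One thing that stands out in the later events: user/real sits near 4 (e.g. 629.60 / 155.88 ≈ 4.0), i.e. the ParNew threads appear to be running the whole time, just very slowly. To eyeball that across the log, here is a quick throwaway Python sketch (my own parsing, not from any GC tool; the regex only targets lines shaped like the ones above):

```python
import re

# Matches "<ts>: [GC (Allocation Failure) ... user=U sys=S, real=R secs"
# across line breaks (the events above wrap over several lines).
LINE = re.compile(r"(\d+\.\d+): \[GC \(Allocation Failure\).*?"
                  r"user=(\d+\.\d+) sys=(\d+\.\d+), real=(\d+\.\d+) secs",
                  re.DOTALL)

def pauses(log_text):
    """Return (timestamp, user, real, user/real) per ParNew event."""
    out = []
    for ts, user, _sys, real in LINE.findall(log_text):
        real = float(real)
        out.append((float(ts), float(user), real,
                    float(user) / real if real else 0.0))
    return out

sample = """59779.293: [GC (Allocation Failure) 59779.293: [ParNew:
1724980K->31847K(1887488K), 98.3531267 secs] 5623976K->3949810K(21480704K),
98.3542271 secs] [Times: user=393.93 sys=0.00, real=98.35 secs]"""

for ts, user, real, ratio in pauses(sample):
    print(f"t={ts:.0f}s real={real:.2f}s user/real={ratio:.1f}")
```

If the ratio holds steady around the GC thread count while real time grows, that would be consistent with the CPU-starvation/steal theory raised below rather than with threads blocking on something.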



On Fri, Dec 22, 2017 at 3:12 AM, Kirk Pepperdine <[email protected]> wrote:

> Hi,
>
> If GC logging is giving you grief, you’ve got bigger problems, because GC
> logging writes only a very small amount of data. If this is the issue,
> there will be a hint of it in the GC logs. At any rate, the problem is
> with GC, and turning off GC logging leaves you blind to any possible
> solutions.
>
> Kind regards,
> Kirk Pepperdine
>
> On Dec 21, 2017, at 9:40 PM, Josh Humphries <[email protected]> wrote:
>
> Like Erik said, check that the host isn't swapping. An old-gen GC can end
> up scanning parts of the heap that the OS may have paged out if it
> encountered memory pressure.
>
> Another issue I've seen cause this sort of thing is having GC logging
> turned on while hitting I/O trouble: GC logging is not async. I've seen
> GC logging itself, to a drive with a SCSI firmware issue, cause
> silly-long GC times. You can possibly corroborate by checking the kernel
> logs for other signals of disk/driver issues.
>
>
> ----
> *Josh Humphries*
> [email protected]
>
> On Thu, Dec 21, 2017 at 3:36 PM, Erik Svensson <[email protected]> wrote:
>
>> If it’s on a virtual host then I’ve seen the same behaviour when the
>> underlying hw started swapping. I’ve also seen the same when the other
>> hosts on the hw had high cpu demands (i.e. my host was cpu starved).
>> But, as Martin said, GC logging always.
>>
>> cheers
>>
>> Erik Svensson
>> D
>>
>> On 21 Dec 2017, at 18:25, Abhishek Agarwal <[email protected]> wrote:
>>
>> Hello everyone,
>> Recently, one of my JVMs went into a weird state in which minor GCs
>> started kicking in almost every minute, and, most surprisingly, a single
>> minor GC was taking as much as 100 seconds. The heap size is around 12
>> GB and the young gen is about 30% of that. As soon as GC starts, CPU
>> usage understandably increases as well. I am not able to figure out how
>> one minor GC could take this much time. The machine does not have
>> swapping enabled, and I ran sar and confirmed there was no swapping
>> during the issue. We are using the parallel collector for the young gen
>> and CMS for the old gen. How can I go about troubleshooting this
>> further? Any pointers will be appreciated.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "mechanical-sympathy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>>
>
>
>



-- 
Regards,
Abhishek Agarwal
