Thank you for the pointers. We have GC logs, but they didn't help much. A few corrections: the old gen size is ~20 GB and the young gen size is ~1.5 GB.
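For context, a heap laid out like that (CMS old gen with a ParNew young gen, as the log format below indicates) typically comes from flags along these lines; this is an illustrative sketch, not the service's actual command line:

```shell
# Illustrative only: sizes approximating ~20 GB old gen + ~1.5 GB young gen.
# On JDK 8, -XX:+UseParNewGC is implied by -XX:+UseConcMarkSweepGC; shown for clarity.
java -Xms22g -Xmx22g -Xmn2g \
     -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log \
     -jar service.jar
```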
The skew between young gen and old gen size has been fixed. Here is a snippet of the GC logs from when the pause time increased to ~200 seconds. By the way, the service is running on AWS.

59549.736: [GC (Allocation Failure) 59549.736: [ParNew: 1688278K->56721K(1887488K), 0.5518315 secs] 4474842K->3433110K(21480704K), 0.5526896 secs] [Times: user=1.92 sys=0.25, real=0.55 secs]
59581.163: [GC (Allocation Failure) 59581.164: [ParNew: 1734545K->43688K(1887488K), 0.4640033 secs] 5110934K->3941007K(21480704K), 0.4650333 secs] [Times: user=1.59 sys=0.26, real=0.46 secs]
59617.565: [GC (Allocation Failure) 59617.565: [ParNew: 1721512K->40500K(1887488K), 0.1154788 secs] 5618831K->3937818K(21480704K), 0.1166312 secs] [Times: user=0.44 sys=0.00, real=0.12 secs]
59645.910: [GC (Allocation Failure) 59645.911: [ParNew: 1718324K->46789K(1887488K), 0.1918905 secs] 5615642K->3944132K(21480704K), 0.1930935 secs] [Times: user=0.77 sys=0.00, real=0.20 secs]
59692.159: [GC (Allocation Failure) 59692.159: [ParNew: 1724613K->45533K(1887488K), 5.4467875 secs] 5621956K->3944000K(21480704K), 5.4480386 secs] [Times: user=21.86 sys=0.00, real=5.45 secs]
59716.348: [GC (Allocation Failure) 59716.348: [ParNew: 1723357K->49256K(1887488K), 8.2982698 secs] 5621824K->3948162K(21480704K), 8.2994285 secs] [Times: user=33.47 sys=0.00, real=8.30 secs]
59751.501: [GC (Allocation Failure) 59751.502: [ParNew: 1727080K->47156K(1887488K), 9.4546047 secs] 5625986K->3946152K(21480704K), 9.4558187 secs] [Times: user=37.88 sys=0.00, real=9.45 secs]
59779.293: [GC (Allocation Failure) 59779.293: [ParNew: 1724980K->31847K(1887488K), 98.3531267 secs] 5623976K->3949810K(21480704K), 98.3542271 secs] [Times: user=393.93 sys=0.00, real=98.35 secs]
59933.098: [GC (Allocation Failure) 59933.099: [ParNew: 1709671K->24851K(1887488K), 155.8798281 secs] 5627634K->3947185K(21480704K), 155.8810828 secs] [Times: user=629.60 sys=0.00, real=155.88 secs]
60138.148: [GC (Allocation Failure) 60138.148: [ParNew: 1702675K->22690K(1887488K), 162.4118957 secs] 5625009K->3945214K(21480704K), 162.4132521 secs] [Times: user=659.75 sys=0.00, real=162.42 secs]
60352.343: [GC (Allocation Failure) 60352.343: [ParNew: 1700514K->26504K(1887488K), 166.2566004 secs] 5623038K->3949247K(21480704K), 166.2577295 secs] [Times: user=672.08 sys=0.00, real=166.25 secs]
60576.273: [GC (Allocation Failure) 60576.274: [ParNew: 1704328K->49624K(1887488K), 168.2311204 secs] 5627071K->3972465K(21480704K), 168.2322492 secs] [Times: user=660.16 sys=0.00, real=168.23 secs]
60791.458: [GC (Allocation Failure) 60791.459: [ParNew: 1727448K->33138K(1887488K), 174.0129541 secs] 5650289K->3956199K(21480704K), 174.0139678 secs] [Times: user=680.44 sys=0.00, real=174.02 secs]
61022.223: [GC (Allocation Failure) 61022.223: [ParNew: 1710962K->29340K(1887488K), 176.6459303 secs] 5634023K->3952483K(21480704K), 176.6474561 secs] [Times: user=690.10 sys=0.00, real=176.64 secs]
61246.967: [GC (Allocation Failure) 61246.967: [ParNew: 1707164K->31914K(1887488K), 182.4714040 secs] 5630307K->3955486K(21480704K), 182.4724297 secs] [Times: user=721.68 sys=0.00, real=182.47 secs]
61483.620: [GC (Allocation Failure) 61483.621: [ParNew: 1709738K->27879K(1887488K), 187.2219554 secs] 5633310K->3951691K(21480704K), 187.2230896 secs] [Times: user=783.68 sys=0.00, real=187.23 secs]
61723.432: [GC (Allocation Failure) 61723.433: [ParNew: 1705703K->22446K(1887488K), 192.4606406 secs] 5629515K->3946498K(21480704K), 192.4618564 secs] [Times: user=721.75 sys=0.00, real=192.46 secs]
61965.560: [GC (Allocation Failure) 61965.561: [ParNew: 1700270K->30881K(1887488K), 197.1221819 secs] 5624322K->3955113K(21480704K), 197.1233286 secs] [Times: user=806.53 sys=0.00, real=197.13 secs]
62211.173: [GC (Allocation Failure) 62211.174: [ParNew: 1708705K->22526K(1887488K), 199.7864929 secs] 5632937K->3946917K(21480704K), 199.7875671 secs] [Times: user=811.19 sys=0.00, real=199.78 secs]
62464.936: [GC (Allocation Failure) 62464.937: [ParNew: 1700350K->47417K(1887488K), 203.1958095 secs] 5624741K->3972025K(21480704K), 203.1969591 secs] [Times: user=789.37 sys=0.00, real=203.20 secs]
62682.808: [GC (Allocation Failure) 62682.808: [ParNew: 1724605K->37951K(1887488K), 206.8942846 secs] 5649212K->3962723K(21480704K), 206.8948168 secs] [Times: user=821.45 sys=0.00, real=206.90 secs]

On Fri, Dec 22, 2017 at 3:12 AM, Kirk Pepperdine <[email protected]> wrote:

> Hi,
>
> If GC logging is giving you grief, you've got bigger problems, because GC
> logging writes only a very small amount of data. If this is an issue, there
> will be a hint of it in the GC logs. At any rate, the problem is with GC,
> and turning off GC logging leaves you blind to any possible solutions.
>
> Kind regards,
> Kirk Pepperdine
>
> On Dec 21, 2017, at 9:40 PM, Josh Humphries <[email protected]> wrote:
>
> Like Erik said, check that the host isn't swapping. An old-gen GC can end
> up scanning parts of the heap that the OS may have paged out if it
> encountered memory pressure.
>
> Another issue I've seen that causes this sort of thing is having GC
> logging turned on and then hitting I/O issues. GC logging is not async.
> I've seen the GC logging itself, to a drive that had a SCSI firmware issue,
> cause silly-long GC times. You can possibly corroborate by checking kernel
> logs to see if there are other signals of disk/driver issues.
>
> ----
> *Josh Humphries*
> [email protected]
>
> On Thu, Dec 21, 2017 at 3:36 PM, Erik Svensson <[email protected]> wrote:
>
>> If it's on a virtual host, then I've seen the same behaviour when the
>> underlying hardware started swapping. I've also seen the same when the
>> other hosts on the hardware had high CPU demands (i.e. my host was CPU
>> starved). But, as Martin said, GC logging always.
>>
>> cheers,
>>
>> Erik Svensson
>> D
>>
>> On 21 Dec 2017, at 18:25, Abhishek Agarwal <[email protected]> wrote:
>>
>> Hello everyone,
>> Recently, I had one of our JVMs go into a weird state wherein minor GCs
>> start kicking in almost every minute, but what is most surprising is that
>> a single minor GC is taking as much as 100 seconds. The heap size is
>> around 12 gigs and the young gen is about 30% of that. As soon as GC
>> starts, CPU usage understandably starts increasing as well. I am not able
>> to figure out how one minor GC could take this much time. The machine
>> does not have swapping enabled. I also ran sar and confirmed there was no
>> swapping during the issue. We are using parallel GC for the young gen and
>> CMS for the old gen. How can I go about troubleshooting this further? Any
>> pointers will be appreciated.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "mechanical-sympathy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.

--
Regards,
Abhishek Agarwal
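A note on reading the log above: the `[Times: user=... sys=... real=...]` stanza distinguishes starved GC threads (user/real collapsing toward 1 or below, as you would expect with swapping, CPU steal, or blocked I/O) from GC threads that are genuinely busy (user/real holding steady at roughly the parallel GC thread count). A small sketch of that check; the parsing helper is hypothetical and not from this thread:

```python
import re

# Each ParNew entry ends with: [Times: user=U sys=S, real=R secs]
TIMES = re.compile(r"\[Times: user=([\d.]+) sys=([\d.]+), real=([\d.]+) secs\]")

def pause_stats(log_lines):
    """Return (user, sys, real, user/real) for each GC entry found."""
    stats = []
    for line in log_lines:
        m = TIMES.search(line)
        if m:
            user, sys_t, real = map(float, m.groups())
            ratio = user / real if real else float("inf")
            stats.append((user, sys_t, real, ratio))
    return stats

# Two entries copied from the log above (abridged to the Times stanza):
# a normal sub-second pause and one of the 98-second pauses.
sample = [
    "59549.736: [GC ... [Times: user=1.92 sys=0.25, real=0.55 secs]",
    "59779.293: [GC ... [Times: user=393.93 sys=0.00, real=98.35 secs]",
]

for user, sys_t, real, ratio in pause_stats(sample):
    print(f"real={real:7.2f}s  user/real={ratio:4.1f}")
```

In this particular log the ratio stays near 4 even for the 98-second and 200-second pauses, which suggests the GC worker threads are burning real CPU the whole time rather than being stalled by the host.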
