+1 on disabling THP Charlie
> On Oct 29, 2016, at 10:07 AM, Vitaly Davidovich <vita...@gmail.com> wrote: > > David, > > Ask them to turn off THP - it's a known source of large latency due to the > kernel doing page defragmentation; your app takes a page fault, and boom - > the kernel may start doing defragmentation to make a huge page available. > You can search online for THP issues. The symptoms are similar to yours - > very high sys time. > > If they turn it off and still get same lengthy parnew pauses, then it's > clearly something else but at least we'll eliminate THP as the culprit. > >> On Saturday, October 29, 2016, David Ely <david....@unboundid.com> wrote: >> Thank you for the response. Yes. meminfo (see full output below) shows ~80GB >> of AnonHugePages, which is pretty close to the size of the JVM (full output >> below). Looking back through previous information that we have from this >> customer, transparent huge pages have been turned on for years. We've asked >> them for anything else that might have changed in this environment. >> >> Are there any other JVM options that we could enable that would shed light >> on what's going on within the ParNew? Would -XX:+PrintTLAB -XX:+PrintPLAB >> -XX:PrintFLSStatistics=1 show anything useful? >> >> David >> >> >> MemTotal: 264396572 kB >> MemFree: 2401576 kB >> Buffers: 381564 kB >> Cached: 172673120 kB >> SwapCached: 0 kB >> Active: 163439836 kB >> Inactive: 90737452 kB >> Active(anon): 76910848 kB >> Inactive(anon): 4212580 kB >> Active(file): 86528988 kB >> Inactive(file): 86524872 kB >> Unevictable: 0 kB >> Mlocked: 0 kB >> SwapTotal: 16236540 kB >> SwapFree: 16236540 kB >> Dirty: 14552 kB >> Writeback: 0 kB >> AnonPages: 81111768 kB >> Mapped: 31312 kB >> Shmem: 212 kB >> Slab: 6078732 kB >> SReclaimable: 5956052 kB >> SUnreclaim: 122680 kB >> KernelStack: 41296 kB >> PageTables: 171324 kB >> NFS_Unstable: 0 kB >> Bounce: 0 kB >> WritebackTmp: 0 kB >> CommitLimit: 148434824 kB >> Committed_AS: 93124984 kB >> VmallocTotal: 34359738367 kB >> VmallocUsed: 686780 kB >> VmallocChunk: 34225639420 kB >> HardwareCorrupted: 0 kB >> AnonHugePages: 80519168 kB >> HugePages_Total: 0 >> HugePages_Free: 0 >> HugePages_Rsvd: 0 >> HugePages_Surp: 0 >> Hugepagesize: 2048 kB >> DirectMap4k: 5132 kB >> DirectMap2M: 1957888 kB >> DirectMap1G: 266338304 kB >> >> >>> On Fri, Oct 28, 2016 at 8:04 PM, Vitaly Davidovich <vita...@gmail.com> >>> wrote: >>> >>> >>>> On Friday, October 28, 2016, David Ely <david....@unboundid.com> wrote: >>>> While typical ParNew GC times are 50ms, our application is occasionally >>>> hitting ParNew times that are over 15 seconds for one of our customers, >>>> and we have no idea why. Looking at the full GC log file: >>>> >>>> 382250 ParNew GCs are < 1 second >>>> 9303 are 100ms to 1 second >>>> 1267 are 1 second to 2 seconds >>>> 99 are 2 seconds to 10 seconds >>>> 24 are > 10 seconds, 48 seconds being the max >>>> >>>> The long ones are somewhat bursty as you can see from looking at the line >>>> numbers in the GC log: >>>> >>>> $ egrep -n '(ParNew.*real=[1-9][0-9]\)' gc.log.0 >>>> >>>> 12300:2016-10-21T01:03:20.380+0000: 20278.069: >>>> [GC2016-10-21T01:03:20.380+0000: 20278.070: [ParNew: >>>> 1697741K->10024K(1887488K), 16.9913450 secs] >>>> 33979542K->32817239K(84724992K), 16.9921050 secs] [Times: user=541.32 >>>> sys=14.37, real=16.99 secs] >>>> 43730:2016-10-21T14:12:25.050+0000: 67622.740: >>>> [GC2016-10-21T14:12:25.051+0000: 67622.740: [ParNew: >>>> 1728194K->33817K(1887488K), 12.7508470 secs] >>>> 49737924K->48320707K(84724992K), 12.7517840 secs] [Times: user=405.89 >>>> sys=11.05, real=12.75 secs] >>>> 44079:2016-10-21T14:18:55.172+0000: 68012.862: >>>> [GC2016-10-21T14:18:55.173+0000: 68012.862: [ParNew: >>>> 1698371K->26958K(1887488K), 12.7384460 secs] >>>> 50339815K->48930730K(84724992K), 12.7392360 secs] [Times: user=406.58 >>>> sys=11.29, real=12.73 secs] >>>> 50151:2016-10-21T17:10:14.471+0000: 78292.160: >>>> [GC2016-10-21T17:10:14.471+0000: 78292.161: [ParNew: >>>> 1713813K->40968K(1887488K), 18.6593320 secs] >>>> 49366906K->47959129K(84724992K), 18.6602550 secs] [Times: user=590.03 >>>> sys=17.45, real=18.66 secs] >>>> 56073:2016-10-21T19:59:36.847+0000: 88454.536: >>>> [GC2016-10-21T19:59:36.847+0000: 88454.537: [ParNew: >>>> 1685720K->20763K(1887488K), 16.0840200 secs] >>>> 50704025K->49302131K(84724992K), 16.0848810 secs] [Times: user=487.00 >>>> sys=16.84, real=16.09 secs] >>>> 78987:2016-10-22T05:49:25.623+0000: 123843.312: >>>> [GC2016-10-22T05:49:25.623+0000: 123843.313: [ParNew: >>>> 1709771K->22678K(1887488K), 10.9933380 secs] >>>> 43323834K->41914203K(84724992K), 10.9943060 secs] [Times: user=349.67 >>>> sys=9.84, real=10.99 secs] >>>> 79104:2016-10-22T05:59:26.382+0000: 124444.071: >>>> [GC2016-10-22T05:59:26.382+0000: 124444.072: [ParNew: >>>> 1697024K->22260K(1887488K), 11.5490390 secs] >>>> 44558499K->43145880K(84724992K), 11.5499650 secs] [Times: user=367.73 >>>> sys=10.01, real=11.55 secs] >>>> 79504:2016-10-22T06:09:36.983+0000: 125054.672: >>>> [GC2016-10-22T06:09:36.984+0000: 125054.673: [ParNew: >>>> 1688112K->4769K(1887488K), 14.1528810 secs] >>>> 46684947K->45263748K(84724992K), 14.1539860 secs] [Times: user=452.28 >>>> sys=12.71, real=14.15 secs] >>>> 79772:2016-10-22T06:30:36.130+0000: 126313.819: >>>> [GC2016-10-22T06:30:36.130+0000: 126313.820: [ParNew: >>>> 1725520K->35893K(1887488K), 14.4479670 secs] >>>> 48989739K->47563879K(84724992K), 14.4488810 secs] [Times: user=461.60 >>>> sys=13.04, real=14.45 secs] >>>> 80087:2016-10-22T06:37:07.202+0000: 126704.891: >>>> [GC2016-10-22T06:37:07.202+0000: 126704.892: [ParNew: >>>> 1698021K->23440K(1887488K), 15.7039920 secs] >>>> 50517163K->49105987K(84724992K), 15.7050040 secs] [Times: user=497.65 >>>> sys=14.75, real=15.70 secs] >>>> 89969:2016-10-22T13:54:27.978+0000: 152945.667: >>>> [GC2016-10-22T13:54:27.978+0000: 152945.668: [ParNew: >>>> 1834914K->15978K(1887488K), 11.5637150 secs] >>>> 48716340K->47307673K(84724992K), 11.5645440 secs] [Times: user=367.77 >>>> sys=10.01, real=11.57 secs] >>>> 90200:2016-10-22T14:05:02.717+0000: 153580.407: >>>> [GC2016-10-22T14:05:02.718+0000: 153580.407: [ParNew: >>>> 1684626K->7078K(1887488K), 17.3424650 secs] >>>> 50361539K->48947648K(84724992K), 17.3433490 secs] [Times: user=554.39 >>>> sys=15.81, real=17.34 secs] >>>> 90299:2016-10-22T14:14:30.521+0000: 154148.210: >>>> [GC2016-10-22T14:14:30.521+0000: 154148.211: [ParNew: >>>> 1690850K->6078K(1887488K), 13.1699350 secs] >>>> 51455784K->50033156K(84724992K), 13.1708900 secs] [Times: user=419.55 >>>> sys=11.54, real=13.17 secs] >>>> 261329:2016-10-26T00:06:44.499+0000: 448882.189: >>>> [GC2016-10-26T00:06:44.500+0000: 448882.189: [ParNew: >>>> 1705614K->22224K(1887488K), 17.5831730 secs] >>>> 40683698K->39525817K(84724992K), 17.5843270 secs] [Times: user=561.85 >>>> sys=14.79, real=17.58 secs] >>>> 261935:2016-10-26T00:13:34.277+0000: 449291.967: >>>> [GC2016-10-26T00:13:34.278+0000: 449291.967: [ParNew: >>>> 1690085K->26707K(1887488K), 13.9331790 secs] >>>> 43792178K->42655000K(84724992K), 13.9340780 secs] [Times: user=446.36 >>>> sys=11.45, real=13.93 secs] >>>> 262143:2016-10-26T00:20:09.397+0000: 449687.087: >>>> [GC2016-10-26T00:20:09.398+0000: 449687.087: [ParNew: >>>> 1696593K->27078K(1887488K), 40.3344500 secs] >>>> 45588644K->44444949K(84724992K), 40.3355430 secs] [Times: user=1248.15 >>>> sys=43.07, real=40.33 secs] >>>> 262275:2016-10-26T00:27:02.196+0000: 450099.886: >>>> [GC2016-10-26T00:27:02.197+0000: 450099.886: [ParNew: >>>> 1683406K->17853K(1887488K), 17.7472360 secs] >>>> 46908499K->45506131K(84724992K), 17.7482260 secs] [Times: user=567.03 >>>> sys=16.10, real=17.75 secs] >>>> 262282:2016-10-26T00:27:29.448+0000: 450127.138: >>>> [GC2016-10-26T00:27:29.449+0000: 450127.138: [ParNew: >>>> 1687737K->10499K(1887488K), 35.4934000 secs] >>>> 47195678K->46044477K(84724992K), 35.4943230 secs] [Times: user=1131.34 >>>> sys=31.87, real=35.49 secs] >>>> 262631:2016-10-26T00:34:17.632+0000: 450535.321: >>>> [GC2016-10-26T00:34:17.632+0000: 450535.321: [ParNew: >>>> 1687590K->10226K(1887488K), 21.4043600 secs] >>>> 49431427K->48018504K(84724992K), 21.4052230 secs] [Times: user=682.50 >>>> sys=19.46, real=21.41 secs] >>>> 262844:2016-10-26T00:41:08.118+0000: 450945.808: >>>> [GC2016-10-26T00:41:08.119+0000: 450945.808: [ParNew: >>>> 1692928K->11302K(1887488K), 48.2899260 secs] >>>> 51073216K->49915878K(84724992K), 48.2909550 secs] [Times: user=1493.17 >>>> sys=53.55, real=48.28 secs] >>>> 345421:2016-10-27T04:17:59.617+0000: 550357.306: >>>> [GC2016-10-27T04:17:59.618+0000: 550357.307: [ParNew: >>>> 1695052K->22991K(1887488K), 33.8707510 secs] >>>> 46334738K->45187822K(84724992K), 33.8718980 secs] [Times: user=1081.31 >>>> sys=30.59, real=33.86 secs] >>>> 345510:2016-10-27T04:24:11.721+0000: 550729.411: >>>> [GC2016-10-27T04:24:11.722+0000: 550729.411: [ParNew: >>>> 1705080K->20401K(1887488K), 18.9795540 secs] >>>> 47388073K->45965537K(84724992K), 18.9805410 secs] [Times: user=606.94 >>>> sys=17.25, real=18.98 secs] >>>> 345514:2016-10-27T04:24:36.695+0000: 550754.385: >>>> [GC2016-10-27T04:24:36.696+0000: 550754.385: [ParNew: >>>> 1707810K->32640K(1887488K), 30.9728200 secs] >>>> 47656489K->46506725K(84724992K), 30.9737300 secs] [Times: user=917.67 >>>> sys=33.07, real=30.97 secs] >>>> 345777:2016-10-27T04:31:30.102+0000: 551167.791: >>>> [GC2016-10-27T04:31:30.102+0000: 551167.791: [ParNew: >>>> 1704518K->30860K(1887488K), 38.0976720 secs] >>>> 49570144K->48422333K(84724992K), 38.0984950 secs] [Times: user=1215.89 >>>> sys=34.79, real=38.09 secs] >>>> >>>> Context around a single instance is fairly normal: >>>> >>>> 345773-2016-10-27T04:31:28.032+0000: 551165.721: >>>> [GC2016-10-27T04:31:28.033+0000: 551165.722: [ParNew: >>>> 1685858K->8851K(1887488K), 0.0480250 secs] >>>> 49545909K->47870050K(84724992K), 0.0490200 secs] [Times: user=1.47 >>>> sys=0.02, real=0.05 secs] >>>> 345774-2016-10-27T04:31:28.635+0000: 551166.324: >>>> [GC2016-10-27T04:31:28.636+0000: 551166.325: [ParNew: >>>> 1686675K->10456K(1887488K), 0.0463570 secs] >>>> 49547874K->47872545K(84724992K), 0.0473410 secs] [Times: user=1.41 >>>> sys=0.04, real=0.05 secs] >>>> 345775-2016-10-27T04:31:29.205+0000: 551166.894: >>>> [GC2016-10-27T04:31:29.205+0000: 551166.894: [ParNew: >>>> 1688280K->12733K(1887488K), 0.0487100 secs] >>>> 49550369K->47876404K(84724992K), 0.0496310 secs] [Times: user=1.47 >>>> sys=0.04, real=0.05 secs] >>>> 345776-2016-10-27T04:31:29.798+0000: 551167.487: >>>> [GC2016-10-27T04:31:29.798+0000: 551167.488: [ParNew: >>>> 1690557K->26694K(1887488K), 0.0471170 secs] >>>> 49554228K->47892320K(84724992K), 0.0481180 secs] [Times: user=1.40 >>>> sys=0.02, real=0.05 secs] >>>> 345777:2016-10-27T04:31:30.102+0000: 551167.791: >>>> [GC2016-10-27T04:31:30.102+0000: 551167.791: [ParNew: >>>> 1704518K->30860K(1887488K), 38.0976720 secs] >>>> 49570144K->48422333K(84724992K), 38.0984950 secs] [Times: user=1215.89 >>>> sys=34.79, real=38.09 secs] >>>> 345778-2016-10-27T04:32:08.449+0000: 551206.139: >>>> [GC2016-10-27T04:32:08.450+0000: 551206.139: [ParNew: >>>> 1708684K->122033K(1887488K), 0.0664280 secs] >>>> 50100157K->48528020K(84724992K), 0.0672860 secs] [Times: user=1.60 >>>> sys=0.05, real=0.07 secs] >>>> 345779-2016-10-27T04:32:09.090+0000: 551206.779: >>>> [GC2016-10-27T04:32:09.091+0000: 551206.780: [ParNew: >>>> 1799857K->42169K(1887488K), 0.0688910 secs] >>>> 50205844K->48541030K(84724992K), 0.0696110 secs] [Times: user=1.70 >>>> sys=0.03, real=0.07 secs] >>>> 345780-2016-10-27T04:32:09.802+0000: 551207.491: >>>> [GC2016-10-27T04:32:09.802+0000: 551207.491: [ParNew: >>>> 1719993K->43790K(1887488K), 0.0508540 secs] >>>> 50218854K->48542651K(84724992K), 0.0516000 secs] [Times: user=1.54 >>>> sys=0.03, real=0.05 secs] >>>> 345781-2016-10-27T04:32:10.536+0000: 551208.226: >>>> [GC2016-10-27T04:32:10.537+0000: 551208.226: [ParNew: >>>> 1721614K->30389K(1887488K), 0.0668100 secs] >>>> 50220475K->48545932K(84724992K), 0.0675470 secs] [Times: user=1.81 >>>> sys=0.03, real=0.06 secs] >>>> 345782-2016-10-27T04:32:11.137+0000: 551208.826: >>>> [GC2016-10-27T04:32:11.137+0000: 551208.826: [ParNew: >>>> 1708213K->18631K(1887488K), 0.0632570 secs] >>>> 50223756K->48540797K(84724992K), 0.0639650 secs] [Times: user=1.95 >>>> sys=0.01, real=0.06 secs] >>>> 345783-2016-10-27T04:32:11.642+0000: 551209.332: >>>> [GC2016-10-27T04:32:11.643+0000: 551209.332: [ParNew: >>>> 1696455K->19415K(1887488K), 0.0509260 secs] >>>> 50218621K->48545033K(84724992K), 0.0516780 secs] [Times: user=1.55 >>>> sys=0.03, real=0.05 secs] >>>> >>>> Since the user times are high as well, I don't think this could be >>>> swapping. >>> >>> Can you ask the customer if they're using transparent hugepages (THP)? >>>> >>>> Here are the hard-earned set of JVM arguments that we're using: >>>> >>>> -d64 -server -Xmx81g -Xms81g -XX:MaxNewSize=2g \ >>>> -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled \ >>>> -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled \ >>>> -XX:+CMSScavengeBeforeRemark -XX:RefDiscoveryPolicy=1 \ >>>> -XX:ParallelCMSThreads=12 -XX:CMSMaxAbortablePrecleanTime=3600000 \ >>>> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseParNewGC >>>> -XX:+UseBiasedLocking \ >>>> -XX:MaxTenuringThreshold=2 -XX:+UseCompressedOops -XX:PermSize=256M \ >>>> -XX:MaxPermSize=256M -XX:+HeapDumpOnOutOfMemoryError \ >>>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseMembar -XX:+UseLargePages \ >>>> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintCommandLineFlags \ >>>> -XX:+UseGCLogFileRotation \ >>>> -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=100m \ >>>> -Xloggc:${INSTANCE_ROOT}/logs/jvm/gc.log >>>> >>>> This is on Linux with Java 1.7.0_72. >>>> >>>> Does this look familiar to anyone? Alternatively, are there some more JVM >>>> options that we could include to get more information? >>>> >>>> One of the first things that we'll try is to move to a later JVM, but it >>>> will be easier to get the customer to do that if we can point to a >>>> specific issue that has been addressed. >>>> >>>> Thanks for your help. >>>> >>>> David >>> >>> >>> -- >>> Sent from my phone >> > > > -- > Sent from my phone > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
_______________________________________________ hotspot-gc-use mailing list hotspot-gc-use@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use