My total heap size is 12G and my young gen is 1G. I am getting 200ms pauses every few seconds - the machine has 4 cores (its m1.xlarge on ec2), if I set the young gen low. The one thing I noted today was that the pauses are spaced @10 seconds until a minor compaction kicks in - and then then the pauses happen every 3-4 seconds. I found this to be highly correlated on a region server. These minor compactions are not huge - they are typically compacting 3 files - upto 200M in total size. They are lasting for 20-30 seconds. Once they are over the GC pauses recover back to the 10 second frequency.
I have the following settings enabled (index.cacheonwrite and bloom.cacheonwrite) to cache the index/bloom blocks upon hfile writes though i find it unlikely that they could be impacting my setup. On Thu, Jan 24, 2013 at 11:46 PM, Jack Levin <[email protected]> wrote: > Generally, the larger the flush to harder the GC will work. Flush more > often to avoid this. What is your total heap size set at? > On Jan 24, 2013 9:02 PM, "Varun Sharma" <[email protected]> wrote: > > > I do have significant block cache churn and this issue is typical > > correlated with a huge increase in read latencies - could that be the > > reason for this - mslab should be taking care of the memstore related > heap > > fragmentation ? Has anyone seen issues with block cache churn ? > > > > On Thu, Jan 24, 2013 at 6:30 PM, Varun Sharma <[email protected]> > wrote: > > > > > I am curious how reducing the memstore size would help - I have 6 > regions > > > and 3G total for memstore - so I would like max out on that by having a > > > bigger flush size per region. Are you asking me to have more regions > and > > > smaller memstore flush size instead ? How is that likely to help > > > > > > > > > On Thu, Jan 24, 2013 at 6:07 PM, 谢良 <[email protected]> wrote: > > > > > >> Hi Varun, > > >> > > >> Please note if you try to increase new generation size, then the > > "ParNew" > > >> time will be up accordingly, and CMS YGC is also a STW. > > >> could you have a try to reduce memstore size to a smaller value, e.g. > > >> 128m or 256m ? > > >> > > >> Regards, > > >> Liang > > >> ________________________________________ > > >> 发件人: Varun Sharma [[email protected]] > > >> 发送时间: 2013年1月25日 1:40 > > >> 收件人: [email protected] > > >> 主题: GC pause issues > > >> > > >> Hi, > > >> > > >> I have a region server which has the following logs. As you can see > from > > >> the log, ParNew is sufficiently big (450M) and there are heavy writes > > >> going > > >> in. I am seeing 200ms pauses which eventually build up and there is a > > >> promotion failure. There is a parnew collection every 2-3 seconds so > it > > >> fills up real fast. My memstore size is bigger 512m for flushes and 4 > > >> regions per server. (overall size is 3G for all memstores) - I have > > mslab > > >> enabled > > >> > > >> 2013-01-24T13:08:16.870+0000: 63533.964: [GC 63533.964: [ParNew: > > >> 471841K->52416K(471872K), 0.2251880 secs] > 8733008K->8445039K(12727104K), > > >> 0.2254100 secs] [Times: user=0.50 sys=0.18, real=0.22 secs] > > >> 2013-01-24T13:08:19.546+0000: 63536.639: [GC 63536.639: [ParNew: > > >> 461593K->52416K(471872K), 0.2812690 secs] > 8854216K->8557572K(12727104K), > > >> 0.2814870 secs] [Times: user=0.66 sys=0.09, real=0.29 secs] > > >> 013-01-24T13:08:21.824+0000: 63538.917: [GC 63538.918: [ParNew: > > >> 442836K->52416K(471872K), 0.2781490 secs] > 8947992K->8705355K(12727104K), > > >> 0.2783810 secs] [Times: user=0.58 sys=0.14, real=0.28 secs] > > >> 2013-01-24T13:08:22.122+0000: 63539.216: [GC [1 CMS-initial-mark: > > >> 8652939K(12255232K)] 8752914K(12727104K), 0.0365000 secs] [Times: > > >> user=0.02 > > >> sys=0.00, real=0.04 secs] > > >> 2013-01-24T13:08:22.159+0000: 63539.253: [CMS-concurrent-mark-start] > > >> 2013-01-24T13:08:24.953+0000: 63542.047: [GC 63542.047: [ParNew: > > >> 471872K->52251K(471872K), 0.1611970 secs] > 9124811K->8831437K(12727104K), > > >> 0.1614180 secs] [Times: user=0.37 sys=0.19, real=0.16 secs] > > >> 2013-01-24T13:08:26.434+0000: 63543.527: [CMS-concurrent-mark: > > 4.105/4.268 > > >> secs] [Times: user=5.36 sys=0.34, real=4.27 secs] > > >> 2013-01-24T13:08:26.434+0000: 63543.527: > [CMS-concurrent-preclean-start] > > >> 2013-01-24T13:08:26.597+0000: 63543.691: [CMS-concurrent-preclean: > > >> 0.133/0.163 secs] [Times: user=0.16 sys=0.05, real=0.17 secs] > > >> 2013-01-24T13:08:26.597+0000: 63543.691: > > >> [CMS-concurrent-abortable-preclean-start] > > >> 2013-01-24T13:08:27.401+0000: 63544.495: > > >> [CMS-concurrent-abortable-preclean: 0.792/0.804 secs] [Times: > user=1.46 > > >> sys=0.16, real=0.80 secs] > > >> 2013-01-24T13:08:27.403+0000: 63544.496: [GC[YG occupancy: 274458 K > > >> (471872 > > >> K)]63544.496: [Rescan (parallel) , 0.0540730 secs]63544.551: [weak > refs > > >> processing, 0.0001700 secs] [1 CMS-remark: 8779186K(12255232K)] > > >> 9053645K(12727104K), 0.0544410 secs] [Times: user=0.20 sys=0.01, > > real=0.06 > > >> secs] > > >> 2013-01-24T13:08:27.458+0000: 63544.551: [CMS-concurrent-sweep-start] > > >> 2013-01-24T13:08:27.955+0000: 63545.048: [GC 63545.049: [ParNew: > > >> 471707K->44566K(471872K), 0.1371770 secs] > 9044714K->8701862K(12727104K), > > >> 0.1374060 secs] [Times: user=0.35 sys=0.12, real=0.14 secs] > > >> 2013-01-24T13:08:29.285+0000: 63546.378: [GC 63546.478: [ParNew: > > >> 445714K->52416K(471872K), 0.5626120 secs] > 8648805K->8410223K(12727104K), > > >> 0.5628610 secs] [Times: user=0.91 sys=0.08, real=0.66 secs] > > >> 2013-01-24T13:08:32.308+0000: 63549.401: [GC 63549.402: [ParNew: > > >> 471872K->52416K(471872K), 0.2300560 secs] > 8247804K->7976043K(12727104K), > > >> 0.2302900 secs] [Times: user=0.62 sys=0.17, real=0.23 secs] > > >> 2013-01-24T13:08:34.844+0000: 63551.938: [GC 63551.938: [ParNew > > (promotion > > >> failed): 471872K->471872K(471872K), 0.2788500 secs]63552.217: > > >> [CMS2013-01-24T13:08:37.256+0000: 63554.349: [CMS-concurrent-sweep: > > >> 8.473/9.798 secs] [*Times: user=15.26 sys=1.11, real=9.80 secs*] > > >> > > >> What might be advised here - should I up the size to 1G for the new > > >> generation ? > > >> > > >> Thanks > > >> Varun > > >> > > > > > > > > >
