The heap seems a _bit_ bigger than perhaps it should be, but that just gives the CMS more work to do. The hidden problem might be heap fragmentation: since Java uses lots of small objects, we might end up with an ever-growing heap.
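One way to check whether fragmentation is actually the culprit (a sketch; I'm going from the HotSpot diagnostic flag here, not from output I have in hand) is to dump the CMS free-list statistics around each collection and watch whether the free chunks get smaller and more numerous over time:

  -XX:PrintFLSStatistics=1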
Here is a GC log during the CMS concurrent sweep (when it is reclaiming space from the main heap):
42374.524: [CMS-concurrent-sweep-start]
42374.576: [GC 42374.576: [ParNew: 5560K->576K(5568K), 0.0115800 secs] 2709982K->2706003K(2947416K), 0.0117020 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
42374.616: [GC 42374.616: [ParNew: 5568K->576K(5568K), 0.0147850 secs] 2703025K->2701054K(2947416K), 0.0148980 secs] [Times: user=0.08 sys=0.00, real=0.01 secs]
42374.685: [GC 42374.685: [ParNew: 5568K->574K(5568K), 0.0104920 secs] 2690723K->2687015K(2947416K), 0.0106060 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
42374.751: [GC 42374.752: [ParNew: 5566K->574K(5568K), 0.0118220 secs] 2672236K->2668439K(2947416K), 0.0119340 secs] [Times: user=0.08 sys=0.00, real=0.01 secs]
42375.483: [GC 42375.483: [ParNew: 5508K->575K(5568K), 0.0090170 secs] 2329262K->2325313K(2947416K), 0.0091370 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
42375.614: [GC 42375.615: [ParNew: 5504K->467K(5568K), 0.0071550 secs] 2275783K->2271206K(2947416K), 0.0072510 secs] [Times: user=0.06 sys=0.00, real=0.00 secs]
42375.746: [GC 42375.746: [ParNew: 5451K->428K(5568K), 0.0064950 secs] 2210537K->2205514K(2947416K), 0.0066040 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
42375.885: [GC 42375.885: [ParNew: 5420K->362K(5568K), 0.0080160 secs] 2149436K->2144384K(2947416K), 0.0081120 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
42376.006: [GC 42376.006: [ParNew: 5354K->548K(5568K), 0.0064720 secs] 2097343K->2092544K(2947416K), 0.0065760 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
42376.120: [GC 42376.120: [ParNew: 5540K->576K(5568K), 0.0089420 secs] 2044984K->2040242K(2947416K), 0.0090420 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
42376.336: [GC 42376.336: [ParNew: 5568K->576K(5568K), 0.0092920 secs] 1948100K->1943594K(2947416K), 0.0093910 secs] [Times: user=0.06 sys=0.01, real=0.01 secs]
42376.885: [GC 42376.885: [ParNew: 5540K->576K(5568K), 0.0119010 secs] 1677094K->1673092K(2947416K), 0.0120230 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
42377.328: [CMS-concurrent-sweep: 2.679/2.804 secs] [Times: user=4.75 sys=0.10, real=2.80 secs]
Basically we go from about 2700 MB to 1670 MB, reclaiming roughly 1030 MB. The sweep took about 2.8 s, and there were 12 minor collections during that time, with the occupied heap shrinking between each one as the sweep reclaimed space concurrently. The ParNew pauses are higher than I want: around 10 ms each, up to a high of about 15 ms.
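For reference, logging like the above comes from flags along these lines; the exact set on this cluster isn't quoted anywhere in the thread, so treat it as a sketch:

  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

In each entry, "ParNew: 5560K->576K(5568K)" is the young generation before->after(capacity), and "2709982K->2706003K(2947416K)" is the whole heap before->after(capacity).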
My next thought/experiment is to move to a slightly larger ParNew, maybe 12 MB, to keep more garbage from being promoted into the main heap.
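For HBase that change would presumably land in conf/hbase-env.sh via HBASE_OPTS; a sketch, where the -Xmx value is an assumption rather than something quoted in this thread:

  # conf/hbase-env.sh (sketch; -Xmx assumed, not our confirmed setting)
  export HBASE_OPTS="-Xmx3000m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
      -XX:NewSize=12m -XX:MaxNewSize=12m"

Setting NewSize and MaxNewSize equal pins ParNew so it can't grow back toward 150 MB on its own.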
However, HBase won't take a rolling restart - as soon as you touch the
.META. regionserver, you're screwed.
On Sat, Apr 25, 2009 at 1:05 PM, Chad Walters <[email protected]> wrote:
Cool -- that seems like it could be a reasonable set of tradeoffs for latency-sensitive uses.
How big is the memory increase?
Do you know the chief source of the allocations that are GC'd?
Chad
On 4/25/09 12:47 PM, "Ryan Rawson" <[email protected]> wrote:
The incremental GC doesn't help on a multicore system; it's designed to break up the giant concurrent phase (which I've seen last 5 minutes) on systems with 1 or 2 CPUs, to prevent the concurrent GC from starving the app of CPU. My machines have 8 cores, so this won't really help.
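For reference, that incremental mode is the one enabled with flags like the following (optionally paced); if I'm reading it right, the icms_dc=0 field in the 0.20 log quoted below is its duty-cycle readout:

  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing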
I talked to a friend, and I realized the problem is that ParNew is growing too big... It starts out at about 15 MB and grows 10x to 150 MB. My friend suggested that I size ParNew to the L2 cache size, which is 6 MB on my Intel Xeon CPUs (I'm not sure whether the L2 is shared between pairs of cores or not).
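In flag form, that experiment is just pinning the young generation at the L2 size; a sketch (only the 6 MB figure comes from the suggestion above, the rest is assumed):

  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewSize=6m -XX:MaxNewSize=6m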
Switching to a much smaller ParNew has several effects:
- GC pauses are tiny, around 1 ms
- GC of ParNew happens 1-10 times a second
- Overall memory use goes up, as more stuff enters the general heap and has to wait for the CMS to reclaim it. I watched the CMS prune 1 GB of garbage in 3 seconds.
However, the crippling pauses that were keeping my cluster from performing are gone, at the cost of higher RAM usage.
On Sat, Apr 25, 2009 at 10:09 AM, Chad Walters <[email protected]> wrote:
Done any experimentation with the incremental GC?
Chad
On 4/23/09 6:28 PM, "Ryan Rawson" <[email protected]> wrote:
From a log of HBase 0.20 in action:
11061.089: [GC 11061.089: [ParNew: 153331K->17024K(153344K), 0.2249580 secs] 3396126K->3309484K(4579960K) icms_dc=0 , 0.2251020 secs] [Times: user=0.66 sys=0.04, real=0.23 secs]
11062.439: [GC 11062.439: [ParNew: 153344K->17024K(153344K), 0.1625480 secs] 3445804K->3335150K(4579960K) icms_dc=0 , 0.1627020 secs] [Times: user=0.37 sys=0.02, real=0.17 secs]
Yes, in 1.4 seconds we had a 225 ms pause and a 162 ms pause. This is not atypical; most pauses are in the 80-150 ms range.
-ryan