[Repost... this time from my chromium account... so it is posted more
broadly]

---------- Forwarded message ----------
Date: Wed, Sep 30, 2009 at 10:39 AM
Subject: [Memory] in TCMalloc, more careful handling of VirtualAlloc commit
via SystemAlloc
To: Mike Belshe <[email protected]>, Anton Muhin <[email protected]>, James
Robinson <[email protected]>
Cc: Chromium-dev <[email protected]>


If you're not interested in TCMalloc customization for Chromium, you should
stop reading now.
This post is meant to gather some discussion on a topic before I code and
land a change.

MOTIVATION

We believe poor memory utilization is at the heart of a lot of jank
problems.  Such problems may be difficult to repro in short controlled
benchmarks, but our users are telling us we have problems, so we know we
have problems.  As a result, we need to be more conservative in memory
utilization and handling.

SUMMARY OF CHANGE

I'm thinking of changing our TCMalloc so that when a span is freed into
TCMalloc's free list and gets coalesced with an adjacent span that is
already decommitted, the coalesced span is entirely *decommitted* (as
opposed to our current customized behavior of *committing* the entire
span).

This proposed policy was put in place previously by Mike, but (reportedly)
caused a 3-5% perf regression in V8.  I believe AntonM changed that policy
to what we have currently, where we always ensure full commitment of a
coalesced span (regaining V8 performance on a benchmark).

WHY CHANGE?

The problematic scenario I'm anticipating (and that may currently be
burning us) is:

a) A (renderer) process allocates a lot of memory, and achieves a
significant high water mark of memory used.

b) The process deallocates a lot of memory, and it flows into the TCMalloc
free list. [We still have a lot of memory attributed to that process, and
the app as a whole shows as using that memory.]

c) We eventually decide to decommit a lot of our free memory.  Currently
this happens when we switch away from a tab. [This saves us from further
swapping out the unused memory].

Now comes the evil problem.

d) We return to the tab, which has a giant free list of spans, most of which
are decommitted.  [The good news is that the memory is still decommitted.]

e) We allocate a block of memory, such as a 32k chunk.  This memory is pulled
from a decommitted span, and ONLY the allocated chunk is committed. [That
sounds good.]

f) We free the block of memory from (e).  Whatever span is adjacent to that
block is committed <potential oops>.  Hence, if we took (e) from a 200Meg
span, the act of freeing (e) will cause a 200Meg commitment!?!  This in turn
would not only require touching (and having VirtualAlloc clear to zero) all
allocated memory in the large span, it would also immediately put memory
pressure on the OS, and force as much as 200Megs of other apps to be swapped
out to disk :-(.
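
To make steps (e) and (f) concrete, here is a toy model of the coalescing
step under the two policies.  It is only a sketch and not TCMalloc code: the
Span class, the coalesce() function and the policy names are made up for
illustration, and it only counts bytes rather than calling VirtualAlloc.

```python
from dataclasses import dataclass

KIB = 1024
MIB = 1024 * KIB


@dataclass
class Span:
    """Hypothetical free span: wholly committed or wholly decommitted."""
    size: int
    committed: bool


def coalesce(freed, neighbor, policy):
    """Merge two adjacent free spans.

    Returns (merged_span, bytes_newly_committed, bytes_newly_decommitted).
    policy is "commit_all" (current behavior) or "decommit_all" (proposal).
    """
    merged_size = freed.size + neighbor.size
    if freed.committed == neighbor.committed:
        # Same state on both sides: nothing to commit or decommit.
        return Span(merged_size, freed.committed), 0, 0
    if policy == "commit_all":
        newly = sum(s.size for s in (freed, neighbor) if not s.committed)
        return Span(merged_size, True), newly, 0
    if policy == "decommit_all":
        released = sum(s.size for s in (freed, neighbor) if s.committed)
        return Span(merged_size, False), 0, released
    raise ValueError("unknown policy: " + policy)


# Step (e): a 32k chunk was carved out of a 200Meg decommitted span.
chunk = Span(32 * KIB, committed=True)
rest = Span(200 * MIB - 32 * KIB, committed=False)

# Step (f): freeing the chunk coalesces it with the decommitted remainder.
_, committed_now, _ = coalesce(chunk, rest, "commit_all")
_, _, decommitted_now = coalesce(chunk, rest, "decommit_all")

print("commit_all   commits  ", committed_now, "bytes")
print("decommit_all decommits", decommitted_now, "bytes")
```

In this model the current policy commits the whole decommitted remainder of
the span on a single free, while the proposed policy only gives back the 32k
chunk that was just released.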

I'm wary that our recent fix, which allows spans to be (correctly) coalesced
independent of their size, will make it easier to coalesce spans.
Worse yet, as we proceed to further optimize TCMalloc, one measure of
success will be that the list of spans will be fragmented less and less, and
we'll have larger and larger coalesced singular spans.  Any large "reserved"
but not "committed" span will be a jank time-bomb waiting to blow up if the
process ever allocates/frees from such a large span :-(.


WHAT IS THE PLAN GOING FORWARD (or how can we do better, and regain
performance, etc.)

We have at least the following plausible alternative ways to move forward
with TCMalloc.  The overall goal is to avoid wasteful decommits, and at the
same time avoid heap-wide flailing between minimal and maximal span
commitment states.

Each free-span is currently the maximal contiguous region of memory that
TCMalloc is controlling, but has been deallocated.  Currently spans have to
be totally committed, or totally decommitted.  There is no mixture
supported.

a) We could re-architect the span handling to allow spans to be combinations
of committed and decommitted regions.

b) We could vary our policy on what to do with a coalesced span, based on
span size and memory pressure.  For example: We can consistently monitor the
in-use vs free (but committed) ratio.  We can try to stay in some
"acceptable" region by varying our policy.

c) We could actually return to the OS some portions of spans that we have
decommitted.  We could then let the OS give us back these regions if we need
memory.  Until we get them back, we would not be at risk of doing
unnecessary commits.  Decisions about when to return to the OS can be made
based on span size and memory pressure.

d) We can change the interval and forcing function for decommitting spans
that are in our free list.
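
As one possible shape for option (b), the decision could be a small function
of span size and the free-but-committed ratio.  This is a sketch under
assumptions: choose_policy(), its parameters and both thresholds are
hypothetical and untuned, and nothing like it exists in TCMalloc today.

```python
KIB = 1024
MIB = 1024 * KIB


def choose_policy(decommitted_bytes, in_use_bytes, free_committed_bytes,
                  max_free_ratio=0.25, large_span_bytes=1 * MIB):
    """Pick what to do with a coalesced span that is partly decommitted.

    decommitted_bytes    -- bytes we would have to commit under "commit_all"
    in_use_bytes         -- bytes currently handed out to the application
    free_committed_bytes -- bytes sitting in the free list but still committed
    max_free_ratio       -- hypothetical cap on free-committed / total committed
    large_span_bytes     -- hypothetical size above which we never commit eagerly
    """
    # Never eagerly commit a large region just because a neighbor was freed.
    if decommitted_bytes >= large_span_bytes:
        return "decommit_all"
    # Otherwise commit only if we stay inside the "acceptable" region.
    projected_free = free_committed_bytes + decommitted_bytes
    total_committed = in_use_bytes + projected_free
    if total_committed > 0 and projected_free / total_committed > max_free_ratio:
        return "decommit_all"
    return "commit_all"


print(choose_policy(200 * MIB, 500 * MIB, 0))
print(choose_policy(64 * KIB, 100 * MIB, 1 * MIB))
print(choose_policy(512 * KIB, 1 * MIB, 1 * MIB))
```

The idea is that small merges keep the V8-friendly behavior of committing,
while large spans or an already bloated free list fall back to decommitting.
Real thresholds would have to come from the benchmark data discussed below.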

In each of the above cases, we need benchmark data on user-class machines to
show costs of these changes.  Until we understand the memory impact, we need
to move forward conservatively in our action, and be vigilant for thrashing
scenarios.


Comments??
