[Repost... this time from my chromium account... so it is posted more broadly]
---------- Forwarded message ----------
Date: Wed, Sep 30, 2009 at 10:39 AM
Subject: [Memory] in TCMalloc, more careful handling of VirtualAlloc commit via SystemAlloc
To: Mike Belshe <[email protected]>, Anton Muhin <[email protected]>, James Robinson <[email protected]>
Cc: Chromium-dev <[email protected]>

If you're not interested in TCMalloc customization for Chromium, you should stop reading now.

This post is meant to gather some discussion on a topic before I code and land a change.

MOTIVATION

We believe poor memory utilization is at the heart of a lot of jank problems. Such problems may be difficult to repro in short controlled benchmarks, but our users are telling us we have problems, so we know we have problems. As a result, we need to be more conservative in memory utilization and handling.

SUMMARY OF CHANGE

I'm thinking of changing our TCMalloc so that when a span is freed into TCMalloc's free list and gets coalesced with an adjacent span that is already decommitted, the coalesced span is entirely *decommitted* (as opposed to our current customized behavior of *committing* the entire span).

This proposed policy was put in place previously by Mike, but (reportedly) caused a 3-5% perf regression in V8. I believe AntonM changed that policy to what we have currently, where we always ensure full commitment of a coalesced span (regaining V8 performance on a benchmark).

WHY CHANGE?

The problematic scenario I'm anticipating (and which may currently be burning us) is:

a) A (renderer) process allocates a lot of memory, and achieves a significant high water mark of memory used.
b) The process deallocates a lot of memory, and it flows into the TCMalloc free list. [We still have a lot of memory attributed to that process, and the app as a whole shows as using that memory.]
c) We eventually decide to decommit a lot of our free memory. Currently this happens when we switch away from a tab. [This saves us from further swapping out the unused memory].
Now comes the evil problem.

d) We return to the tab, which has a giant free list of spans, most of which are decommitted. [The good news is that the memory is still decommitted]
e) We allocate a block of memory, such as a 32k chunk. This memory is pulled from a decommitted span, and ONLY the allocated chunk is committed. [That sounds good]
f) We free the block of memory from (e). Whatever span is adjacent to that block is committed <potential oops>.

Hence, if we took (e) from a 200 MB span, the act of freeing (e) will cause a 200 MB commitment!?! This in turn would not only require touching (and having VirtualAlloc clear to zero) all allocated memory in the large span, it will also immediately put memory pressure on the OS, and force as much as 200 MB of other apps to be swapped out to disk :-(.

I'm wary that our recent fix, which allows spans to be (correctly) coalesced independent of their size, will make it easier to coalesce spans. Worse yet, as we proceed to further optimize TCMalloc, one measure of success will be that the list of spans will be fragmented less and less, and we'll have larger and larger coalesced singular spans. Any large "reserved" but not "committed" span will be a jank time-bomb waiting to blow up if the process ever allocates/frees from such a large span :-(.

WHAT IS THE PLAN GOING FORWARD (or how can we do better, and regain performance, etc.)

We have at least the following plausible alternative ways to move forward with TCMalloc. The overall goal is to avoid wasteful decommits, and at the same time avoid heap-wide flailing between minimal and maximal span commitment states.

Each free span is currently the maximal contiguous region of memory that TCMalloc is controlling but that has been deallocated. Currently spans have to be totally committed or totally decommitted. There is no mixture supported.

a) We could re-architect the span handling to allow spans to be combinations of committed and decommitted regions.
b) We could vary our policy on what to do with a coalesced span, based on span size and memory pressure. For example: we can continuously monitor the in-use vs free (but committed) ratio, and try to stay in some "acceptable" region by varying our policy.
c) We could actually return to the OS some portions of spans that we have decommitted. We could then let the OS give us back these regions if we need memory. Until we get them back, we would not be at risk of doing unnecessary commits. Decisions about when to return to the OS can be made based on span size and memory pressure.
d) We can change the interval and forcing function for decommitting spans that are in our free list.

In each of the above cases, we need benchmark data on user-class machines to show the costs of these changes. Until we understand the memory impact, we need to move forward conservatively in our actions, and be vigilant for thrashing scenarios.

Comments??
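
To make the cost of steps (d)-(f) concrete, here is a toy Python model of the two coalescing policies. It is a sketch only, not TCMalloc code: FreeSpan, coalesce, pick_policy and scenario are hypothetical names made up for illustration, and the 1 MB / 0.25 thresholds in pick_policy are arbitrary placeholders for the kind of rule alternative (b) might use. It assumes, as described above, that a free span is either fully committed or fully decommitted.

```python
from dataclasses import dataclass

KB = 1024
MB = 1024 * KB


@dataclass
class FreeSpan:
    start: int       # byte offset inside the reserved region
    length: int      # size in bytes
    committed: bool  # all-or-nothing, as described in the post


def coalesce(freed, neighbor, policy):
    """Merge a just-freed (committed) span with an adjacent free span.

    Returns (merged span, bytes newly committed, bytes newly decommitted).
    """
    if not freed.committed:
        raise ValueError("a span that was just in use must be committed")
    adjacent = (freed.start + freed.length == neighbor.start
                or neighbor.start + neighbor.length == freed.start)
    if not adjacent:
        raise ValueError("spans are not adjacent")
    start = min(freed.start, neighbor.start)
    length = freed.length + neighbor.length
    if neighbor.committed:
        # Both halves already committed: nothing to pay either way.
        return FreeSpan(start, length, True), 0, 0
    if policy == "commit":      # current behavior: commit the whole merged span
        return FreeSpan(start, length, True), neighbor.length, 0
    if policy == "decommit":    # proposed behavior: decommit the whole merged span
        return FreeSpan(start, length, False), 0, freed.length
    raise ValueError(f"unknown policy: {policy}")


def pick_policy(neighbor_bytes, free_committed, in_use,
                small_span=1 * MB, max_ratio=0.25):
    """Hypothetical rule for alternative (b): only commit cheap merges,
    and only while free-but-committed memory is a small share of in-use."""
    under_pressure = free_committed > max_ratio * max(in_use, 1)
    if neighbor_bytes <= small_span and not under_pressure:
        return "commit"
    return "decommit"


def scenario(policy):
    """Steps (d)-(f): carve 32 KB out of a 200 MB decommitted span, then free it."""
    chunk = FreeSpan(0, 32 * KB, True)                    # (e) only the chunk is committed
    rest = FreeSpan(32 * KB, 200 * MB - 32 * KB, False)   # remainder stays decommitted
    return coalesce(chunk, rest, policy)                  # (f) free and coalesce


if __name__ == "__main__":
    for policy in ("commit", "decommit"):
        merged, committed, decommitted = scenario(policy)
        print(policy, "-> merged span committed:", merged.committed,
              "| newly committed bytes:", committed,
              "| newly decommitted bytes:", decommitted)
```

Under the "commit" policy the single free in (f) commits everything in the span except the 32 KB chunk (just under 200 MB), while under "decommit" it only gives back the 32 KB chunk. What the model leaves out is exactly what the benchmarks need to tell us: how often the decommitted chunk is immediately recommitted by the next allocation, which is presumably where the V8 regression came from.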
