Now from the right account, sorry.

On Wed, Sep 30, 2009 at 9:48 PM, Anton Muhin wrote:
> On Wed, Sep 30, 2009 at 9:39 PM, Jim Roskind wrote:
>> If you're not interested in TCMalloc customization for Chromium, you should
>> stop reading now.
>> This post is meant to gather some discussion on a topic before I code and
>> land a change.
>> MOTIVATION
>> We believe poor memory utilization is at the heart of a lot of jank
>> problems. Such problems may be difficult to repro in short controlled
>> benchmarks, but our users are telling us we have problems, so we know we
>> have problems. As a result, we need to be more conservative in memory
>> utilization and handling.
>> SUMMARY OF CHANGE
>> I'm thinking of changing our TCMalloc so that when a span is freed into
>> TCMalloc's free list, and it gets coalesced with an adjacent span that is
>> already decommitted, the coalesced span is entirely decommitted
>> (as opposed to our current customized behavior of committing the entire
>> span).
>> This proposed policy was put in place previously by Mike, but (reportedly)
>> caused a 3-5% perf regression in V8. I believe AntonM changed that policy
>> to what we have currently, where we always ensure full commitment of a
>> coalesced span (regaining V8 performance on a benchmark).
>
> The immediate question and plea. Question: how can we estimate the
> performance implications of the change? Yes, we have some internal
> benchmarks which could be used for that (they release memory heavily).
> Anything else?
>
> Plea: please, do not regress DOM performance unless there are really
> compelling reasons. And even in this case :)
>
>> WHY CHANGE?
>> The problematic scenario I'm anticipating (and that may currently be
>> burning us) is:
>> a) A (renderer) process allocates a lot of memory, and achieves a
>> significant high water mark of memory used.
>> b) The process deallocates a lot of memory, and it flows into the TCMalloc
>> free list.
>> [We still have a lot of memory attributed to that process, and
>> the app as a whole shows as using that memory.]
>> c) We eventually decide to decommit a lot of our free memory. Currently
>> this happens when we switch away from a tab. [This saves us from further
>> swapping out the unused memory.]
>> Now comes the evil problem.
>> d) We return to the tab which has a giant free list of spans, most of which
>> are decommitted. [The good news is that the memory is still decommitted.]
>> e) We allocate a block of memory, such as a 32k chunk. This memory is pulled
>> from a decommitted span, and ONLY the allocated chunk is committed. [That
>> sounds good.]
>> f) We free the block of memory from (e). Whatever span is adjacent to that
>> block is committed <potential oops>. Hence, if we took (e) from a 200Meg
>> span, the act of freeing (e) will cause a 200Meg commitment!?! This in turn
>> would not only require touching (and having VirtualAlloc clear to zero) all
>> allocated memory in the large span, it will also immediately put memory
>> pressure on the OS, and force as much as 200Megs of other apps to be swapped
>> out to disk :-(.
>
> I'm not sure about swapping unless you touch those now committed
> pages, but only experiment will tell.
>
>> I'm wary that our recent fix that allows spans to be (correctly) coalesced
>> independent of their size will make it easier to coalesce spans.
>> Worse yet, as we proceed to further optimize TCMalloc, one measure of
>> success will be that the list of spans will be fragmented less and less, and
>> we'll have larger and larger coalesced singular spans. Any large "reserved"
>> but not "committed" span will be a jank time-bomb waiting to blow up if the
>> process ever allocates/frees from such a large span :-(.
>>
>> WHAT IS THE PLAN GOING FORWARD (or how can we do better, and regain
>> performance, etc.)
>> We have at least the following plausible alternative ways to move forward
>> with TCMalloc.
>> The overall goal is to avoid wasteful decommits, and at the
>> same time avoid heap-wide flailing between minimal and maximal span
>> commitment states.
>> Each free span is currently the maximal contiguous region of memory that
>> TCMalloc controls but that has been deallocated. Currently spans have to
>> be totally committed, or totally decommitted. There is no mixture
>> supported.
>> a) We could re-architect the span handling to allow spans to be combinations
>> of committed and decommitted regions.
>> b) We could vary our policy on what to do with a coalesced span, based on
>> span size and memory pressure. For example: We can consistently monitor the
>> in-use vs free (but committed) ratio. We can try to stay in some
>> "acceptable" region by varying our policy.
>> c) We could actually return to the OS some portions of spans that we have
>> decommitted. We could then let the OS give us back these regions if we need
>> memory. Until we get them back, we would not be at risk of doing
>> unnecessary commits. Decisions about when to return to the OS can be made
>> based on span size and memory pressure.
>> d) We can change the interval and forcing function for decommitting spans
>> that are in our free list.
>> In each of the above cases, we need benchmark data on user-class machines to
>> show the costs of these changes. Until we understand the memory impact, we
>> need to move forward conservatively in our actions, and be vigilant for
>> thrashing scenarios.
>>
>> Comments??
>
> As a closely related attempt, you may want to have a look at
> http://codereview.chromium.org/256013/show
>
> That allows spans with a mix of committed/decommitted pages (but only
> in the returned list), as committing seems to work fine if some pages are
> already committed.
>
> That has some minor performance benefit, but I didn't investigate it
> in detail yet.
>
> just my 2 cents,
> anton.
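
To make the difference between the two coalescing policies discussed above concrete, here is a minimal toy model of steps (e) and (f). It is only a sketch under stated assumptions, not TCMalloc's actual code: `Span`, `CoalescePolicy`, `CoalesceResult`, and `Coalesce` are hypothetical names invented for illustration, and commitment is tracked as a byte count rather than through real VirtualAlloc calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy model: a free span is a size plus one commitment flag,
// mirroring the "totally committed or totally decommitted" rule in the thread.
struct Span {
  std::size_t bytes;
  bool committed;
};

// The two policies under discussion for a committed/decommitted merge.
enum class CoalescePolicy {
  kCommitAll,    // current behavior described in the thread
  kDecommitAll,  // proposed behavior
};

struct CoalesceResult {
  Span merged;
  std::size_t bytes_newly_committed;    // memory the merge forces us to commit
  std::size_t bytes_newly_decommitted;  // memory the merge gives up
};

// Merge a just-freed span with an adjacent free span under the given policy.
CoalesceResult Coalesce(const Span& freed, const Span& neighbor,
                        CoalescePolicy policy) {
  assert(freed.bytes > 0 && neighbor.bytes > 0);
  CoalesceResult r{};
  r.merged.bytes = freed.bytes + neighbor.bytes;

  // Same state on both sides: nothing to commit or decommit.
  if (freed.committed == neighbor.committed) {
    r.merged.committed = freed.committed;
    return r;
  }

  // Mixed state: the policy decides which side changes.
  const Span& committed_part = freed.committed ? freed : neighbor;
  const Span& decommitted_part = freed.committed ? neighbor : freed;
  if (policy == CoalescePolicy::kCommitAll) {
    r.merged.committed = true;
    r.bytes_newly_committed = decommitted_part.bytes;
  } else {
    r.merged.committed = false;
    r.bytes_newly_decommitted = committed_part.bytes;
  }
  return r;
}
```

In this model, freeing a committed 32k chunk next to the decommitted remainder of a 200Meg span costs roughly 200Meg of new commitment under `kCommitAll`, while `kDecommitAll` only gives up the 32k chunk. The sketch says nothing about the V8 cost of re-committing memory later, which is the other half of the trade-off raised in the thread.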
