Just an update.  The guy to blame was CreateDIBSection used in
skia/ext/bitmap_platform_win.cc

According to http://msdn.microsoft.com/en-us/library/dd183494(VS.85).aspx :

If hSection is NULL, the system allocates memory for the DIB. In this
case, the CreateDIBSection function ignores the dwOffset parameter. An
application cannot later obtain a handle to this memory. The
dshSection member of the DIBSECTION structure filled in by calling the
GetObject function will be NULL.

And it looks like this call doesn't go to VirtualAlloc (but I am not 100% sure).
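
For a sense of scale, the pixel storage behind such a DIB can be estimated from its dimensions alone. This is a minimal sketch, assuming an uncompressed bitmap whose scan lines are padded to a 4-byte boundary; the helper names are hypothetical and are not taken from skia or the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per scan line of an uncompressed DIB. Each row is padded up to a
// multiple of 4 bytes (a DWORD boundary).
inline std::uint64_t DibStrideBytes(std::uint32_t width,
                                    std::uint32_t bits_per_pixel) {
  assert(bits_per_pixel > 0);
  return ((static_cast<std::uint64_t>(width) * bits_per_pixel + 31) / 32) * 4;
}

// Total bytes of pixel data for a width x height bitmap. This covers only
// the pixel buffer, not any bookkeeping the system keeps for the section.
inline std::uint64_t DibPixelBytes(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bits_per_pixel) {
  return DibStrideBytes(width, bits_per_pixel) * height;
}
```

For example, DibPixelBytes(1920, 1200, 32) gives 9,216,000 bytes, so a handful of full-window 32 bpp bitmaps adds up to tens of megabytes that a VirtualAlloc-based counter would not see if the allocation indeed bypasses VirtualAlloc.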

yours,
anton.

On Fri, Nov 6, 2009 at 9:53 PM, Ricardo Vargas <rvar...@google.com> wrote:
> Not really. VADs are really small, not a page... and I've been using a
> kernel debugger with your code in a stand alone program :)
> Kernel debugger should be supported for Windows 7, it's just a little more
> complicated. I guess we can talk offline.
>
> On Fri, Nov 6, 2009 at 10:48 AM, Anton Muhin <ant...@chromium.org> wrote:
>>
>> Yep, Ricardo, I saw your spreadsheet.
>>
>> Actually I think I know what goes on: it is VAD allocation (and yes,
>> I found it in the Internals book :).
>>
>> Currently I'm trying to verify this hypothesis (there are some
>> troubles, as it looks like local kernel debugging is not supported on
>> Windows 7 and perfmon doesn't trace VAD size).  And I am mostly
>> concerned with whether we can control the amount of this overhead (and
>> ideally reduce it).
>>
>> Just FYI.  This issue is easy to reproduce w/ a standalone program.
>> My next goal is to emulate with a standalone program the big
>> difference between requested and committed bytes I observed.
>>
>> yours,
>> anton.
>>
>>
>> On Fri, Nov 6, 2009 at 9:35 PM, Ricardo Vargas <rvar...@chromium.org>
>> wrote:
>> > +jim
>> > We've been spending some time lately trying to answer the same question.
>> > The memory reported by task manager (or any other perf counter based
>> > tool) includes not only VirtualAlloc but any memory mapped section (DLLs
>> > and data files, plus unnamed sections), plus the n heaps managed by
>> > Windows (not everybody running in the browser uses tcmalloc). See the
>> > document that I sent a couple of weeks ago about DLLs and their heaps.
>> > kernel32!VirtualAlloc & Co. are just thin wrappers that invoke the
>> > kernel's memory management functions. For details of how Windows handles
>> > memory take a look at the Windows Internals book. I'm pretty sure that
>> > the differences that we are seeing are just a problem of accounting in
>> > our user mode code, not related to the internal memory handling of
>> > Windows.
>> >
>> > On Fri, Nov 6, 2009 at 5:50 AM, Anton Muhin <ant...@chromium.org> wrote:
>> >>
>> >> On Thu, Nov 5, 2009 at 11:06 PM, Ricardo Vargas <rvar...@chromium.org>
>> >> wrote:
>> >> >
>> >> >
>> >> > On Thu, Nov 5, 2009 at 7:32 AM, Anton Muhin <ant...@chromium.org>
>> >> > wrote:
>> >> >>
>> >> >> On Wed, Nov 4, 2009 at 3:39 AM, Ricardo Vargas <rvar...@google.com>
>> >> >> wrote:
>> >> >> > I don't see the post to chromium-dev so...
>> >> >>
>> >> >> Sorry, responding to chromium-dev.
>> >> >>
>> >> >> > Playing with your code the only delta that I see is one page (for
>> >> >> > each 2 MB). This ratio works well when the allocation size
>> >> >> > increases (and drops to zero for small allocations), and as far as
>> >> >> > I can see corresponds to the memory needed to keep an extra page
>> >> >> > table for the memory that you reserve (the cost of reserved memory
>> >> >> > is small, but not zero).
>> >> >>
>> >> >> Yes, I understand it now.  I wonder if you (or anyone else) knows
>> >> >> details like when it goes away (and why that happens for tcmalloc,
>> >> >> which doesn't free memory), whether there is a way to keep it smaller
>> >> >> or it is just proportional to the amount of reserved memory, and
>> >> >> whether it is the same for all Windows versions (XP, Vista, 7).
>> >> >
>> >> > To be fair, the page table is not supposed to be created just for
>> >> > reserved memory, only for committed, and in fact the page directory
>> >> > entry is empty until commit. But still, either the page is "reserved"
>> >> > (for lack of a better word) or the process is just charged for it.
>> >> > I don't see a reason for the system doing something different for
>> >> > tcmalloc. Are you sure that this test really represents the difference
>> >> > that you see with the actual browser? The amount of memory will always
>> >> > be a preset percentage of the requested memory, but it changes slightly
>> >> > with the CPU mode (32/PAE/64).
>> >>
>> >> I strongly suspect that.  I started to look into it when I noticed the
>> >> difference between committed size as reported by tcmalloc + V8 and one
>> >> reported by task manager.  Alas, I cannot be 100% sure I don't miss
>> >> some other consumer (but I ran the browser with breakpoint in
>> >> kernel32!VirtualAlloc and saw no other guy).
>> >>
>> >> Ricardo, any good pointers to where I can learn more about the
>> >> allocator behind VirtualAlloc/VirtualFree?
>> >>
>> >> thanks a lot and yours,
>> >> anton.
>> >>
>> >>
>> >> >>
>> >> >> > One caveat: initialize the message buffers before the first call to
>> >> >> > GetProcessMemoryInfo to avoid seeing a ws increase if the stack has
>> >> >> > to grow.
>> >> >>
>> >> >> Thanks a lot, very good point.
>> >> >>
>> >> >> yours,
>> >> >> anton.
>> >> >>
>> >> >> > On Tue, Nov 3, 2009 at 10:57 AM, Anton Muhin <ant...@chromium.org>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Resending to chromium-dev.
>> >> >> >>
>> >> >> >> yours,
>> >> >> >> anton.
>> >> >> >>
>> >> >> >> On Tue, Nov 3, 2009 at 9:43 PM, Anton Muhin <ant...@google.com>
>> >> >> >> wrote:
>> >> >> >> > Dear colleagues,
>> >> >> >> >
>> >> >> >> > It looks like the common idiom (in both tcmalloc and V8) of
>> >> >> >> > counting committed/allocated bytes by just adding up the
>> >> >> >> > requested sizes (modulo alignment) is not quite correct.
>> >> >> >> >
>> >> >> >> > At least on my Vista box the following code sometimes reports
>> >> >> >> > overcommitted bytes:
>> >> >> >> >
>> >> >> >> >  HANDLE ph = GetCurrentProcess();
>> >> >> >> >  PROCESS_MEMORY_COUNTERS pmc;
>> >> >> >> >  GetProcessMemoryInfo(ph, &pmc, sizeof(pmc));
>> >> >> >> >  size_t before = pmc.PagefileUsage;
>> >> >> >> >
>> >> >> >> >  int size = 1 << 21;
>> >> >> >> >  void* p0 = VirtualAlloc(0, size/2, MEM_RESERVE, PAGE_READWRITE);
>> >> >> >> >  void* p1 = VirtualAlloc(0, size/2, MEM_RESERVE, PAGE_READWRITE);
>> >> >> >> >
>> >> >> >> >  GetProcessMemoryInfo(ph, &pmc, sizeof(pmc));
>> >> >> >> >  size_t after = pmc.PagefileUsage;
>> >> >> >> >  int delta = after - before;
>> >> >> >> >
>> >> >> >> >  char msg[1024] = {0};
>> >> >> >> >  sprintf_s(msg, sizeof(msg), "delta: %d, size: %d, ratio: %.3f\n",
>> >> >> >> >            delta, size, double(size)/double(delta));
>> >> >> >> >  OutputDebugStringA(msg);
>> >> >> >> >
>> >> >> >> >  VirtualAlloc(p0, size/2, MEM_COMMIT, PAGE_READWRITE);
>> >> >> >> >  VirtualAlloc(p1, size/2, MEM_COMMIT, PAGE_READWRITE);
>> >> >> >> >  GetProcessMemoryInfo(ph, &pmc, sizeof(pmc));
>> >> >> >> >  delta = pmc.PagefileUsage - before;
>> >> >> >> >
>> >> >> >> >  char msg2[1024] = {0};
>> >> >> >> >  sprintf_s(msg2, sizeof(msg2),
>> >> >> >> >            "after commit delta: %d, size: %d, ratio: %.3f\n",
>> >> >> >> >            delta, size, double(size)/double(delta));
>> >> >> >> >  OutputDebugStringA(msg2);
>> >> >> >> >
>> >> >> >> >  VirtualFree(p0, 0, MEM_DECOMMIT);
>> >> >> >> >  VirtualFree(p1, 0, MEM_DECOMMIT);
>> >> >> >> >  GetProcessMemoryInfo(ph, &pmc, sizeof(pmc));
>> >> >> >> >  delta = pmc.PagefileUsage - before;
>> >> >> >> >
>> >> >> >> >  char msg3[1024] = {0};
>> >> >> >> >  sprintf_s(msg3, sizeof(msg3),
>> >> >> >> >            "after decommit delta: %d, size: %d, ratio: %.3f\n",
>> >> >> >> >            delta, size, double(size)/double(delta));
>> >> >> >> >  OutputDebugStringA(msg3);
>> >> >> >> >
>> >> >> >> >  VirtualFree(p0, 0, MEM_RELEASE);
>> >> >> >> >  VirtualFree(p1, 0, MEM_RELEASE);
>> >> >> >> >  GetProcessMemoryInfo(ph, &pmc, sizeof(pmc));
>> >> >> >> >  delta = pmc.PagefileUsage - before;
>> >> >> >> >
>> >> >> >> >  char msg4[1024] = {0};
>> >> >> >> >  sprintf_s(msg4, sizeof(msg4),
>> >> >> >> >            "after release delta: %d, size: %d, ratio: %.3f\n",
>> >> >> >> >            delta, size, double(size)/double(delta));
>> >> >> >> >  OutputDebugStringA(msg4);
>> >> >> >> >
>> >> >> >> > (the code was injected at the very beginning of wWinMain in
>> >> >> >> > chrome_exe_main.cc and there seem to be no other threads that
>> >> >> >> > might commit something in between).
>> >> >> >> >
>> >> >> >> > The problem is actually with reserving (and releasing) bytes:
>> >> >> >> > from time to time reserving bytes commits some amount of memory
>> >> >> >> > (probably for Windows-internal structures).  The ratios are
>> >> >> >> > 0.1--0.4%.  If I make size twice as big, overcommitment happens
>> >> >> >> > on each run.
>> >> >> >> >
>> >> >> >> > This amount seems relatively small, but at least in some cases
>> >> >> >> > this happens very often and the allocated sizes are big, so this
>> >> >> >> > 0.3% sums up to very big values (but see below).
>> >> >> >> >
>> >> >> >> > Another problem is that I do not know when those system-committed
>> >> >> >> > bytes get released.  In the snippet above it happens in
>> >> >> >> > VirtualFree(*, MEM_RELEASE), but in tcmalloc we never do
>> >> >> >> > MEM_RELEASE, and still the delta between WinAPI-reported
>> >> >> >> > committed bytes and tcmalloc-reported bytes sometimes drops, and
>> >> >> >> > drops notably.  I tried to grep for all VirtualFree calls, but it
>> >> >> >> > looks like the only operational one resides in v8's
>> >> >> >> > platform_win32.cc, and tcmalloc is by all means the main memory
>> >> >> >> > consumer.
>> >> >> >> >
>> >> >> >> > The best (but rather lame) hypothesis I have so far is that this
>> >> >> >> > amount gets GCed somehow by Windows itself.
>> >> >> >> >
>> >> >> >> > And last but not least: this delta between WinAPI and tcmalloc
>> >> >> >> > + v8 stats might be pretty big, something like 100MB, so it
>> >> >> >> > might be worth further investigation.
>> >> >> >> >
>> >> >> >> > Does anybody know what exactly goes on here?  Am I missing
>> >> >> >> > something?  Is there a way to control the amount of those
>> >> >> >> > overcommitted bytes, e.g. is reserving one huge block better than
>> >> >> >> > reserving N blocks that are N times smaller? (I did a simple
>> >> >> >> > experiment splitting blocks into halves, see the snippet, and in
>> >> >> >> > this setting the number of overcommitted bytes is the same, but
>> >> >> >> > in the split case those bytes sometimes are not committed at
>> >> >> >> > all).
>> >> >> >> >
>> >> >> >> > And overall, do we want to maintain these committed-bytes stats
>> >> >> >> > given that they might be that imprecise?
>> >> >> >> >
>> >> >> >> > (BTW, there is a small bug in the current tcmalloc reporting of
>> >> >> >> > committed bytes: it should account for the pagemap as well).
>> >> >> >> >
>> >> >> >> > yours,
>> >> >> >> > anton.
>> >> >> >> >
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >> You received this message because you are subscribed to the Google
>> >> >> >> Groups "Chrome-team" group.
>> >> >> >> To post to this group, send email to chrome-t...@google.com.
>> >> >> >> To unsubscribe from this group, send email to
>> >> >> >> chrome-team+unsubscr...@google.com.
>> >> >> >> For more options, visit this group at
>> >> >> >> http://groups.google.com/a/google.com/group/chrome-team.
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>
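
For reference, the "one page for each 2 MB" figure quoted above matches what page-table arithmetic predicts under PAE or 64-bit paging: a page table is itself one 4 KB page and holds 512 entries, so it maps 2 MB (1024 entries and 4 MB under classic 32-bit paging). The sketch below only illustrates that arithmetic. The names are made up for this note, it counts lowest-level page tables only and assumes table-aligned ranges, and it does not settle whether page tables or VADs are what the process is actually charged for.

```cpp
#include <cassert>
#include <cstdint>

// Paging modes mentioned in the thread (32/PAE/64).
enum class PagingMode { kX86, kX86Pae, kX64 };

constexpr std::uint64_t kPageSize = 4096;  // 4 KB small pages

// Number of 4 KB pages mapped by one lowest-level page table.
// Classic 32-bit paging uses 4-byte entries (1024 per table);
// PAE and x64 use 8-byte entries (512 per table).
inline std::uint64_t PagesPerPageTable(PagingMode mode) {
  return mode == PagingMode::kX86 ? 1024 : 512;
}

// Bytes of address space covered by a single page table.
inline std::uint64_t BytesPerPageTable(PagingMode mode) {
  return PagesPerPageTable(mode) * kPageSize;
}

// Estimated bytes of page tables needed to map `bytes` of address space.
// Assumes the range starts on a page-table boundary and ignores the
// higher-level paging structures.
inline std::uint64_t PageTableOverhead(std::uint64_t bytes, PagingMode mode) {
  const std::uint64_t span = BytesPerPageTable(mode);
  const std::uint64_t tables = (bytes + span - 1) / span;  // round up
  return tables * kPageSize;
}
```

With these assumptions, PageTableOverhead(1 << 21, PagingMode::kX86Pae) is 4096 bytes, i.e. about 0.2% of the 2 MB requested, which falls inside the 0.1--0.4% range reported in the original message.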

-- 
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
