On Tue, Jul 9, 2019 at 9:46 AM Tim Peters <tim.pet...@gmail.com> wrote:
>
> >  At last, all size classes have 1~3 used/cached memory blocks.
>
> No doubt part of it, but hard to believe it's most of it.  If the loop
> count above really is 10240, then there's only about 80K worth of
> pointers in the final `buf`.

You are right.  list.append is not the major memory consumer in the
"large" class (8KiB+1 ~ 512KiB).  There are several causes of large
allocations:

* bm_logging uses StringIO.seek(0); StringIO.truncate() to reset the buffer.
  So from the 2nd loop on, the internal buffer of StringIO becomes a Py_UCS4
  array instead of a list of strings.  This buffer uses the same policy as
  list to grow its capacity: `size + (size >> 3) + (size < 9 ? 3 : 6)`.
  Actually, when I use the `-n 1` option, memory usage is only 9MiB.
* The intern dict.
* Many modules are loaded, and FileIO.readall() is used to read pyc files.
  This creates and deletes bytes objects of various sizes.
* The logging module uses several regular expressions.  `b'\0' * 0xff00` is
  used in sre_compile.
  https://github.com/python/cpython/blob/master/Lib/sre_compile.py#L320
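
To make the growth policy from the first bullet concrete, here is a minimal
sketch in Python.  It assumes the shift is by 3 bits (about 1/8
over-allocation), as in CPython's list_resize(), and it models only the
formula itself, not the rest of the real resize logic.  The helper names
`overallocate` and `growth_steps` are made up for illustration.

```python
def overallocate(size):
    """Capacity chosen for a buffer that must hold `size` items.

    Assumed formula: size + (size >> 3) + (size < 9 ? 3 : 6).
    """
    return size + (size >> 3) + (3 if size < 9 else 6)


def growth_steps(target):
    """Capacities allocated while growing one item at a time up to `target`."""
    capacity = 0
    steps = []
    for needed in range(1, target + 1):
        if needed > capacity:
            # The buffer is full: reallocate with some headroom.
            capacity = overallocate(needed)
            steps.append(capacity)
    return steps


if __name__ == "__main__":
    # Every entry is a separate allocation request of a different size,
    # which is why a growing buffer touches many size classes.
    print(growth_steps(100))
```

Each step is a new allocation of a different size, so a buffer that grows
this way leaves freed blocks behind in many size classes.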


>
> But does it really matter? ;-)  mimalloc "should have" done MADV_FREE
> on the pages holding the older `buf` instances, so it's not like the
> app is demanding to hold on to the RAM (albeit that it may well show
> up in the app's RSS unless/until the OS takes the RAM away).
>

mimalloc doesn't call madvise for each free().  Each size class
keeps a 64KiB "page", and several OS pages (4KiB) in that "page"
are committed but not used.

I dumped the stats of all "mimalloc pages".
https://paper.dropbox.com/doc/mimalloc-on-CPython--Agg3g6XhoX77KLLmN43V48cfAg-fFyIm8P9aJpymKQN0scpp#:uid=671467140288877659659079&h2=memory-usage-of-logging_format

For example:

bin block_size   used capacity reserved
 29       2560      1       22       25  (14 pages are committed, 2560 bytes are in use)
 29       2560     14       25       25  (16 pages are committed, 2560*14 bytes are in use)
 29       2560     11       25       25
 31       3584      1        5       18  (5 pages are committed, 3584 bytes are in use)
 33       5120      1        4       12
 33       5120      2       12       12
 33       5120      2       12       12
 37      10240      3       11      409
 41      20480      1        6      204
 57     327680      1        2       12

* Committed pages can be calculated roughly as
  `ceil(block_size * capacity / 4096)`.

There are dozens of unused memory blocks and committed pages in each size class.
This causes 10MiB+ of memory usage overhead in the logging_format and
logging_simple benchmarks.
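
The rough estimate above can be checked against a few rows of the table.
This is only a sketch: it assumes 4KiB OS pages and ignores any per-page
metadata mimalloc keeps, and the helper names are made up for illustration.

```python
import math

OS_PAGE_SIZE = 4096  # assumed OS page size in bytes

# (bin, block_size, used, capacity) copied from the table above.
ROWS = [
    (29, 2560, 1, 22),
    (29, 2560, 14, 25),
    (31, 3584, 1, 5),
    (37, 10240, 3, 11),
    (57, 327680, 1, 2),
]


def committed_pages(block_size, capacity):
    """Rough count of committed OS pages: ceil(block_size * capacity / 4096)."""
    return math.ceil(block_size * capacity / OS_PAGE_SIZE)


def unused_committed_bytes(block_size, used, capacity):
    """Committed bytes that hold no live block (rough estimate)."""
    committed = committed_pages(block_size, capacity) * OS_PAGE_SIZE
    return committed - block_size * used


if __name__ == "__main__":
    for bin_no, block_size, used, capacity in ROWS:
        pages = committed_pages(block_size, capacity)
        waste = unused_committed_bytes(block_size, used, capacity)
        print(f"bin {bin_no}: {pages} pages committed, {waste} bytes unused")
```

For the first row this gives 14 committed pages for a single 2560-byte
block in use, which matches the annotation in the table.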


> I was more intrigued by your first (speed) comparison:
>
> > - spectral_norm: 202 ms +- 5 ms -> 176 ms +- 3 ms: 1.15x faster (-13%)
>
> Now _that's_ interesting ;-)  Looks like spectral_norm recycles many
> short-lived Python floats at a swift pace.  So memory management
> should account for a large part of its runtime (the arithmetic it does
> is cheap in comparison), and obmalloc and mimalloc should both excel
> at recycling mountains of small objects.  Why is mimalloc
> significantly faster?
[snip]
>  obmalloc's `address_in_range()` is definitely a major overhead in its
> fastest `free()` path, but then mimalloc has to figure out which
> thread is doing the freeing (looks cheaper than address_in_range, but
> not free).  Perhaps the layers of indirection that have been wrapped
> around obmalloc over the years are to blame?  Perhaps mimalloc's
> larger (16x) pools and arenas let it stay in its fastest paths more
> often?  I don't know why, but it would be interesting to find out :-)

Totally agree.  I'll investigate this next.

Regards,
-- 
Inada Naoki  <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MXEE2NOEDAP72RFVTC7H4GJSE2CHP3SX/
