[Neil Schemenauer <nas-pyt...@arctrix.com>]
> I've done a little testing of the pool overhead.  I have an application
> that uses many small dicts as holders of data.  The function:
>
>     sys._debugmallocstats()
>
> is useful to get stats for the obmalloc pools.  Total data allocated
> by obmalloc is 262 MB.  At the 4*PAGE_SIZE pool size, the wasted
> space due to partly filled pools is only 0.18%.  For 16*PAGE_SIZE
> pools, 0.71%.
>
> I have a set of stats for another program.  In that case, total
> memory allocated by obmalloc is 14 MB.  For 4*PAGE_SIZE pools,
> wasted space is 0.78% of total.  At 16*PAGE_SIZE, it is 2.4%.
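
An aside for anyone following along:  sys._debugmallocstats() is a
thin wrapper around a C entry point, so an embedding app can dump the
identical report from C.  The wrapper function below is mine, just
for illustration:

    #include <Python.h>
    #include <stdio.h>

    /* Dump pymalloc's arena/pool/block statistics -- the same
       report sys._debugmallocstats() produces -- to stderr. */
    void
    dump_obmalloc_stats(void)
    {
        _PyObject_DebugMallocStats(stderr);
    }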

Definitely a tradeoff here:  increasing the number of pages per pool
is a pure win for objects of very popular size classes, but a pure
loss for objects of unpopular size classes.  New comments about
POOL_SIZE in the patch briefly hint at that.
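
For reference, the knob is POOL_SIZE near the top of
Objects/obmalloc.c.  Today it's one system page; the sketch below
shows the 4x variant under discussion, not necessarily the patch's
final spelling:

    #define SYSTEM_PAGE_SIZE (4 * 1024)

    /* status quo: #define POOL_SIZE SYSTEM_PAGE_SIZE */
    #define POOL_SIZE (4 * SYSTEM_PAGE_SIZE)  /* the 4x variant */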

> Based on that small set of data, using 4*PAGE_SIZE seems
> conservative.

Which is a nice way of saying "pulled out of Tim's ass" ;-)

> As I'm sure you realize, making pools bigger will
> waste actual memory, not just virtual address space, because you
> write the arena pointer to each OS page.

Yes, but mostly no!  It remains the case that obmalloc neither writes
nor reads anything in an arena until a malloc/realloc call actually
needs to return a pointer to a never-before-accessed page.  Writing
the arena index is NOT done to pages when the arena is allocated, or
even when a new pool is carved off an arena, but lazily, one page at a
time, lowest address to highest, as fresh pages actually need to be
returned to the caller.

So the arena index isn't actually written any more frequently than
with the one-page pools used now (which _still_ write the index into
each page).
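
In pseudo-C, the lazy stamping amounts to this -- the identifiers
here are made up for illustration, not taken from the patch:

    #include <stdint.h>

    /* Called at most once per OS page, just before the first block
       in that page is handed out by malloc/realloc.  This store is
       the first touch of the page, so the OS commits it no earlier
       than the caller's own first access would have forced anyway. */
    static void
    stamp_fresh_page(char *page, uint32_t arenaindex)
    {
        *(uint32_t *)page = arenaindex;
    }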

Subtlety:  currently, if you pass a nonsense address to
address_in_range() that happens to land in one of an arena's pools,
it says "yes".  However, after the patch, it may well
return "no" if the address is in one of the pool's pages that hasn't
yet been returned to a malloc/realloc caller (in which case the arena
index hasn't yet been stored at the start of the page).
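
For concreteness, the current check is essentially this (quoting
from memory -- the real thing, and the poolp/arenas internals it
relies on, live in Objects/obmalloc.c):

    /* `pool` is p rounded down to a pool (page) boundary.  For a
       foreign pointer the arenaindex read may be garbage -- or, for
       memory obmalloc never touched, may even fault, which is the
       segv worry -- so it's trusted only after the checks below. */
    static bool
    address_in_range(void *p, poolp pool)
    {
        uint arenaindex = *((volatile uint *)&pool->arenaindex);
        return arenaindex < maxarenas &&
               (uintptr_t)p - arenas[arenaindex].address < ARENA_SIZE &&
               arenas[arenaindex].address != 0;
    }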

I don't care about that, because it's 100% a logic error to pass an
address to free/realloc that wasn't obtained from a previous
malloc/realloc call.  So it's a change in undefined behavior.

> I want to do performance profiling using Linux perf.  That should
> show where the hotspot instructions are in the obmalloc code.  Maybe
> that will be useful to you.

Would be good to know, yes!  But may also depend on the app.

> Another thought about address_in_range(): some operating systems
> allow you to allocate memory at specific alignments.  Or, you can
> even allocate a chunk of memory at a fixed memory location if you do
> the correct magic incantation.  I noticed that Go does that.  I
> imagine doing that has a bunch of associated challenges with it.
> However, if we could control the alignment and memory location of
> obmalloc arenas, we would not have the segv problem of
> address_in_range().  It's probably not worth going down that path
> due to the problems involved.
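
For the record, the alignment half of the incantation is easy on
POSIX boxes.  E.g., if each arena were aligned to its own size,
free() could recover the arena base with a mask instead of reading a
stored index (a sketch, not what obmalloc does today):

    #include <stdlib.h>

    #define ARENA_SIZE (256 << 10)   /* obmalloc's current arena size */

    /* Allocate one arena aligned to its own size, so that
       arena_base(p) == (uintptr_t)p & ~(ARENA_SIZE - 1). */
    static void *
    alloc_aligned_arena(void)
    {
        void *arena = NULL;
        if (posix_memalign(&arena, ARENA_SIZE, ARENA_SIZE) != 0)
            return NULL;
        return arena;
    }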

I'm unclear on how that could be exploited.  It's not addresses that
come from arenas that create segv headaches, it's addresses that come
from the system's malloc/realloc.

If we were, e.g., to pick a maximum amount of address space obmalloc
can use in advance, we could live with a single arena of that size,
and just check whether an address is within it.  All the problems come
from the fact that address space may alternate, any number of times,
between addresses in obmalloc arenas and addresses from the system
malloc's internals.
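
E.g., with one fixed arena the whole test collapses to a single
unsigned compare, with no memory read at all.  obmalloc_base and
OBMALLOC_SPAN below are hypothetical -- nothing like them exists
today:

    #include <stdbool.h>
    #include <stdint.h>

    extern uintptr_t obmalloc_base;             /* start of the one region */
    #define OBMALLOC_SPAN ((uintptr_t)1 << 40)  /* e.g., reserve 1 TiB of VA */

    static bool
    address_in_fixed_arena(void *p)
    {
        /* Unsigned wraparound makes one compare check both bounds. */
        return (uintptr_t)p - obmalloc_base < OBMALLOC_SPAN;
    }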