[Neil Schemenauer <nas-pyt...@arctrix.com>]
> I've done a little testing of the pool overhead. I have an application
> that uses many small dicts as holders of data. The function:
>
>     sys._debugmallocstats()
>
> is useful to get stats for the obmalloc pools. Total data allocated
> by obmalloc is 262 MB. At the 4*PAGE_SIZE pool size, the wasted
> space due to partly filled pools is only 0.18%. For 16*PAGE_SIZE
> pools, 0.71%.
>
> I have a set of stats for another program. In that case, total
> memory allocated by obmalloc is 14 MB. For 4*PAGE_SIZE pools,
> wasted space is 0.78% of total. At 16*PAGE_SIZE, it is 2.4%.
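The percentages above can be bounded with a crude back-of-the-envelope model: in the worst case, each size class a program has touched pins one nearly empty pool, and that per-class waste scales linearly with pool size. The sketch below assumes 4 KiB pages and 64 size classes (8..512 bytes in 8-byte steps); it is an illustration of the tradeoff, not CPython's real accounting, and real fragmentation can pin more than one partly filled pool per class.

```python
# Rough worst-case model (assumption-laden, not CPython's real bookkeeping):
# a program that makes one tiny allocation in each size class leaves one
# nearly empty pool per class, wasting almost the whole pool.
PAGE_SIZE = 4096          # assumption: typical 4 KiB OS page
NUM_SIZE_CLASSES = 64     # assumption: 8..512 bytes in 8-byte steps

def worst_case_partial_waste(pages_per_pool):
    # One nearly empty pool per active size class, in the worst case.
    return pages_per_pool * PAGE_SIZE * NUM_SIZE_CLASSES

for pages in (1, 4, 16):
    print(pages, worst_case_partial_waste(pages))
# 1-page pools:  256 KiB worst case; 4-page: 1 MiB; 16-page: 4 MiB
```

Against a 262 MB heap even the 16-page worst case is small, which matches Neil's numbers; against the 14 MB heap it is proportionally much larger, which matches his second data set.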
Definitely a tradeoff here: increasing the number of pages per pool is a
pure win for objects of very popular size classes, but a pure loss for
objects of unpopular size classes. New comments about POOL_SIZE in the
patch briefly hint at that.

> Based on that small set of data, using 4*PAGE_SIZE seems
> conservative.

Which is a nice way of saying "pulled out of Tim's ass" ;-)

> As I'm sure you realize, making pools bigger will
> waste actual memory, not just virtual address space, because you
> write the arena pointer to each OS page.

Yes, but mostly no! It remains the case that obmalloc neither writes nor
reads anything in an arena until a malloc/realloc call actually needs to
return a pointer to a never-before-accessed page.

Writing the arena index is NOT done to pages when the arena is
allocated, or even when a new pool is carved off an arena, but lazily,
one page at a time, lowest address to highest, as fresh pages actually
need to be returned to the caller. So arena indices aren't actually
written any more frequently than when using pools of just one page, as
is done now (which _still_ writes the arena index into each page).

Subtlety: currently, if you pass a nonsense address to
address_in_range() that happens to be in one of the arena's pools,
address_in_range() says "yes". After the patch, however, it may well
return "no" if the address is in one of the pool's pages that hasn't yet
been returned to a malloc/realloc caller (in which case the arena index
hasn't yet been stored at the start of the page). I don't care about
that, because it's 100% a logic error to pass an address to free/realloc
that wasn't obtained from a previous malloc/realloc call. So it's just a
change in undefined behavior.

> I want to do performance profiling using Linux perf. That should
> show where the hotspot instructions in the obmalloc code are. Maybe
> that will be useful to you.

Would be good to know, yes! But it may also depend on the app.
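The lazy-stamping scheme described above can be sketched with a toy model. This is Python standing in for the real C code in Objects/obmalloc.c, with pages modeled as small integers rather than real addresses; all names here are hypothetical.

```python
# Toy model of lazy arena-index stamping (NOT the real C implementation):
# the arena index is written into a page only when that page is first
# handed out to a malloc/realloc caller, lowest page first.
class Arena:
    def __init__(self, index, npages):
        self.index = index
        self.npages = npages
        self.page_header = {}   # page number -> arena index, written lazily
        self.next_fresh = 0     # lowest never-before-accessed page

    def take_page(self):
        # Stamp the arena index into the next fresh page as it is handed out.
        p = self.next_fresh
        self.page_header[p] = self.index
        self.next_fresh += 1
        return p

def address_in_range(arena, page):
    # Mirrors the post-patch behavior: a page never handed out has no
    # arena index stored yet, so the check answers "no".
    return arena.page_header.get(page) == arena.index

a = Arena(index=7, npages=16)
a.take_page()
a.take_page()
print(address_in_range(a, 0))   # True: page 0 was handed to a caller
print(address_in_range(a, 5))   # False: page 5 was never handed out
```

The second print is exactly the "subtlety" above: a nonsense address landing in a not-yet-used page now gets "no" instead of "yes", which only matters for callers already deep in undefined behavior.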
> Another thought about address_in_range(): some operating systems
> allow you to allocate memory at specific alignments. Or, you can
> even allocate a chunk of memory at a fixed memory location if you do
> the correct magic incantation. I noticed that Go does that. I
> imagine doing that has a bunch of associated challenges with it.
> However, if we could control the alignment and memory location of
> obmalloc arenas, we would not have the segv problem of
> address_in_range(). It's probably not worth going down that path
> due to the problems involved.

I'm unclear on how that could be exploited. It's not addresses that come
from arenas that create segv headaches, it's addresses that come from
the system's malloc/realloc.

If we were, e.g., to pick in advance a maximum amount of address space
obmalloc can use, we could live with a single arena of that size, and
just check whether an address is within it. All the problems come from
the fact that address space may alternate, any number of times, between
addresses in obmalloc arenas and addresses from the system malloc's
internals.

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YUWFWPKN67EOGF5F7QFGB7IFRH5K2PFW/
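The single-reserved-arena idea above reduces address_in_range() to a pure range comparison, with no memory read and hence no possible segfault. A minimal sketch, with addresses modeled as integers; the base address and reservation size are made-up assumptions, not anything CPython actually does.

```python
# Sketch: if obmalloc reserved one big span of address space up front,
# membership would be a plain range check with no dereference at all.
ARENA_BASE = 0x2000_0000_0000   # assumption: hypothetical reserved base
ARENA_SIZE = 1 << 40            # assumption: 1 TiB of reserved space

def address_in_range(addr):
    # No read of memory near addr is needed, so no segv is possible,
    # even for garbage addresses from the system malloc's internals.
    return ARENA_BASE <= addr < ARENA_BASE + ARENA_SIZE

print(address_in_range(ARENA_BASE + 123))       # True: inside the span
print(address_in_range(ARENA_BASE - 1))         # False: below the span
```

The catch, as the reply notes, is that real address space can alternate any number of times between obmalloc arenas and system-malloc allocations, so a single fixed span would have to be reserved before anything else claims it.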