On 01/10/2018 15:51, Steven Schveighoffer wrote:
> On 10/1/18 3:21 AM, Rainer Schuetze wrote:
>> A profiler reveals that most of the time is spent in "sweeping" the
>> memory, i.e. looking for allocations no longer referenced. The
>> existing implementation checks every page, which causes the required
>> CPU resources to grow linearly with used memory.
>
> Ouch! I hadn't thought of that. But we aren't checking actual *pages*,
> right? We are checking bits to see where allocations are present?
It is not touching the pages themselves, only a byte array classifying
each page as free, used, or the first page of a used block.
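To illustrate why this sweep is linear in the heap size, here is a minimal sketch of such a per-page classification array. The names `B_FREE`, `B_USED`, and `B_FIRST` are assumptions for illustration, not druntime's actual identifiers:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-page classification, one byte per page.
 * Names are illustrative, not the real druntime enum. */
enum PageBits { B_FREE = 0, B_USED = 1, B_FIRST = 2 };

/* A linear sweep inspects every page byte, so the cost grows with
 * the total number of pages even when most of them are free. */
size_t count_allocations(const uint8_t *pagetable, size_t npages)
{
    size_t nalloc = 0;
    for (size_t i = 0; i < npages; i++)
        if (pagetable[i] == B_FIRST)  /* start of an allocated block */
            nalloc++;
    return nalloc;
}
```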
> I also remember that the way the bitsets work, they are always
> allocated for every 16 bytes, regardless of the block size for that
> page/pool. I didn't love that feature, but maybe it is fixed by now.
Not for the pool of "large" objects; the bits are per page there. For
small-object pools, it's still as you remember.
> I would imagine that checking 64 pages at once should be possible by
> AND-ing the allocated and unmarked bits together, to check things as
> quickly as possible.
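The idea above could be sketched roughly as follows, assuming a layout of one bit per page in two parallel bitmaps (an assumption for illustration, not druntime's actual representation):

```c
#include <stdint.h>
#include <stddef.h>

/* AND-ing a 64-bit word of the "allocated" bitmap with the complement
 * of the "marked" bitmap classifies 64 pages in one operation: a set
 * bit in the result means allocated but unreachable, i.e. garbage. */
size_t count_garbage(const uint64_t *alloc, const uint64_t *marked,
                     size_t nwords)
{
    size_t garbage = 0;
    for (size_t w = 0; w < nwords; w++) {
        uint64_t dead = alloc[w] & ~marked[w]; /* allocated, unmarked */
        garbage += (size_t)__builtin_popcountll(dead); /* GCC/Clang builtin */
    }
    return garbage;
}
```

Most words in a mostly-live heap come out as zero and can be skipped immediately, which is where the speedup over per-page checks would come from.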
With my patch, the size in pages of each allocated block and of each
region of consecutive free pages is known (by consulting another
array), so no bit-scanning is necessary.
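A hypothetical sketch of how such an auxiliary array might let the sweep skip whole runs at once; the layout (a run length stored at the first page of each block or free region) is an assumption for illustration, not the patch's actual data structure:

```c
#include <stdint.h>
#include <stddef.h>

/* runlen[i] is meaningful only at the first page of a block or free
 * region, where it holds the run's length in pages (always >= 1).
 * The sweep can then jump from run to run instead of visiting every
 * page, making its cost proportional to the number of runs. */
size_t count_runs(const uint32_t *runlen, size_t npages)
{
    size_t runs = 0;
    for (size_t i = 0; i < npages; ) {
        runs++;
        i += runlen[i];  /* skip over the whole block/region at once */
    }
    return runs;
}
```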