On Sat, Dec 09, 2023 at 12:50:31PM +0000, Chris Webb wrote:
> Before 326d7c1, the shrinker used freeram and totalram from a struct
> sysinfo (constructed from /proc/meminfo) to target 25% free physical
> memory. As well as the slowness of repeatedly reading /proc/meminfo,
> this was a problem as freeram rises when the system starts to swap.
> We don't want swapping to reduce our estimate of memory pressure.
> 
> To work around this, in 326d7c1 the shrinker started to use the total
> allocated heap from a glibc-specific interface mallinfo2(), aiming to
> shrink such that our heap is less than 80% of physical memory, unless
> overall free memory drops below 6%, in which case that becomes the
> determining factor.
> 
> Unfortunately, a sign error in the calculation means this heuristic
> never worked. It would shrink aggressively when the process was small,
> and not at all when the process grew beyond 80% of physical RAM. Only the
> fallback test ensuring the free physical RAM doesn't fall below 6% would
> actually kick in under memory pressure. It also breaks portability to
> anything other than recent glibc.
> 
> Later, in 2440469 the mallinfo2() was replaced with the older mallinfo()
> to improve compatibility with older glibc. This is even more problematic:
> it's still not portable but also struct mallinfo has (signed) int fields
> which overflow for large processes on 32-bit machines with a 3G/1G split.
> 
> Rather than trying to use libc-specific debug interfaces and our own heap
> to inform the shrinker, use the information about free and total swap
> we already have from sysinfo(2) to explicitly compensate for swapping
> in our estimate of free physical memory. Target free memory of 6% of
> physical RAM adjusted for zero swap use when calculating the pressure
> on the shrinker, based on the effective behaviour of 326d7c1 in practice
> given the sign error.
> 
> As well as fixing portability to non-glibc systems, this loosens the
> assumption that we are the only process using significant memory when
> setting the shrinker target. It wouldn't be unreasonable to run two
> fsck jobs against independent devices on a large RAM machine and want to
> balance physical RAM between them.
> 
> Signed-off-by: Chris Webb <[email protected]>

Nice, applied.
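
For anyone following along, the swap-compensated target described in the
commit message can be sketched roughly as below. This is not the applied
patch. The names shrink_target() and shrink_target_now() are hypothetical,
and computing the 6% threshold as totalram * 6 / 100 is an assumption about
how the target is rounded. Only sysinfo(2) and the struct sysinfo fields
are real interfaces.

```c
#include <stdint.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper: how much memory we would like the shrinker to
 * release. All arguments must be in the same unit (e.g. bytes).
 *
 * Anything sitting in swap is subtracted from free RAM, so that swapping
 * pages out does not make memory pressure look lower than it is.
 */
int64_t shrink_target(uint64_t totalram, uint64_t freeram,
		      uint64_t totalswap, uint64_t freeswap)
{
	int64_t swap_used = (int64_t)(totalswap - freeswap);
	/* Free RAM we would have if nothing had been swapped out. */
	int64_t effective_free = (int64_t)freeram - swap_used;
	/* Assumed target: 6% of physical RAM free. */
	int64_t want_free = (int64_t)(totalram * 6 / 100);
	int64_t want_shrink = want_free - effective_free;

	return want_shrink > 0 ? want_shrink : 0;
}

/* Hypothetical wrapper: query the kernel and return the target in bytes. */
int64_t shrink_target_now(void)
{
	struct sysinfo info;
	uint64_t unit;

	if (sysinfo(&info) != 0)
		return 0;	/* no information: do not shrink */

	/* Sizes are reported in multiples of mem_unit bytes. */
	unit = info.mem_unit ? info.mem_unit : 1;

	return shrink_target((uint64_t)info.totalram * unit,
			     (uint64_t)info.freeram * unit,
			     (uint64_t)info.totalswap * unit,
			     (uint64_t)info.freeswap * unit);
}
```

A result of zero means free memory, after discounting swap use, is already
at or above the target. Because the calculation only looks at system-wide
figures and not our own heap, two processes using it on the same machine
would see the same pressure.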
