On Sat, Dec 09, 2023 at 12:50:31PM +0000, Chris Webb wrote:
> Before 326d7c1, the shrinker used freeram and totalram from a struct
> sysinfo (constructed from /proc/meminfo) to target 25% free physical
> memory. As well as the slowness of repeatedly reading /proc/meminfo,
> this was a problem as freeram rises when the system starts to swap.
> We don't want swapping to reduce our estimate of memory pressure.
>
> To work around this, in 326d7c1 the shrinker started to use the total
> allocated heap from a glibc-specific interface mallinfo2(), aiming to
> shrink such that our heap is less than 80% of physical memory, unless
> overall free memory is less than 6% so that becomes the determining factor.
>
> Unfortunately, a sign error in the calculation means this heuristic
> never worked. It would shrink aggressively when the process was small,
> and not at all when the process grew beyond 80% of physical RAM. Only the
> fallback test ensuring the free physical RAM doesn't fall below 6% would
> actually kick in under memory pressure. It also breaks portability to
> anything other than recent glibc.
>
> Later, in 2440469 the mallinfo2() was replaced with the older mallinfo()
> to improve compatibility with older glibc. This is even more problematic:
> it's still not portable but also struct mallinfo has (signed) int fields
> which overflow for large processes on 32-bit machines with a 3G/1G split.
>
> Rather than trying to use libc-specific debug interfaces and our own heap
> to inform the shrinker, use the information about free and total swap
> we already have from sysinfo(2) to explicitly compensate for swapping
> in our estimate of free physical memory. Target free memory of 6% of
> physical RAM adjusted for zero swap use when calculating the pressure
> on the shrinker, based on the effective behaviour of 326d7c1 in practice
> given the sign error.
>
> As well as fixing portability to non-glibc systems, this loosens the
> assumption that we are the only process using significant memory when
> setting the shrinker target. It wouldn't be unreasonable to run two
> fsck jobs against independent devices on a large RAM machine and want to
> balance physical RAM between them.
>
> Signed-off-by: Chris Webb <[email protected]>

Nice, applied.
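
For anyone following along, the swap-compensated target described in the
commit message can be sketched roughly as below. This is only an
illustration, not the applied patch: the names want_shrink_bytes() and
current_want_shrink() are hypothetical, and totalram / 16 (6.25%) is an
assumed stand-in for the "6% of physical RAM" target.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper: how many bytes we would like the shrinker to free.
 * A result <= 0 means there is no pressure and nothing needs shrinking.
 */
int64_t want_shrink_bytes(uint64_t totalram, uint64_t freeram,
			  uint64_t totalswap, uint64_t freeswap)
{
	assert(freeswap <= totalswap);

	/* Assumption: approximate the 6% target as 1/16 of physical RAM. */
	int64_t target = (int64_t)(totalram / 16);

	/* Memory pushed out to swap should not count as relieved pressure. */
	int64_t swap_used = (int64_t)(totalswap - freeswap);

	/* Free RAM as it would look if nothing had been swapped out. */
	int64_t effective_free = (int64_t)freeram - swap_used;

	return target - effective_free;
}

/* Hypothetical wrapper feeding the helper from sysinfo(2). */
int64_t current_want_shrink(void)
{
	struct sysinfo info;

	if (sysinfo(&info) != 0)
		return 0;	/* no information: assume no pressure */

	/* sysinfo(2) reports sizes in multiples of mem_unit bytes. */
	uint64_t unit = info.mem_unit ? info.mem_unit : 1;

	return want_shrink_bytes(info.totalram * unit, info.freeram * unit,
				 info.totalswap * unit, info.freeswap * unit);
}
```

Because swap in use is subtracted from free RAM, a system that has started
swapping reports more pressure rather than less, which is the behaviour
the commit message asks for.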
