On 3/28/25 18:45, Matteo Croce wrote:
> the biggest differences are that "mu" follows the "du" semantics,
> analyzing directories recursively and printing cumulative numbers.
> And it only uses cachestat() without falling back to mincore.
> Anyway I had prepared a draft util-linux port, it's available here:
> https://github.com/util-linux/util-linux/pull/3493
I looked at adding a flag for this to toybox/busybox "readahead" which
seems the most closely related command, because most things don't do
cache micromanagement. (They just let the kernel get on with it.)
But the problem is the syscall was added recently enough (commit
cf264e1329fb in May 2023) that a lot of systems people are trying to
build against still won't have it, so I'd need a config check for "does
your libc wrap this syscall, or at least provide __NR_thingy for it in
asm-generic/unistd.h?". Musl only added it in commit dd690c490951 last
month, and hasn't had a release since then. Still seems a bit unripe.
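A minimal sketch of that kind of check, assuming the libc has no wrapper and we can only rely on __NR_cachestat coming from the kernel headers. The names probe_cachestat(), cs_range and cs_result are made up for illustration; the struct layouts are local copies of the kernel's uapi definitions, renamed so they can't clash with <linux/mman.h>:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local mirrors of the kernel's struct cachestat_range / struct cachestat. */
struct cs_range { uint64_t off, len; };
struct cs_result {
    uint64_t nr_cache, nr_dirty, nr_writeback, nr_evicted, nr_recently_evicted;
};

/* Hypothetical helper: returns 0 and fills *out on success, -1 with errno
 * set on failure (ENOSYS when the headers or the running kernel lack it). */
int probe_cachestat(const char *path, struct cs_result *out)
{
    assert(path && out);
#ifdef __NR_cachestat
    struct cs_range whole = {0, 0};   /* len 0 means "through end of file" */
    int fd = open(path, O_RDONLY), saved;
    long rc;

    if (fd < 0) return -1;
    rc = syscall(__NR_cachestat, fd, &whole, out, 0u);
    saved = errno;                    /* close() may clobber errno */
    close(fd);
    errno = saved;
    return rc ? -1 : 0;
#else
    /* Build-time headers predate the syscall: report it as unsupported. */
    errno = ENOSYS;
    return -1;
#endif
}
```

The #ifdef only covers the build side. A binary built against new headers still has to cope with ENOSYS at runtime on an older kernel, and newer kernels can return EPERM for files the caller doesn't own or can't write.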
And given that even "readahead" isn't installed by default on Debian,
there's not a huge demand for it?
What's the use case here? We've had sync() and echo 3 >
/proc/sys/vm/drop_caches forever to _clean_ the cache, but when do you
need to know how much data is being used (not even pinned) by cached
files in a given directory? What do you do with this information? Does
it tell you clean/dirty? Are you making any attempt to distinguish tmpfs
(always 100% pinned in cache) from files with a backing store? Or any
metric of slow vs fast backing store?
Regards,
Rob
P.S. I'm still confused why readahead() needs to be a separate syscall
when mmap(MAP_POPULATE) exists, but I guess the syscall was invented
before the mmap flag and got kept around for historical reasons? You'd
think this new capability would have been some variant of
SEEK_DATA/SEEK_HOLE so you could get the properly granular data, but I
guess that's still in the future and then we'll still be stuck with
backwards compatibility with this other syscall forever.