I have published a repository related to debugging/understanding ZFS on NetBSD:
https://codeberg.org/gdt/netbsd-zfs

This has:

  - a patch that
    - adds comments
    - rototills ARC sizing (smaller)
    - disables prefetching
    - adds printfs of ARC eviction behavior (when interesting)
  - a script to create many files and rm them
  - a program to allocate lots of RAM, to provoke memory pressure

(Sketches of the script and the program are at the end of this
message.)

My current thinking (a little fuzzy, and influenced by many helpful
comments) is:

The ARC is a cache from a 128-bit "Disk Virtual Address" (DVA) to
contents.  I think this DVA space addresses all pools, or all disks
backing pools.  (See the DVA note at the end.)

The ARC divides the world into "data", being bits that are file
contents, and "metadata", which is the kitchen sink of everything
else.

ARC management is basically ok, except that a lot of what is counted
as being in the ARC cannot be freed by the evict thread.

One has to be really careful not to printf at speed.  Perhaps
because of the framebuffer, the kernel becomes CPU bound and never
really recovers.  With the printfs as enabled in the patch in the
repo, things are ok.

/etc/daily is rough on the system.  With too little RAM and
maxvnodes too high (the default maxvnodes on a 6G VM), it will lock
up the system.

vnodes/znodes hold a dnode_t (from a pool(9) cache, as seen in
vmstat -m), and this memory is accounted as being "in" the ARC.  But
it isn't really in it, in the sense that the ARC drain routines
can't free it.  Under pressure, the ARC is therefore mostly emptied
of the things that can be freed; but the things that cannot be freed
are most of it.

I suspect that keeping the dnode_t attached to vnodes may not be
needed.  Perhaps the on-disk directory information is in the ARC
anyway, because it is addressed by DVA; so it's not clear whether
what's being avoided is the parsing work or the read from disk.

Having kern.maxvnodes high leads to trouble.  Lowering it (e.g.
"sysctl -w kern.maxvnodes=N"), and disabling prefetch, makes things
much better.

FreeBSD has a dnlc drain routine.  Probably we need a way to reduce
the number of zfs vnodes under pressure.  But maybe it isn't really
just about zfs; perhaps vnodes in general should be freed.

Despite all of the above, it seems that over time memory is leaked
and the system becomes memory stressed.  Eventually, as in a week or
so, even a system with 32G of RAM locks up, if one does cvs updates
of src and pkgsrc, pkg_rr, building releases, etc.  I suspect dnode
handling.

If you are running zfs on even a 32G system, with most of your data
in zfs, and you cvs update pkgsrc/src, rebuild packages, and rebuild
NetBSD via build.sh, and **you can keep the system running for 30
days**, please let me know.  Please include memory size and
kern.maxvnodes.

Right now, on a 32GB system, I have kern.maxvnodes at 300000, and
the number of dnode_t objects is higher than that, at 319625
(Requests minus Releases), via vmstat -m:

Name      Size  Requests  Fail  Releases  Pgreq  Pgrel  Npage  Hiwat  Minpg  Maxpg  Idle
dnode_t    632   1286110     0    966485  87779  32622  55157  87779      0    inf  1886

It might be that with that maxvnodes value, lower ARC, and no
prefetch, the system will now stay up.  I'm 4 days in.  (In
contrast, systems not running zfs stay up until I reboot them.)
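PS: the promised sketches.  These are illustrative only; the names,
counts, and details below are my invention, not copied from the
repo.

First, the file-churn test, in spirit (the repo version is a script;
this is the same idea as a small program):

/*
 * Create many small files and then unlink them, to force
 * znode/dnode (and thus vnode) traffic.
 */
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NFILES 100000

int
main(void)
{
    char name[64];
    int fd, i;

    for (i = 0; i < NFILES; i++) {
        snprintf(name, sizeof(name), "churn.%d", i);
        if ((fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644)) == -1)
            err(1, "open %s", name);
        if (write(fd, "x", 1) != 1)
            err(1, "write %s", name);
        close(fd);
    }
    for (i = 0; i < NFILES; i++) {
        snprintf(name, sizeof(name), "churn.%d", i);
        if (unlink(name) == -1)
            err(1, "unlink %s", name);
    }
    return 0;
}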
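Second, the memory-pressure program, again in spirit: allocate in
chunks and touch every page so the allocations are really backed,
then hold them until killed.  (The 64 MB chunk size and the usage
are made up.)

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (64UL * 1024 * 1024)    /* 64 MB per allocation */

int
main(int argc, char **argv)
{
    unsigned long i, nchunks;
    char *p;

    if (argc != 2)
        errx(1, "usage: memhog total-MB");
    nchunks = strtoul(argv[1], NULL, 10) / 64;

    for (i = 0; i < nchunks; i++) {
        if ((p = malloc(CHUNK)) == NULL)
            err(1, "malloc failed after %lu MB", i * 64);
        memset(p, 0xa5, CHUNK);    /* fault in every page */
    }
    printf("holding ~%lu MB; sleeping\n", nchunks * 64);
    for (;;)
        sleep(60);    /* keep the pressure on until killed */
}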
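And the DVA note.  From my reading of the OpenZFS sources (dva_t in
sys/spa.h), the 128 bits are two 64-bit words, laid out roughly like
this:

#include <stdint.h>    /* added so the excerpt stands alone */

/*
 * word 0:  | vdev (32 bits) | GRID (8) | ASIZE (24) |
 * word 1:  | G (1 bit) |        offset (63)         |
 */
typedef struct dva {
    uint64_t dva_word[2];
} dva_t;

So a DVA by itself names a location within one pool's vdevs, and the
ARC (as I read arc.c) hashes buffers by (pool, DVA, birth txg),
which is how a single ARC can serve all imported pools.  That also
fits the point above that directory contents may already be in the
ARC regardless of whether a vnode/znode is holding the dnode.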