'set but unused' breaks drm-*-kmod
Seems this new requirement breaks kmod builds too. The first of many
errors was (I stopped chasing them all for lack of time):

--- amdgpu_cs.o ---
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.7.19_3/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1210:26: error: variable 'priority' set but not used [-Werror,-Wunused-but-set-variable]
        enum drm_sched_priority priority;
                                ^
1 error generated.
*** [amdgpu_cs.o] Error code 1
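For anyone who has not hit this warning before, here is a minimal
stand-alone reproduction of the same class of error; all names are
illustrative and nothing below is copied from amdgpu_cs.c:

        /* repro.c -- minimal reproduction of -Wunused-but-set-variable.
         * Build with: cc -Werror -Wunused-but-set-variable -c repro.c
         */
        enum sched_priority { PRIO_LOW, PRIO_HIGH };

        int
        handle_request(int flags)
        {
                enum sched_priority priority;   /* written below, never read */

                priority = (flags != 0) ? PRIO_HIGH : PRIO_LOW;
                /* The code that consumed 'priority' is gone, so the compiler
                 * rejects the leftover assignment under -Werror.  The usual
                 * fix is to delete the variable (or restore the code that
                 * used it). */
                return (0);
        }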
Re: nullfs and ZFS issues
On Wed, Apr 20, 2022 at 11:39:44AM +0200, Alexander Leidinger wrote:
| Quoting Doug Ambrisko (from Mon, 18 Apr 2022 16:32:38 -0700):
|
| > With nullfs, nocache and setting max vnodes to a low number I can
|
| Where is nocache documented? I don't see it in mount_nullfs(8),
| mount(8) or nullfs(5).

I didn't find it documented either, but it is in
src/sys/fs/nullfs/null_vfsops.c:

        if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0 ||

Also, some file systems disable it via MNTK_NULL_NOCACHE.

| I tried a nullfs mount with nocache and it doesn't show up in the
| output of "mount".

Yep, I saw that as well. I could tell by dropping into ddb, doing a
"show mount" on the FS and looking at the count. That is why I added
the vnode count to mount -v, so I can see the usage without dropping
into ddb.

Doug A.
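For reference, the check sits in the nullfs mount path. A rough sketch
of how that fragment reads (paraphrased from memory; only the
vfs_getopt() line and MNTK_NULL_NOCACHE are taken from the source
quoted above, and the nullm_vfs/NULLM_CACHE names may not match the
tree exactly):

        /* nullfs_mount() fragment, src/sys/fs/nullfs/null_vfsops.c
         * (paraphrased).  Caching defaults to on; it is turned off when
         * the user passes the undocumented "nocache" option, or when the
         * lower filesystem forces it via MNTK_NULL_NOCACHE. */
        xmp->nullm_flags |= NULLM_CACHE;
        if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0 ||
            (xmp->nullm_vfs->mnt_kern_flag & MNTK_NULL_NOCACHE) != 0)
                xmp->nullm_flags &= ~NULLM_CACHE;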
Re: nullfs and ZFS issues
On Wed, Apr 20, 2022 at 11:43:10AM +0200, Mateusz Guzik wrote:
| On 4/19/22, Doug Ambrisko wrote:
| > On Tue, Apr 19, 2022 at 11:47:22AM +0200, Mateusz Guzik wrote:
| > | Try this: https://people.freebsd.org/~mjg/vnlru_free_pick.diff
| > |
| > | this is not committable but should validate whether it works fine
| >
| > As a POC it's working. I see the vnode count for the nullfs and
| > ZFS go up. The ARC cache also goes up until it exceeds the ARC max.
| > size, then the vnodes for nullfs and ZFS go down. The ARC cache goes
| > down as well. This all repeats over and over. The system seems
| > healthy. No excessive running of arc_prune or arc_evict.
| >
| > My only comment is that the vnode freeing seems a bit aggressive.
| > Going from ~15,000 to ~200 vnodes for nullfs and the same for ZFS.
| > The ARC drops from 70M to 7M (max is set at 64M) for this unit
| > test.
| >
|
| Can you check what kind of shrinking is requested by arc to begin
| with? I imagine encountering a nullfs vnode may end up recycling 2
| instead of 1, but even repeated a lot it does not explain the above.

I dug into it a bit more and think there could be a bug in
module/zfs/arc.c, arc_evict_meta_balanced(uint64_t meta_used):

        prune += zfs_arc_meta_prune;
        //arc_prune_async(prune);
        arc_prune_async(zfs_arc_meta_prune);

Since arc_prune_async is queuing up a run of arc_prune_task for each
call, it is already accumulating the zfs_arc_meta_prune amount. That
makes the count passed to vnlru_free_impl get really big quickly, since
the caller is looping via restart.

   1 HELLO arc_prune_task 164 ticks 2147465958 count 2048

dmesg | grep arc_prune_task | uniq -c
  14 HELLO arc_prune_task 164 ticks -2147343772 count 100
  50 HELLO arc_prune_task 164 ticks -2147343771 count 100
  46 HELLO arc_prune_task 164 ticks -2147343770 count 100
  49 HELLO arc_prune_task 164 ticks -2147343769 count 100
  44 HELLO arc_prune_task 164 ticks -2147343768 count 100
 116 HELLO arc_prune_task 164 ticks -2147343767 count 100
1541 HELLO arc_prune_task 164 ticks -2147343766 count 100
  53 HELLO arc_prune_task 164 ticks -2147343101 count 100
 100 HELLO arc_prune_task 164 ticks -2147343100 count 100
  75 HELLO arc_prune_task 164 ticks -2147343099 count 100
  52 HELLO arc_prune_task 164 ticks -2147343098 count 100
  50 HELLO arc_prune_task 164 ticks -2147343097 count 100
  51 HELLO arc_prune_task 164 ticks -2147343096 count 100
 783 HELLO arc_prune_task 164 ticks -2147343095 count 100
 884 HELLO arc_prune_task 164 ticks -2147343094 count 100

Note I shrunk vfs.zfs.arc.meta_prune to 100 to see how that might help.
Changing it to 1 helps even more; I see less aggressive swings. I added

        printf("HELLO %s %d ticks %d count %ld\n", __FUNCTION__, __LINE__, ticks, nr_scan);

to arc_prune_task.

Adjusting both

        sysctl vfs.zfs.arc.meta_adjust_restarts=1
        sysctl vfs.zfs.arc.meta_prune=100

without changing arc_prune_async(prune) helps avoid excessive swings.

Thanks,

Doug A.

| > | On 4/19/22, Mateusz Guzik wrote:
| > | > On 4/19/22, Mateusz Guzik wrote:
| > | >> On 4/19/22, Doug Ambrisko wrote:
| > | >>> I've switched my laptop to use nullfs and ZFS. Previously, I used
| > | >>> localhost NFS mounts instead of nullfs when nullfs would complain
| > | >>> that it couldn't mount. Since that check has been removed, I've
| > | >>> switched to nullfs only. However, every so often my laptop would
| > | >>> get slow and the ARC evict and prune threads would consume two
| > | >>> cores at 100% until I rebooted. I had a 1G max. ARC and have
| > | >>> increased it to 2G now.
| > | >>> Looking into this has uncovered some issues:
| > | >>> - nullfs would prevent vnlru_free_vfsops from doing anything
| > | >>>   when called from ZFS arc_prune_task
| > | >>> - nullfs would hang onto a bunch of vnodes unless mounted with
| > | >>>   nocache
| > | >>> - nullfs and nocache would break untar. This has been fixed now.
| > | >>>
| > | >>> With nullfs, nocache and setting max vnodes to a low number I can
| > | >>> keep the ARC around the max. without evict and prune consuming
| > | >>> 100% of 2 cores. This doesn't seem like the best solution but it's
| > | >>> better than when the ARC starts spinning.
| > | >>>
| > | >>> Looking into this issue with bhyve and a md drive for testing, I
| > | >>> create a brand new zpool mounted as /test and then nullfs mount
| > | >>> /test to /mnt. I loop through untarring the Linux kernel into the
| > | >>> nullfs mount, rm -rf it and repeat. I set the ARC to the smallest
| > | >>> value I can. Untarring the Linux kernel was enough to get the ARC
| > | >>> evict and prune to spin since they couldn't evict/prune anything.
| > | >>>
| > | >>> Looking at vnlru_free_vfsops called from ZFS arc_prune_task I see it
| > | >>> static
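To make the change discussed above easier to follow, here is a sketch
of the relevant fragment of arc_evict_meta_balanced() with the
substitution applied; the surrounding restart loop is paraphrased from
memory and only shows the structure:

        /* module/zfs/arc.c, arc_evict_meta_balanced() -- paraphrased.
         * Each pass of the restart loop asks the upper layers to drop
         * cached objects.  The stock code grows 'prune' on every pass and
         * hands the accumulated total to arc_prune_async(); since every
         * call already queues its own arc_prune_task, the requested counts
         * pile up and the number reaching vnlru_free_impl balloons. */
        restart:
                /* ... try to evict data/metadata buffers first ... */
                if (zfs_arc_meta_prune) {
                        prune += zfs_arc_meta_prune;
                        /* arc_prune_async(prune); */           /* stock: accumulated total */
                        arc_prune_async(zfs_arc_meta_prune);    /* change: fixed step per pass */
                }
                if (restarts > 0) {
                        restarts--;
                        goto restart;
                }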
Re: nullfs and ZFS issues
On 4/19/22, Doug Ambrisko wrote:
> On Tue, Apr 19, 2022 at 11:47:22AM +0200, Mateusz Guzik wrote:
> | Try this: https://people.freebsd.org/~mjg/vnlru_free_pick.diff
> |
> | this is not committable but should validate whether it works fine
>
> As a POC it's working. I see the vnode count for the nullfs and
> ZFS go up. The ARC cache also goes up until it exceeds the ARC max.
> size, then the vnodes for nullfs and ZFS go down. The ARC cache goes
> down as well. This all repeats over and over. The system seems
> healthy. No excessive running of arc_prune or arc_evict.
>
> My only comment is that the vnode freeing seems a bit aggressive.
> Going from ~15,000 to ~200 vnodes for nullfs and the same for ZFS.
> The ARC drops from 70M to 7M (max is set at 64M) for this unit
> test.
>

Can you check what kind of shrinking is requested by arc to begin
with? I imagine encountering a nullfs vnode may end up recycling 2
instead of 1, but even repeated a lot it does not explain the above.

>
> | On 4/19/22, Mateusz Guzik wrote:
> | > On 4/19/22, Mateusz Guzik wrote:
> | >> On 4/19/22, Doug Ambrisko wrote:
> | >>> I've switched my laptop to use nullfs and ZFS. Previously, I used
> | >>> localhost NFS mounts instead of nullfs when nullfs would complain
> | >>> that it couldn't mount. Since that check has been removed, I've
> | >>> switched to nullfs only. However, every so often my laptop would
> | >>> get slow and the ARC evict and prune threads would consume two
> | >>> cores at 100% until I rebooted. I had a 1G max. ARC and have
> | >>> increased it to 2G now. Looking into this has uncovered some issues:
> | >>> - nullfs would prevent vnlru_free_vfsops from doing anything
> | >>>   when called from ZFS arc_prune_task
> | >>> - nullfs would hang onto a bunch of vnodes unless mounted with
> | >>>   nocache
> | >>> - nullfs and nocache would break untar. This has been fixed now.
> | >>>
> | >>> With nullfs, nocache and setting max vnodes to a low number I can
> | >>> keep the ARC around the max. without evict and prune consuming
> | >>> 100% of 2 cores. This doesn't seem like the best solution but it's
> | >>> better than when the ARC starts spinning.
> | >>>
> | >>> Looking into this issue with bhyve and a md drive for testing, I
> | >>> create a brand new zpool mounted as /test and then nullfs mount
> | >>> /test to /mnt. I loop through untarring the Linux kernel into the
> | >>> nullfs mount, rm -rf it and repeat. I set the ARC to the smallest
> | >>> value I can. Untarring the Linux kernel was enough to get the ARC
> | >>> evict and prune to spin since they couldn't evict/prune anything.
> | >>>
> | >>> Looking at vnlru_free_vfsops called from ZFS arc_prune_task I see it
> | >>> static int
> | >>> vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
> | >>> {
> | >>> ...
> | >>>
> | >>>         for (;;) {
> | >>>         ...
> | >>>                 vp = TAILQ_NEXT(vp, v_vnodelist);
> | >>>         ...
> | >>>
> | >>>                 /*
> | >>>                  * Don't recycle if our vnode is from different type
> | >>>                  * of mount point.  Note that mp is type-safe, the
> | >>>                  * check does not reach unmapped address even if
> | >>>                  * vnode is reclaimed.
> | >>>                  */
> | >>>                 if (mnt_op != NULL && (mp = vp->v_mount) != NULL &&
> | >>>                     mp->mnt_op != mnt_op) {
> | >>>                         continue;
> | >>>                 }
> | >>> ...
> | >>>
> | >>> The vp ends up being the nullfs mount and then hits the continue
> | >>> even though the passed in mvp is on ZFS. If I do a hack to
> | >>> comment out the continue then I see the ARC, nullfs vnodes and
> | >>> ZFS vnodes grow.
> | >>> When the ARC calls arc_prune_task, which calls vnlru_free_vfsops,
> | >>> the vnodes now go down for nullfs and ZFS.
> | >>> The ARC cache usage also goes down. Then they increase again until
> | >>> the ARC gets full and then they go down again. So with this hack
> | >>> I don't need nocache passed to nullfs and I don't need to limit
> | >>> the max vnodes. Doing multiple untars in parallel over and over
> | >>> doesn't seem to cause any issues for this test. I'm not saying
> | >>> commenting out the continue is the fix, but it is a simple POC test.
> | >>>
> | >>
> | >> I don't see an easy way to say "this is a nullfs vnode holding onto a
> | >> zfs vnode". Perhaps the routine can be extended with issuing a nullfs
> | >> callback, if the module is loaded.
> | >>
> | >> In the meantime I think a good enough(tm) fix would be to check that
> | >> nothing was freed and fall back to good old regular clean up without
> | >> filtering by vfsops. This would be very similar to what you are doing
> | >> with your hack.
> | >>
> | >
> | > Now that I wrote this, perhaps an acceptable hack would be to extend
> | > struct mount with a pointer to the "lower layer" mount
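A rough sketch of that "good enough" fallback, assuming
vnlru_free_impl() reports how many vnodes it freed; the wrapper name is
made up for illustration and this is not the committed change:

        /* If the vfsops-filtered pass frees nothing -- e.g. the free list
         * is dominated by nullfs vnodes sitting on top of ZFS vnodes --
         * retry once without the filter. */
        static int
        vnlru_free_vfsops_with_fallback(int count, struct vfsops *mnt_op,
            struct vnode *mvp)
        {
                int freed;

                freed = vnlru_free_impl(count, mnt_op, mvp);    /* filtered */
                if (freed == 0 && mnt_op != NULL)
                        freed = vnlru_free_impl(count, NULL, mvp); /* unfiltered */
                return (freed);
        }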
Re: nullfs and ZFS issues
Quoting Doug Ambrisko (from Mon, 18 Apr 2022 16:32:38 -0700):

> With nullfs, nocache and setting max vnodes to a low number I can

Where is nocache documented? I don't see it in mount_nullfs(8),
mount(8) or nullfs(5).

I tried a nullfs mount with nocache and it doesn't show up in the
output of "mount".

Bye,
Alexander.

--
http://www.Leidinger.net  alexan...@leidinger.net : PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org    : PGP 0x8F31830F9F2772BF