'set but unused' breaks drm-*-kmod

2022-04-20 Thread Michael Butler

Seems this new requirement breaks kmod builds too ..

The first of many errors was (I stopped chasing them all for lack of time):


--- amdgpu_cs.o ---
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.7.19_3/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1210:26:
error: variable 'priority' set but not used [-Werror,-Wunused-but-set-variable]
        enum drm_sched_priority priority;
                                ^
1 error generated.
*** [amdgpu_cs.o] Error code 1
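For anyone hitting the same thing in other code: the warning fires on any
local that is assigned but never read, and -Werror makes it fatal.  A minimal
standalone reproducer and the usual fixes look roughly like this (everything
below is illustrative, not the amdgpu code):

/* unused.c -- minimal reproducer for -Wunused-but-set-variable.
 * Build with: cc -Wunused-but-set-variable -Werror -c unused.c
 * Names here are made up; this is not amdgpu_cs.c. */

static int
get_priority(void)
{
	return (2);
}

int
demo(void)
{
	int priority;		/* set but never read -> warning, fatal with -Werror */

	priority = get_priority();
	return (0);
}

/* Typical fixes: actually use the value, delete the variable, or
 * (on FreeBSD) mark it __unused if the assignment has to stay. */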



Re: nullfs and ZFS issues

2022-04-20 Thread Doug Ambrisko
On Wed, Apr 20, 2022 at 11:39:44AM +0200, Alexander Leidinger wrote:
| Quoting Doug Ambrisko  (from Mon, 18 Apr 2022 16:32:38 -0700):
| 
| > With nullfs, nocache and setting max vnodes to a low number I can
| 
| Where is nocache documented? I don't see it in mount_nullfs(8),  
| mount(8) or nullfs(5).

I didn't find it documented either, but it is handled in:
src/sys/fs/nullfs/null_vfsops.c:  if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0 ||

Also, some file systems disable nullfs caching on top of them via MNTK_NULL_NOCACHE.
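Since it isn't in the man pages yet, the source is the only reference.  For
anyone who wants to poke at it from C, here is a hedged sketch of mounting a
nullfs layer with nocache via nmount(2); the paths are made up and the option
names ("fstype", "fspath", "target", "nocache") are the ones mount_nullfs(8)
appears to pass through:

/* nullmount.c -- hedged sketch, not a reference implementation.
 * Mounts a nullfs layer of /test onto /mnt with the "nocache" option.
 * Needs root.  Build with: cc -o nullmount nullmount.c */
#include <sys/param.h>
#include <sys/uio.h>
#include <sys/mount.h>
#include <err.h>
#include <string.h>

static void
set_opt(struct iovec *iov, const char *name, const char *val)
{
	/* nmount(2) takes name/value pairs of iovecs, strings NUL-terminated */
	iov[0].iov_base = __DECONST(char *, name);
	iov[0].iov_len = strlen(name) + 1;
	iov[1].iov_base = __DECONST(char *, val);
	iov[1].iov_len = val != NULL ? strlen(val) + 1 : 0;
}

int
main(void)
{
	struct iovec iov[8];

	set_opt(&iov[0], "fstype", "nullfs");
	set_opt(&iov[2], "fspath", "/mnt");	/* where the layer appears */
	set_opt(&iov[4], "target", "/test");	/* lower (e.g. ZFS) directory */
	set_opt(&iov[6], "nocache", NULL);	/* presence alone is enough */

	if (nmount(iov, 8, 0) == -1)
		err(1, "nmount");
	return (0);
}

From the command line the equivalent should just be
"mount -t nullfs -o nocache /test /mnt".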

| I tried a nullfs mount with nocache and it doesn't show up in the  
| output of "mount".

Yep, I saw that as well.  I could tell by dropping into ddb, doing a
"show mount" on the FS and looking at the count.  That is why I added
the vnode count to mount -v, so I could see the usage without dropping
into ddb.

Doug A.



Re: nullfs and ZFS issues

2022-04-20 Thread Doug Ambrisko
On Wed, Apr 20, 2022 at 11:43:10AM +0200, Mateusz Guzik wrote:
| On 4/19/22, Doug Ambrisko  wrote:
| > On Tue, Apr 19, 2022 at 11:47:22AM +0200, Mateusz Guzik wrote:
| > | Try this: https://people.freebsd.org/~mjg/vnlru_free_pick.diff
| > |
| > | this is not committable but should validate whether it works fine
| >
| > As a POC it's working.  I see the vnode count for the nullfs and
| > ZFS go up.  The ARC cache also goes up until it exceeds the ARC max
| > size, then the vnodes for nullfs and ZFS go down.  The ARC cache goes
| > down as well.  This all repeats over and over.  The system seems
| > healthy.  No excessive running of arc_prune or arc_evict.
| >
| > My only comment is that the vnode freeing seems a bit aggressive.
| > Going from ~15,000 to ~200 vnodes for nullfs and the same for ZFS.
| > The ARC drops from 70M to 7M (max is set at 64M) for this unit
| > test.
| >
| 
| Can you check what kind of shrinking is requested by arc to begin
| with? I imagine encountering a nullfs vnode may end up recycling 2
| instead of 1, but even repeated a lot it does not explain the above.

I dug into it a bit more and think there could be a bug in
module/zfs/arc.c, arc_evict_meta_balanced(uint64_t meta_used):

	prune += zfs_arc_meta_prune;
	//arc_prune_async(prune);		/* original call */
	arc_prune_async(zfs_arc_meta_prune);	/* my change for testing */

Since arc_prune_async is queuing up a run of arc_prune_task for each
call, it is already accumulating the zfs_arc_meta_prune amount.
Passing the accumulated prune value makes the count handed to
vnlru_free_impl get really big quickly, since the caller is looping
via restart.

   1 HELLO arc_prune_task 164   ticks 2147465958 count 2048

dmesg | grep arc_prune_task | uniq -c
  14 HELLO arc_prune_task 164   ticks -2147343772 count 100
  50 HELLO arc_prune_task 164   ticks -2147343771 count 100
  46 HELLO arc_prune_task 164   ticks -2147343770 count 100
  49 HELLO arc_prune_task 164   ticks -2147343769 count 100
  44 HELLO arc_prune_task 164   ticks -2147343768 count 100
 116 HELLO arc_prune_task 164   ticks -2147343767 count 100
1541 HELLO arc_prune_task 164   ticks -2147343766 count 100
  53 HELLO arc_prune_task 164   ticks -2147343101 count 100
 100 HELLO arc_prune_task 164   ticks -2147343100 count 100
  75 HELLO arc_prune_task 164   ticks -2147343099 count 100
  52 HELLO arc_prune_task 164   ticks -2147343098 count 100
  50 HELLO arc_prune_task 164   ticks -2147343097 count 100
  51 HELLO arc_prune_task 164   ticks -2147343096 count 100
 783 HELLO arc_prune_task 164   ticks -2147343095 count 100
 884 HELLO arc_prune_task 164   ticks -2147343094 count 100

Note I shrunk vfs.zfs.arc.meta_prune=100 to see how that might
help.  Changing it to 1 helps more!  I see less aggressive
swings.

I added

	printf("HELLO %s %d   ticks %d count %ld\n",
	    __FUNCTION__, __LINE__, ticks, nr_scan);

to arc_prune_task.

Adjusting both
sysctl vfs.zfs.arc.meta_adjust_restarts=1
sysctl vfs.zfs.arc.meta_prune=100

without changing arc_prune_async(prune) helps avoid excessive swings.
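To make the arithmetic concrete, here is a small standalone sketch (not ZFS
code; the restart count and tunable value below are assumptions) of how the
total scan count requested from the prune tasks grows when the accumulated
prune value is passed on every restart versus just zfs_arc_meta_prune:

/* prune_sum.c -- standalone arithmetic sketch, not ZFS code.
 * Each "restart" of the meta-eviction loop queues one async prune task.
 * If the accumulated 'prune' is passed, task i is asked to scan
 * i * meta_prune vnodes, so the total requested grows quadratically
 * with the number of restarts; passing meta_prune grows only linearly.
 * Build with: cc -o prune_sum prune_sum.c */
#include <stdio.h>

int
main(void)
{
	const long meta_prune = 100;	/* cf. vfs.zfs.arc.meta_prune */
	const long restarts = 4096;	/* assumed number of restarts */
	long prune = 0, total_accumulated = 0, total_per_call = 0;

	for (long i = 0; i < restarts; i++) {
		prune += meta_prune;		/* what the loop accumulates */
		total_accumulated += prune;	/* arc_prune_async(prune) */
		total_per_call += meta_prune;	/* arc_prune_async(meta_prune) */
	}
	printf("accumulated argument: %ld vnodes requested\n", total_accumulated);
	printf("per-call argument:    %ld vnodes requested\n", total_per_call);
	return (0);
}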

Thanks,

Doug A.

| > | On 4/19/22, Mateusz Guzik  wrote:
| > | > On 4/19/22, Mateusz Guzik  wrote:
| > | >> On 4/19/22, Doug Ambrisko  wrote:
| > | >>> I've switched my laptop to use nullfs and ZFS.  Previously, I used
| > | >>> localhost NFS mounts instead of nullfs when nullfs would complain
| > | >>> that it couldn't mount.  Since that check has been removed, I've
| > | >>> switched to nullfs only.  However, every so often my laptop would
| > | >>> get slow and the ARC evict and prune thread would consume two
| > | >>> cores 100% until I rebooted.  I had a 1G max. ARC and have increased
| > | >>> it to 2G now.  Looking into this has uncovered some issues:
| > | >>>  - nullfs would prevent vnlru_free_vfsops from doing anything
| > | >>>    when called from ZFS arc_prune_task
| > | >>>  - nullfs would hang onto a bunch of vnodes unless mounted with
| > | >>>    nocache
| > | >>>  - nullfs and nocache would break untar.  This has been fixed now.
| > | >>>
| > | >>> With nullfs, nocache and setting max vnodes to a low number I can
| > | >>> keep the ARC around the max without evict and prune consuming
| > | >>> 100% of 2 cores.  This doesn't seem like the best solution but it's
| > | >>> better than when the ARC starts spinning.
| > | >>>
| > | >>> Looking into this issue with bhyve and a md drive for testing, I
| > | >>> create a brand new zpool mounted as /test and then nullfs mount
| > | >>> /test to /mnt.  I loop through untarring the Linux kernel into the
| > | >>> nullfs mount, rm -rf it and repeat.  I set the ARC to the smallest
| > | >>> value I can.  Untarring the Linux kernel was enough to get the ARC
| > | >>> evict and prune to spin since
| > | >>> they couldn't evict/prune anything.
| > | >>>
| > | >>> Looking at vnlru_free_vfsops called from ZFS arc_prune_task I see it
| > | >>>   static 

Re: nullfs and ZFS issues

2022-04-20 Thread Mateusz Guzik
On 4/19/22, Doug Ambrisko  wrote:
> On Tue, Apr 19, 2022 at 11:47:22AM +0200, Mateusz Guzik wrote:
> | Try this: https://people.freebsd.org/~mjg/vnlru_free_pick.diff
> |
> | this is not committable but should validate whether it works fine
>
> As a POC it's working.  I see the vnode count for the nullfs and
> ZFS go up.  The ARC cache also goes up until it exceeds the ARC max
> size, then the vnodes for nullfs and ZFS go down.  The ARC cache goes
> down as well.  This all repeats over and over.  The system seems
> healthy.  No excessive running of arc_prune or arc_evict.
>
> My only comment is that the vnode freeing seems a bit aggressive.
> Going from ~15,000 to ~200 vnodes for nullfs and the same for ZFS.
> The ARC drops from 70M to 7M (max is set at 64M) for this unit
> test.
>

Can you check what kind of shrinking is requested by arc to begin
with? I imagine encountering a nullfs vnode may end up recycling 2
instead of 1, but even repeated a lot it does not explain the above.

>
> | On 4/19/22, Mateusz Guzik  wrote:
> | > On 4/19/22, Mateusz Guzik  wrote:
> | >> On 4/19/22, Doug Ambrisko  wrote:
> | >>> I've switched my laptop to use nullfs and ZFS.  Previously, I used
> | >>> localhost NFS mounts instead of nullfs when nullfs would complain
> | >>> that it couldn't mount.  Since that check has been removed, I've
> | >>> switched to nullfs only.  However, every so often my laptop would
> | >>> get slow and the ARC evict and prune thread would consume two
> | >>> cores 100% until I rebooted.  I had a 1G max. ARC and have increased
> | >>> it to 2G now.  Looking into this has uncovered some issues:
> | >>>  -  nullfs would prevent vnlru_free_vfsops from doing anything
> | >>> when called from ZFS arc_prune_task
> | >>>  -  nullfs would hang onto a bunch of vnodes unless mounted with
> | >>> nocache
> | >>>  -  nullfs and nocache would break untar.  This has been fixed
> now.
> | >>>
> | >>> With nullfs, nocache and setting max vnodes to a low number I can
> | >>> keep the ARC around the max without evict and prune consuming
> | >>> 100% of 2 cores.  This doesn't seem like the best solution but it's
> | >>> better than when the ARC starts spinning.
> | >>>
> | >>> Looking into this issue with bhyve and a md drive for testing, I
> | >>> create a brand new zpool mounted as /test and then nullfs mount
> | >>> /test to /mnt.  I loop through untarring the Linux kernel into the
> | >>> nullfs mount, rm -rf it and repeat.  I set the ARC to the smallest
> | >>> value I can.  Untarring the Linux kernel was enough to get the ARC
> | >>> evict and prune to spin since
> | >>> they couldn't evict/prune anything.
> | >>>
> | >>> Looking at vnlru_free_vfsops called from ZFS arc_prune_task I see it
> | >>>   static int
> | >>>   vnlru_free_impl(int count, struct vfsops *mnt_op, struct vnode *mvp)
> | >>>   {
> | >>> ...
> | >>>
> | >>> for (;;) {
> | >>> ...
> | >>> vp = TAILQ_NEXT(vp, v_vnodelist);
> | >>> ...
> | >>>
> | >>> /*
> | >>>  * Don't recycle if our vnode is from different type
> | >>>  * of mount point.  Note that mp is type-safe, the
> | >>>  * check does not reach unmapped address even if
> | >>>  * vnode is reclaimed.
> | >>>  */
> | >>> if (mnt_op != NULL && (mp = vp->v_mount) != NULL &&
> | >>> mp->mnt_op != mnt_op) {
> | >>> continue;
> | >>> }
> | >>> ...
> | >>>
> | >>> The vp ends up belonging to the nullfs mount and then hits the continue
> | >>> even though the passed in mvp is on ZFS.  If I do a hack to
> | >>> comment out the continue then I see the ARC, nullfs vnodes and
> | >>> ZFS vnodes grow.  When the ARC calls arc_prune_task that calls
> | >>> vnlru_free_vfsops and now the vnodes go down for nullfs and ZFS.
> | >>> The ARC cache usage also goes down.  Then they increase again until
> | >>> the ARC gets full and then they go down again.  So with this hack
> | >>> I don't need nocache passed to nullfs and I don't need to limit
> | >>> the max vnodes.  Doing multiple untars in parallel over and over
> | >>> doesn't seem to cause any issues for this test.  I'm not saying
> | >>> commenting out continue is the fix but a simple POC test.
> | >>>
> | >>
> | >> I don't see an easy way to say "this is a nullfs vnode holding onto a
> | >> zfs vnode". Perhaps the routine can be extrended with issuing a nullfs
> | >> callback, if the module is loaded.
> | >>
> | >> In the meantime I think a good enough(tm) fix would be to check that
> | >> nothing was freed and fallback to good old regular clean up without
> | >> filtering by vfsops. This would be very similar to what you are doing
> | >> with your hack.
> | >>
> | >
> | > Now that I wrote this perhaps an acceptable hack would be to extend
> | > struct mount with a pointer to "lower layer" mount 

Re: nullfs and ZFS issues

2022-04-20 Thread Alexander Leidinger
Quoting Doug Ambrisko  (from Mon, 18 Apr 2022 16:32:38 -0700):



> With nullfs, nocache and setting max vnodes to a low number I can


Where is nocache documented? I don't see it in mount_nullfs(8),  
mount(8) or nullfs(5).


I tried a nullfs mount with nocache and it doesn't show up in the  
output of "mount".


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

