On 2023-08-22 18:59, Mateusz Guzik wrote:
On 8/22/23, Alexander Leidinger <alexan...@leidinger.net> wrote:
On 2023-08-21 10:53, Konstantin Belousov wrote:
On Mon, Aug 21, 2023 at 08:19:28AM +0200, Alexander Leidinger wrote:
On 2023-08-20 23:17, Konstantin Belousov wrote:
> On Sun, Aug 20, 2023 at 11:07:08PM +0200, Mateusz Guzik wrote:
> > On 8/20/23, Alexander Leidinger <alexan...@leidinger.net> wrote:
> > > On 2023-08-20 22:02, Mateusz Guzik wrote:
> > >> On 8/20/23, Alexander Leidinger <alexan...@leidinger.net> wrote:
> > >>> On 2023-08-20 19:10, Mateusz Guzik wrote:
> > >>>> On 8/18/23, Alexander Leidinger <alexan...@leidinger.net>
> > >>>> wrote:
> > >>>
> > >>>>> I have a 51MB text file, compressed to about 1MB. Are you
> > >>>>> interested in getting it?
> > >>>>>
> > >>>>
> > >>>> Your problem is not the vnode limit, but nullfs.
> > >>>>
> > >>>> https://people.freebsd.org/~mjg/netchild-periodic-find.svg
> > >>>
> > >>> 122 nullfs mounts on this system. And every jail I set up has
> > >>> several null mounts. One basesystem mounted into every jail, and
> > >>> then shared ports (packages/distfiles/ccache) across all of them.
> > >>>
> > >>>> First, some of the contention is the notorious VI_LOCK in order
> > >>>> to do anything.
> > >>>>
> > >>>> But more importantly the mind-boggling off-cpu time comes from
> > >>>> exclusive locking which should not be there to begin with -- as in
> > >>>> that xlock in stat should be a slock.
> > >>>>
> > >>>> Maybe I'm going to look into it later.
> > >>>
> > >>> That would be fantastic.
> > >>>
> > >>
> > >> I did a quick test; things are shared-locked as expected.
> > >>
> > >> However, I found the following:
> > >>         if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
> > >>                 mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
> > >>                     (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED |
> > >>                     MNTK_EXTENDED_SHARED);
> > >>         }
> > >>
> > >> are you using the "nocache" option? it has a side effect of
> > >> xlocking
> > >
> > > I use noatime, noexec, nosuid, nfsv4acls. I do NOT use nocache.
> > >
> >
> > If you don't have "nocache" on null mounts, then I don't see how this
> > could happen.
>
> There is also MNTK_NULL_NOCACHE on lower fs, which is currently set for
> fuse and nfs at least.

11 of those 122 nullfs mounts are ZFS datasets which are also NFS
exported. 6 of those nullfs mounts are also exported via Samba. The NFS
exports shouldn't be needed anymore; I will remove them.
By nfs I meant nfs client, not nfs exports.

No NFS client mounts anywhere on this system. So where is this exclusive
lock coming from then...

This is a ZFS system, with two pools: one for the root, one for anything
I need space for. Both pools reside on the same disks. The root pool is a
3-way mirror, the "space-pool" is a 5-disk raidz2. All jails are on the
space-pool. The jails are all basejail-style jails.
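For illustration, the null mounts of a jail look roughly like this in its
fstab (the paths are made up here, options as mentioned earlier):
---snip---
/space/basejail         /space/jails/j1/basejail             nullfs  noatime,noexec,nosuid,nfsv4acls  0 0
/space/ports/distfiles  /space/jails/j1/usr/ports/distfiles  nullfs  noatime,noexec,nosuid,nfsv4acls  0 0
---snip---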


While I don't see why xlocking happens, you should be able to dtrace
or printf your way into finding out.

dtrace looks to me like a faster way to get to the root cause than
printf. My first naive try is to detect exclusive locks. I'm not 100%
sure I got it right, but at least dtrace doesn't complain about it:
---snip---
#pragma D option dynvarsize=32m

/* fires when an exclusive lock (0x080000 == LK_EXCLUSIVE) is requested */
fbt:nullfs:null_lock:entry
/(args[0]->a_flags & 0x080000) != 0/
{
        stack();
}
---snip---
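
To capture this during the periodic run I would start it along these
lines (the script name is just an example):
---snip---
# dtrace -s nullfs-xlock.d -o /var/tmp/nullfs-xlock.out
---snip---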

In which direction should I look with dtrace if this fires in tonight's
run of periodic? I don't have enough knowledge about VFS to come up with
immediate ideas.
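
If the probe fires, one idea (just an untested sketch, still assuming
0x080000 is LK_EXCLUSIVE) would be to aggregate the stacks instead of
printing each one, so the most frequent exclusive-lock paths stand out:
---snip---
fbt:nullfs:null_lock:entry
/(args[0]->a_flags & 0x080000) != 0/
{
        /* count per process name and kernel stack */
        @xlocks[execname, stack()] = count();
}

END
{
        /* show only the 20 most frequent stacks */
        trunc(@xlocks, 20);
        printa(@xlocks);
}
---snip---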

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF
