On Fri, Aug 15, 2025 at 5:58 PM Konstantin Belousov <kostik...@gmail.com> wrote:
>
> On Fri, Aug 15, 2025 at 05:41:26PM -0700, Rick Macklem wrote:
> > On Fri, Aug 15, 2025 at 5:35 PM Konstantin Belousov <kostik...@gmail.com> 
> > wrote:
> > >
> > > On Fri, Aug 15, 2025 at 05:26:21PM -0700, Rick Macklem wrote:
> > > > On Fri, Aug 15, 2025 at 5:07 PM Konstantin Belousov 
> > > > <kostik...@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 15, 2025 at 04:51:00PM -0700, Bakul Shah wrote:
> > > > > > On Aug 15, 2025, at 3:51 PM, Konstantin Belousov 
> > > > > > <kostik...@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 15, 2025 at 11:19:55AM -0700, Bakul Shah wrote:
> > > > > > >> Is this a known bug or may be something specific on my machine?
> > > > > > >> If the latter, any way to "fsck" it? FYI, the zpool is a mirror
> > > > > > >> (two files on the host via nvme). built from c992ac621327 commit 
> > > > > > >> hash
> > > > > > >> (which has other issues but they seem to be separate from this).
> > > > > > >> I saw the same panic when I booted from a day old snapshot.
> > > > > > >>
> > > > > > >> Note that "ls /.zfs" panics but "ls /.zfs/snapshot" doesn't!
> > > > > > >>
> > > > > > >> This is on a -current VM:
> > > > > > >>
> > > > > > >> root@:/ # ls .zfs
> > > > > > >> VNASSERT failed: oresid == 0 || nresid != oresid || 
> > > > > > >> *(a)->a_eofflag == 1 not true at vnode_if.c:1824 
> > > > > > >> (VOP_READDIR_APV)
> > > > > > >
> > > > > > > Try this, untested.
> > > > > >
> > > > > > Thanks for the quick patch! But I am afraid it didn't help. Let me 
> > > > > > know if you
> > > > > > want me to check things via gdb. [I have filed
> > > > > >       https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288889
> > > > > > so we can continue debugging there]
> > > > > >
> > > > > > On the console (single user, RO root):
> > > > > > # ls /.zfs
> > > > > > VNASSERT failed: oresid == 0 || nresid != oresid || *(a)->a_eofflag 
> > > > > > == 1 not true at vnode_if.c:1824 (VOP_READDIR_APV)
> > > > > > 0xfffff800059546e0: type VDIR state VSTATE_CONSTRUCTED op 
> > > > > > 0xffffffff8272cfd0
> > > > > >     usecount 1, writecount 0, refcount 1 seqc users 0 mountedhere 0
> > > > > >     hold count flags ()
> > > > > >     flags ()
> > > > > >     lock type zfs: SHARED (count 1)
> > > > > >         name = .zfs
> > > > > >         parent_id = 0
> > > > > >         id = 1
> > > > > > panic: VOP_READDIR: eofflag not set
> > > > > > cpuid = 0
> > > > > > time = 1755276357
> > > > > > KDB: stack backtrace:
> > > > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
> > > > > > 0xfffffe0053f83af0
> > > > > > vpanic() at vpanic+0x136/frame 0xfffffe0053f83c20
> > > > > > panic() at panic+0x43/frame 0xfffffe0053f83c80
> > > > > > VOP_READDIR_APV() at VOP_READDIR_APV+0x205/frame 0xfffffe0053f83cd0
> > > > > > kern_getdirentries() at kern_getdirentries+0x228/frame 
> > > > > > 0xfffffe0053f83dd0
> > > > > > sys_getdirentries() at sys_getdirentries+0x29/frame 
> > > > > > 0xfffffe0053f83e00
> > > > > > amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0053f83f30
> > > > > > fast_syscall_common() at fast_syscall_common+0xf8/frame 
> > > > > > 0xfffffe0053f83f30
> > > > > > --- syscall (554, FreeBSD ELF64, getdirentries), rip = 
> > > > > > 0x331339f976aa, rsp = 0x33133631ade8, rbp = 0x33133631ae20 ---
> > > > > > KDB: enter: panic
> > > > > > [ thread pid 23 tid 100211 ]
> > > > > > Stopped at      kdb_enter+0x33: movq    $0,0x12313e2(%rip)
> > > > > > db>
> > > > > >
> > > > > > Running gdb on the host (attached to tcp port):
> > > > > > #16 0xffffffff80b7992b in vpanic (
> > > > > >     fmt=0xffffffff812ddf30 "VOP_READDIR: eofflag not set",
> > > > > >     ap=ap@entry=0xfffffe0053f83c60)
> > > > > >     at /home/FreeBSD/current/sys/kern/kern_shutdown.c:962
> > > > > > #17 0xffffffff80b79793 in panic (
> > > > > >     fmt=0xffffffff81d9eab0 <cnputs_mtx> 
> > > > > > "\304\372\032\201\377\377\377\377")
> > > > > >     at /home/FreeBSD/current/sys/kern/kern_shutdown.c:887
> > > > > > #18 0xffffffff81195fd5 in VOP_READDIR_APV (vop=<optimized out>,
> > > > > >     a=a@entry=0xfffffe0053f83d30) at vnode_if.c:1824
> > > > > > #19 0xffffffff80c95e58 in VOP_READDIR (vp=0xfffff800059546e0,
> > > > > >     uio=0xfffffe0053f83d00, cred=<optimized out>, 
> > > > > > eofflag=0xfffffe0053f83d6c,
> > > > > >     ncookies=0x0, cookies=0x0) at ./vnode_if.h:972
> > > > > From this frame, do
> > > > > p *vp
> > > > > and
> > > > > p *(vp->v_op)
> > > > > I am mostly interested what is the .vop_readdir fp points to.
> > > > I think the problem is that, for this case, ZFS replies with eofflag
> > > > == -1 instead
> > > > of 1. (I don't know if you want to change the ASSERT or try to fix ZFS
> > > > to not do this?)
> > >
> > > Where do you see it? I mean the '-1' set to *eofp.
> > I saw it in a printf() after VOP_READDIR(). However, a subsequent
> > test showed 0.
> > --> The first time I was printing out for non-ZFS, so there was a fair 
> > amount
> >       of other printf()s being logged. Then I limited it to ZFS.
> >
> > Anyhow, ZFS seems to get eof wrong when it is already at eof.
> >
> > I'll take a look at the ZFS code, rick
> >
>
> Below is what I gathered so far.
>
> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c 
> b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> index 61d0bb26d1e5..c191b2abf6cc 100644
> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c
> @@ -674,6 +674,7 @@ zfsctl_root_readdir(struct vop_readdir_args *ap)
>         zfs_uio_t uio;
>         int *eofp = ap->a_eofflag;
>         off_t dots_offset;
> +       ssize_t orig_resid;
>         int error;
>
>         zfs_uio_init(&uio, ap->a_uio);
> @@ -688,14 +689,20 @@ zfsctl_root_readdir(struct vop_readdir_args *ap)
>          * count to return is 0.
>          */
>         if (zfs_uio_offset(&uio) == 3 * sizeof (entry)) {
> +               if (eofp != NULL)
> +                       *eofp = 1;
Yea, I was just going to try the same thing. I think it probably fixes
the assert.
(Of course, it might take a while to get this in ZFS, so you might have to drop
the assert until it downstreams from OpenZFS?)

>                 return (0);
>         }
>
> +       orig_resid = zfs_uio_resid(&uio);
>         error = sfs_readdir_common(zfsvfs->z_root, ZFSCTL_INO_ROOT, ap, &uio,
>             &dots_offset);
>         if (error != 0) {
> -               if (error == ENAMETOOLONG) /* ran out of destination space */
> +               if (error == ENAMETOOLONG) { /* ran out of destination space 
> */
>                         error = 0;
> +                       if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL)
> +                               *eofp = 1;
If it hasn't filled in an entry, shouldn't it be an error return and not
setting "error = 0;"?

> +               }
>                 return (error);
>         }
>         if (zfs_uio_offset(&uio) != dots_offset)
> @@ -710,8 +717,11 @@ zfsctl_root_readdir(struct vop_readdir_args *ap)
>         entry.d_reclen = sizeof (entry);
>         error = vfs_read_dirent(ap, &entry, zfs_uio_offset(&uio));
>         if (error != 0) {
> -               if (error == ENAMETOOLONG)
> +               if (error == ENAMETOOLONG) {
>                         error = 0;
> +                       if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL)
> +                               *eofp = 1;
Ditto above.
> +               }
>                 return (SET_ERROR(error));
>         }
>         if (eofp != NULL)
> @@ -1056,17 +1066,22 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>         zfs_uio_t uio;
>         int *eofp = ap->a_eofflag;
>         off_t dots_offset;
> +       ssize_t orig_resid;
>         int error;
>
>         zfs_uio_init(&uio, ap->a_uio);
> +       orig_resid = zfs_uio_resid(&uio);
>
>         ASSERT3S(vp->v_type, ==, VDIR);
>
>         error = sfs_readdir_common(ZFSCTL_INO_ROOT, ZFSCTL_INO_SNAPDIR, ap,
>             &uio, &dots_offset);
>         if (error != 0) {
> -               if (error == ENAMETOOLONG) /* ran out of destination space */
> +               if (error == ENAMETOOLONG) { /* ran out of destination space 
> */
>                         error = 0;
> +                       if (orig_resid == zfs_uio_resid(&uio) && eofp != NULL)
> +                               *eofp = 1;
> +               }
Ditto again.
>                 return (error);
>         }
>
> @@ -1084,7 +1099,8 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>                 dsl_pool_config_exit(dmu_objset_pool(zfsvfs->z_os), FTAG);
>                 if (error != 0) {
>                         if (error == ENOENT) {
> -                               if (eofp != NULL)
> +                               if (orig_resid == zfs_uio_resid(&uio) &&
> +                                   eofp != NULL)
>                                         *eofp = 1;
>                                 error = 0;
>                         }
> @@ -1099,8 +1115,12 @@ zfsctl_snapdir_readdir(struct vop_readdir_args *ap)
>                 entry.d_reclen = sizeof (entry);
>                 error = vfs_read_dirent(ap, &entry, zfs_uio_offset(&uio));
>                 if (error != 0) {
> -                       if (error == ENAMETOOLONG)
> +                       if (error == ENAMETOOLONG) {
>                                 error = 0;
> +                               if (orig_resid == zfs_uio_resid(&uio) &&
> +                                   eofp != NULL)
> +                                       *eofp = 1;
Ditto again.
> +                       }
>                         zfs_exit(zfsvfs, FTAG);
>                         return (SET_ERROR(error));
>                 }
> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c 
> b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> index 1813c411b013..79e808a3ab3d 100644
> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> @@ -1695,6 +1695,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, 
> int *eofp,
>         objset_t        *os;
>         caddr_t         outbuf;
>         size_t          bufsize;
> +       ssize_t         orig_resid;
>         zap_cursor_t    zc;
>         zap_attribute_t *zap;
>         uint_t          bytes_wanted;
> @@ -1735,7 +1736,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, 
> int *eofp,
>         /*
>          * Quit if directory has been removed (posix)
>          */
> -       if ((*eofp = zp->z_unlinked) != 0) {
> +       if ((*eofp = (zp->z_unlinked != 0)) != 0) {
>                 zfs_exit(zfsvfs, FTAG);
>                 return (0);
>         }
> @@ -1743,6 +1744,7 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, 
> int *eofp,
>         error = 0;
>         os = zfsvfs->z_os;
>         offset = zfs_uio_offset(uio);
> +       orig_resid = zfs_uio_resid(uio);
>         prefetch = zp->z_zn_prefetch;
>         zap = zap_attribute_long_alloc();
>
> @@ -1933,6 +1935,8 @@ zfs_readdir(vnode_t *vp, zfs_uio_t *uio, cred_t *cr, 
> int *eofp,
>                 *cookies = NULL;
>                 *ncookies = 0;
>         }
> +       if (error == 0 && zfs_uio_resid(uio) != orig_resid)
> +               *eofp = 1;
I looked at this earlier and I don't this is needed. When it breaks out
of the loop, it has set eof if it got ENOENT from zap_cursor_retrieve(),
which I think is the only case that will return 0 and not an error?
(But it looks harmless to me.)

>         return (error);
>  }
>
>

Reply via email to