Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics

Mark Millard Thu, 07 Sep 2023 16:33:08 -0700

On Sep 7, 2023, at 13:07, Alexander Motin <[email protected]> wrote:

> Thanks, Mark.
> 
> On 07.09.2023 15:40, Mark Millard wrote:
>> On Sep 7, 2023, at 11:48, Glen Barber <[email protected]> wrote:
>>> On Thu, Sep 07, 2023 at 11:17:22AM -0700, Mark Millard wrote:
>>>> When I next have time, should I retry based on a more recent
>>>> vintage of main that includes 969071be938c ?
>>> 
>>> Yes, please, if you can.
>> As stands, I rebooted that machine into my normal
>> enviroment, so the after-crash-with-dump-info
>> context is preserved. I'll presume lack of a need
>> to preserve that context unless I hear otherwise.
>> (But I'll work on this until later today.)
>> Even my normal environment predates the commit in
>> question by a few commits. So I'll end up doing a
>> more general round of updates overall.
>> Someone can let me know if there is a preference
>> for debug over non-debug for the next test run.
> 
> It is not unknown when some bugs disappear once debugging is enabled due to 
> different execution timings, but generally debug may to detect the problem 
> closer to its origin instead of looking on random consequences. I am only 
> starting to look on this report (unless Pawel or somebody beat me on it), and 
> don't have additional requests yet, but if you can repeat the same with debug 
> kernel (in-base ZFS's ZFS_DEBUG setting follows kernel's INVARIANTS), it may 
> give us some additional information.


So I did a zpool import, rewinding to the checkpoint.
(This depends on the questionable zfs doing fully as
desired for this. Notably the normal environment has
vfs.zfs.bclone_enabled=0 , including when it was
doing this activity.) My normal environment reported
no problems.

Note: the earlier snapshot from my first setup was
still in place since it was made just before the
original checkpoint used above.

However, the rewind did remove the /var/crash/
material that had been added.

I did the appropriate zfs mount.

I installed a debug kernel and world to the import. Again,
no problems reported.

I did the appropriate zfs umount.

I did the appropriate zpool export.

I rebooted with the test media.

# sysctl vfs.zfs.bclone_enabled
vfs.zfs.bclone_enabled: 1

# zpool trim -w zamd64

# zpool checkpoint zamd64

# uname -apKU
FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #74 
main-n265188-117c54a78ccd-dirty: Tue Sep  5 21:29:53 PDT 2023     
root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-dbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-DBG
 amd64 amd64 1500000 1500000

(So, before the 969071be938c vintage, same sources as for
my last run but a debug build.)

# poudriere bulk -jmain-amd64-bulk_a -a
. . .
[00:03:23] Building 34214 packages using up to 32 builders
[00:03:23] Hit CTRL+t at any time to see build progress and stats
[00:03:23] [01] [00:00:00] Builder starting
[00:04:19] [01] [00:00:56] Builder started
[00:04:20] [01] [00:00:01] Building ports-mgmt/pkg | pkg-1.20.6
[00:05:33] [01] [00:01:14] Finished ports-mgmt/pkg | pkg-1.20.6: Success
[00:05:53] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:05:53] [02] [00:00:00] Builder starting
. . .
[00:05:54] [32] [00:00:00] Builder starting
[00:06:11] [01] [00:00:18] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:06:12] [01] [00:00:00] Building devel/gettext-runtime | 
gettext-runtime-0.22_1
[00:08:24] [01] [00:02:12] Finished devel/gettext-runtime | 
gettext-runtime-0.22_1: Success
[00:08:31] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
[00:10:06] [05] [00:04:13] Builder started
[00:10:06] [05] [00:00:00] Building devel/autoconf-switch | 
autoconf-switch-20220527
[00:10:06] [31] [00:04:12] Builder started
[00:10:06] [31] [00:00:00] Building devel/libatomic_ops | libatomic_ops-7.8.0
. . .

Crashed again, with 158 *.pkg files in .building/All/ after
rebooting.

The crash is similar to the non-debug one. No extra output
from the debug build.

For reference:

Unread portion of the kernel message buffer:
panic: Solaris(panic): zfs: accessing past end of object 422/10b1c02 (size=2560 
access=2560+2560)
cpuid = 15
time = 1694127988
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02e783b5a0
vpanic() at vpanic+0x132/frame 0xfffffe02e783b6d0
panic() at panic+0x43/frame 0xfffffe02e783b730
vcmn_err() at vcmn_err+0xeb/frame 0xfffffe02e783b860
zfs_panic_recover() at zfs_panic_recover+0x59/frame 0xfffffe02e783b8c0
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0xb8/frame 
0xfffffe02e783b970
dmu_brt_clone() at dmu_brt_clone+0x61/frame 0xfffffe02e783b9f0
zfs_clone_range() at zfs_clone_range+0xaa3/frame 0xfffffe02e783bbc0
zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x18a/frame 
0xfffffe02e783bc40
vn_copy_file_range() at vn_copy_file_range+0x114/frame 0xfffffe02e783bce0
kern_copy_file_range() at kern_copy_file_range+0x36c/frame 0xfffffe02e783bdb0
sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe02e783be00
amd64_syscall() at amd64_syscall+0x14f/frame 0xfffffe02e783bf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02e783bf30
--- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0xa8a8d32b55a, rsp = 
0xa8a8c8fa158, rbp = 0xa8a8c8fa5f0 ---
KDB: enter: panic

__curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct 
pcpu,
(kgdb) #0  __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=0)
    at /usr/main-src/sys/kern/kern_shutdown.c:405
#2  0xffffffff804a505a in db_dump (dummy=<optimized out>, 
    dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /usr/main-src/sys/ddb/db_command.c:591
#3  0xffffffff804a4e5d in db_command (last_cmdp=<optimized out>, 
    cmd_table=<optimized out>, dopager=true)
    at /usr/main-src/sys/ddb/db_command.c:504
#4  0xffffffff804a4b1d in db_command_loop ()
    at /usr/main-src/sys/ddb/db_command.c:551
#5  0xffffffff804a81f6 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/main-src/sys/ddb/db_main.c:268
#6  0xffffffff80bb0793 in kdb_trap (type=type@entry=3, code=code@entry=0, 
    tf=tf@entry=0xfffffe02e783b4e0) at /usr/main-src/sys/kern/subr_kdb.c:790
#7  0xffffffff810623c9 in trap (frame=0xfffffe02e783b4e0)
    at /usr/main-src/sys/amd64/amd64/trap.c:608
#8  <signal handler called>
#9  kdb_enter (why=<optimized out>, msg=<optimized out>)
    at /usr/main-src/sys/kern/subr_kdb.c:556
#10 0xffffffff80b61b33 in vpanic (fmt=0xffffffff82c3914f "%s%s", 
    ap=ap@entry=0xfffffe02e783b710)
    at /usr/main-src/sys/kern/kern_shutdown.c:958
#11 0xffffffff80b61913 in panic (
    fmt=0xffffffff82369800 <cnputs_mtx> "W\346\025\201\377\377\377\377")
    at /usr/main-src/sys/kern/kern_shutdown.c:894
#12 0xffffffff8299464b in vcmn_err (ce=<optimized out>, 
    fmt=0xffffffff82c6f791 "zfs: accessing past end of object %llx/%llx 
(size=%u access=%llu+%llu)", adx=0xfffffe02e783b8a0)
    at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/spl/spl_cmn_err.c:60
#13 0xffffffff82ab1ab9 in zfs_panic_recover (
    fmt=0x12 <error: Cannot access memory at address 0x12>)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/spa_misc.c:1594
#14 0xffffffff82a10138 in dmu_buf_hold_array_by_dnode (dn=0xfffff80b9edd67f0, 
    offset=offset@entry=2560, length=length@entry=2560, read=read@entry=0, 
    tag=0xffffffff82c1c507, numbufsp=numbufsp@entry=0xfffffe02e783b9bc, 
    dbpp=0xfffffe02e783b9c0, flags=0)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:543
#15 0xffffffff82a13c91 in dmu_buf_hold_array (os=<optimized out>, 
    object=<optimized out>, read=0, numbufsp=0xfffffe02e783b9bc, 
    dbpp=0xfffffe02e783b9c0, offset=<optimized out>, length=<optimized out>, 
    tag=<optimized out>)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:654
#16 dmu_brt_clone (os=os@entry=0xfffff8011e989800, object=<optimized out>, 
    offset=offset@entry=2560, length=length@entry=2560, 
    tx=tx@entry=0xfffff81b68d0bb00, bps=bps@entry=0xfffffe04bcc60000, nbps=1, 
    replay=0) at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:2301
#17 0xffffffff82b7e393 in zfs_clone_range (inzp=0xfffff805a65f7000, 
    inoffp=0xfffff805c5e13548, outzp=0xfffff809d42c2cb0, 
    outoffp=0xfffff80134c184f8, lenp=lenp@entry=0xfffffe02e783bc00, 
    cr=0xfffff8041841f700)
    at /usr/main-src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1302
#18 0xffffffff829b77ca in zfs_freebsd_copy_file_range (ap=0xfffffe02e783bc58)
    at 
/usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
#19 0xffffffff80c74694 in VOP_COPY_FILE_RANGE (invp=<optimized out>, 
    inoffp=<optimized out>, outvp=0xffffffff81183b4f, 
    outvp@entry=0xfffffe02e783bce0, outoffp=0xffffffff812265a2, lenp=0x0, 
    lenp@entry=0xfffffe02e783bc70, flags=0, incred=0xfffff8041841f700, 
    outcred=0x0, fsizetd=0xfffffe021b5f4740) at ./vnode_if.h:2385
#20 vn_copy_file_range (invp=invp@entry=0xfffff80435eef8c0, 
    inoffp=inoffp@entry=0xfffff805c5e13548, outvp=0xffffffff81183b4f, 
    outvp@entry=0xfffff80ce8c8ee00, outoffp=0xffffffff812265a2, lenp=0x0, 
    lenp@entry=0xfffffe02e783bd40, flags=flags@entry=0, 
    incred=0xfffff8041841f700, outcred=0xfffff8041841f700, 
    fsize_td=0xfffffe021b5f4740) at /usr/main-src/sys/kern/vfs_vnops.c:3085
#21 0xffffffff80c6f22c in kern_copy_file_range (
    td=td@entry=0xfffffe021b5f4740, infd=<optimized out>, 
    inoffp=0xfffff805c5e13548, inoffp@entry=0x0, outfd=<optimized out>, 
    outoffp=0xffffffff812265a2, outoffp@entry=0x0, len=9223372036854775807, 
    flags=0) at /usr/main-src/sys/kern/vfs_syscalls.c:4971
#22 0xffffffff80c6f338 in sys_copy_file_range (td=0xfffffe021b5f4740, 
    uap=0xfffffe021b5f4b40) at /usr/main-src/sys/kern/vfs_syscalls.c:5009
#23 0xffffffff8106323f in syscallenter (td=0xfffffe021b5f4740)
    at /usr/main-src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#24 amd64_syscall (td=0xfffffe021b5f4740, traced=0)
    at /usr/main-src/sys/amd64/amd64/trap.c:1197
#25 <signal handler called>
#26 0x00000a8a8d32b55a in ?? ()
Backtrace stopped: Cannot access memory at address 0xa8a8c8fa158
(kgdb) 


>> Looking at "git: 969071be938c - main", the relevant
>> part seems to be just (white space possibly not
>> preserved accurately):
>> diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
>> index 9fb5aee6a023..4e4161ef1a7f 100644
>> --- a/sys/kern/vfs_vnops.c
>> +++ b/sys/kern/vfs_vnops.c
>> @@ -3076,12 +3076,14 @@ vn_copy_file_range(struct vnode *invp, off_t 
>> *inoffp, struct vnode *outvp,
>>   goto out;
>>     /*
>> - * If the two vnode are for the same file system, call
>> + * If the two vnodes are for the same file system type, call
>>   * VOP_COPY_FILE_RANGE(), otherwise call vn_generic_copy_file_range()
>> - * which can handle copies across multiple file systems.
>> + * which can handle copies across multiple file system types.
>>   */
>>   *lenp = len;
>> - if (invp->v_mount == outvp->v_mount)
>> + if (invp->v_mount == outvp->v_mount ||
>> +    strcmp(invp->v_mount->mnt_vfc->vfc_name,
>> +    outvp->v_mount->mnt_vfc->vfc_name) == 0)
>>   error = VOP_COPY_FILE_RANGE(invp, inoffp, outvp, outoffp,
>>      lenp, flags, incred, outcred, fsize_td);
>>   else
>> That looks to call VOP_COPY_FILE_RANGE in more contexts and
>> vn_generic_copy_file_range in fewer.
>> The backtrace I reported involves: VOP_COPY_FILE_RANGE
>> So it appears this change is unlikely to invalidate my
>> test result,  although failure might happen sooner if
>> more VOP_COPY_FILE_RANGE calls happen with the newer code.
> 
> Your logic is likely right, but if you have block cloning requests both 
> within and between datasets, this patch may change the pattern.  Though it is 
> obviously not a fix for the issue.  I responded to the commit email only 
> because it makes no difference while vfs.zfs.bclone_enabled is 0.
> 
>> That in turns means that someone may come up with some
>> other change for me to test by the time I get around to
>> setting up another test. Let me know if so.
> 
> -- 
> Alexander Motin


===
Mark Millard
marklmi at yahoo.com

Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics

Reply via email to