on 04/02/2014 19:32 Andriy Gapon said the following:
> I suggested the following diagnostic patch:
> http://people.freebsd.org/~avg/l2arc-b_tmp_cdata-diag.patch
> 
> With the patch applied there is the following panic (sorry for poor quality):
> http://oi59.tinypic.com/sfctvc.jpg
> 
> I am trying to get more details.

So, I've just tested the above patch myself and I have also caught a panic:
(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:234
#1  0xffffffff808ea45a in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:449
#2  0xffffffff808e9c73 in panic (fmt=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:637
#3  0xffffffff81d9216d in assfail (a=<value optimized out>, f=<value optimized
out>, l=<value optimized out>) at
/usr/src/sys/modules/opensolaris/../../cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
#4  0xffffffff81ae0385 in arc_release (buf=0xfffffe01420d24c8,
tag=0xfffffe004f4637e0) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3681
#5  0xffffffff81aed68a in dbuf_dirty (db=0xfffffe004f4637e0,
tx=0xfffffe03707e6000) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1181
#6  0xffffffff81aee68a in dbuf_will_dirty (db=0xfffffe004f4637e0,
tx=0xfffffe03707e6000) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1419
#7  0xffffffff81af4cfc in dmu_write (os=<value optimized out>, object=<value
optimized out>, offset=<value optimized out>, size=8, buf=0xffffff9de0a0f328,
tx=0xfffffe03707e6000)
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:822
#8  0xffffffff81b3f430 in spa_history_write (spa=0xffffff8cd9acf000,
buf=0xffffff9de0a0f328, len=8, shpp=0xfffffe004e1b4a00, tx=0xfffffe03707e6000)
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:169
#9  0xffffffff81b3fdb1 in spa_history_log_sync (arg=<value optimized out>,
tx=0xfffffe03707e6000) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:267
#10 0xffffffff81b3ff4c in log_internal (nvl=0xfffffe067862d700,
operation=0xffffffff81c1777c "snapshot", spa=0xffffff8cd9acf000,
tx=0xfffffe03707e6000, fmt=0xffffffff81c1777b "", adx=<value optimized out>)
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:465
#11 0xffffffff81b401c0 in spa_history_log_internal_ds (ds=0xfffffe0115de7c00,
operation=0xffffffff81c1777c "snapshot", tx=0xfffffe03707e6000,
fmt=0xffffffff81c1777b "")
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:513
#12 0xffffffff81b11b8a in dsl_dataset_snapshot_sync_impl (ds=0xfffffe0074fe8000,
snapname=0xfffffe012e0157c2 "1391545232610_xxx", tx=0xfffffe03707e6000)
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:1193
#13 0xffffffff81b140b3 in dsl_dataset_snapshot_sync (arg=<value optimized out>,
tx=0xfffffe03707e6000) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:1214
#14 0xffffffff81b1ace6 in dsl_sync_task_sync (dst=0xffffff9de178e550,
tx=0xfffffe03707e6000) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:170
#15 0xffffffff81b19f6b in dsl_pool_sync (dp=0xfffffe002f6ec000, txg=7020757) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:597
#16 0xffffffff81b37dc1 in spa_sync (spa=0xffffff8cd9acf000, txg=7020757) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6516
#17 0xffffffff81b46f38 in txg_sync_thread (arg=<value optimized out>) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:518
#18 0xffffffff808b866a in fork_exit (callout=0xffffffff81b46c00
<txg_sync_thread>, arg=0xfffffe002f6ec000, frame=0xffffff9de0a0fb00) at
/usr/src/sys/kern/kern_fork.c:990
#19 0xffffffff80ccef8e in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:606
#20 0x0000000000000000 in ?? ()
(kgdb) fr 4
#4  0xffffffff81ae0385 in arc_release (buf=0xfffffe01420d24c8,
tag=0xfffffe004f4637e0) at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3681
3681                    ASSERT(l2hdr->b_compress == ZIO_COMPRESS_OFF ||
(kgdb) list
3676                        hdr->b_size, 0);
3677                    /*
3678                     * We are destroying l2hdr, so ensure that
3679                     * its compressed buffer, if any, is not leaked.
3680                     */
3681                    ASSERT(l2hdr->b_compress == ZIO_COMPRESS_OFF ||
3682                        l2hdr->b_tmp_cdata == NULL);
3683                    kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t));
3684                    arc_space_return(sizeof (l2arc_buf_hdr_t),
ARC_SPACE_L2HDRS);
3685                    ARCSTAT_INCR(arcstat_l2_size, -buf_size);
(kgdb) p *l2hdr
$1 = {b_dev = 0xfffffe002f33e500, b_daddr = 402763776, b_compress =
ZIO_COMPRESS_LZ4, b_asize = 12288, b_tmp_cdata = 0xffffff8cf2c50000}
(kgdb) p *hdr
$2 = {b_dva = {dva_word = {0, 0}}, b_birth = 0, b_cksum0 = 0, b_freeze_lock =
{lock_object = {lo_name = 0xffffffff81c14ad4 "buf->b_freeze_lock", lo_flags =
40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1},
  b_freeze_cksum = 0x0, b_thawed = 0xfffffe012bd2b160, b_hash_next = 0x0, b_buf
= 0xfffffe01420d24c8, b_flags = 65632, b_datacnt = 1, b_acb = 0x0, b_cv =
{cv_description = 0xffffffff81c14ac8 "buf->b_cv)", cv_waiters = 0},
  b_type = ARC_BUFC_METADATA, b_size = 131072, b_spa = 1333901952715475381,
b_state = 0xffffffff81c2da00, b_arc_node = {list_next = 0x0, list_prev = 0x0},
b_arc_access = 0, b_refcnt = {rc_mtx = {lock_object = {
        lo_name = 0xffffffff81c18b87 "rcmtx", lo_flags = 16973824, lo_data = 0,
lo_witness = 0x0}, mtx_lock = 4}, rc_tracked = 0, rc_list = {list_size = 40,
list_offset = 0, list_head = {list_next = 0xfffffe007a267dc0,
        list_prev = 0xfffffe007a267dc0}}, rc_removed = {list_size = 40,
list_offset = 0, list_head = {list_next = 0xfffffe007a267de0, list_prev =
0xfffffe007a267de0}}, rc_count = 1, rc_removed_count = 0}, b_l2hdr = 0x0,
  b_l2node = {list_next = 0x0, list_prev = 0x0}}
(kgdb) p/x *hdr
$3 = {b_dva = {dva_word = {0x0, 0x0}}, b_birth = 0x0, b_cksum0 = 0x0,
b_freeze_lock = {lock_object = {lo_name = 0xffffffff81c14ad4, lo_flags =
0x2710000, lo_data = 0x0, lo_witness = 0x0}, sx_lock = 0x1}, b_freeze_cksum = 
0x0,
  b_thawed = 0xfffffe012bd2b160, b_hash_next = 0x0, b_buf = 0xfffffe01420d24c8,
b_flags = 0x10060, b_datacnt = 0x1, b_acb = 0x0, b_cv = {cv_description =
0xffffffff81c14ac8, cv_waiters = 0x0}, b_type = 0x1, b_size = 0x20000,
  b_spa = 0x1282f8c238ce49b5, b_state = 0xffffffff81c2da00, b_arc_node =
{list_next = 0x0, list_prev = 0x0}, b_arc_access = 0x0, b_refcnt = {rc_mtx =
{lock_object = {lo_name = 0xffffffff81c18b87, lo_flags = 0x1030000,
        lo_data = 0x0, lo_witness = 0x0}, mtx_lock = 0x4}, rc_tracked = 0x0,
rc_list = {list_size = 0x28, list_offset = 0x0, list_head = {list_next =
0xfffffe007a267dc0, list_prev = 0xfffffe007a267dc0}}, rc_removed = {
      list_size = 0x28, list_offset = 0x0, list_head = {list_next =
0xfffffe007a267de0, list_prev = 0xfffffe007a267de0}}, rc_count = 0x1,
rc_removed_count = 0x0}, b_l2hdr = 0x0, b_l2node = {list_next = 0x0, list_prev =
0x0}}
(kgdb) p hdr
$4 = (arc_buf_hdr_t *) 0xfffffe007a267cd0
(kgdb) p buf->b_hdr
$5 = (arc_buf_hdr_t *) 0xfffffe007a267cd0
(kgdb) info symbol hdr->b_state
ARC_anon in section .bss

b_flags value translates to ARC_L2CACHE | ARC_L2COMPRESS | ARC_L2_WRITING if I
am not mistaken.  b_compress is ZIO_COMPRESS_LZ4.
To me this looks like at least arc_release() can definitely be called
concurrently with l2arc_write_buffers() and both can be acting on the same
buffer.  If arc_release() "wins" then l2hdr->b_tmp_cdata is leaked.

-- 
Andriy Gapon
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to