On 2/4/14, 10:15 PM, Andriy Gapon wrote:
> on 04/02/2014 19:32 Andriy Gapon said the following:
>> I suggested the following diagnostic patch:
>> http://people.freebsd.org/~avg/l2arc-b_tmp_cdata-diag.patch
>>
>> With the patch applied there is the following panic (sorry for poor quality):
>> http://oi59.tinypic.com/sfctvc.jpg
>>
>> I am trying to get more details.
> 
> So, I've just tested the above patch myself and I have also caught a panic:
> (kgdb) bt
> #0  doadump (textdump=1) at pcpu.h:234
> #1  0xffffffff808ea45a in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:449
> #2  0xffffffff808e9c73 in panic (fmt=<value optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:637
> #3  0xffffffff81d9216d in assfail (a=<value optimized out>, f=<value optimized
> out>, l=<value optimized out>) at
> /usr/src/sys/modules/opensolaris/../../cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
> #4  0xffffffff81ae0385 in arc_release (buf=0xfffffe01420d24c8,
> tag=0xfffffe004f4637e0) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3681
> #5  0xffffffff81aed68a in dbuf_dirty (db=0xfffffe004f4637e0,
> tx=0xfffffe03707e6000) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1181
> #6  0xffffffff81aee68a in dbuf_will_dirty (db=0xfffffe004f4637e0,
> tx=0xfffffe03707e6000) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1419
> #7  0xffffffff81af4cfc in dmu_write (os=<value optimized out>, object=<value
> optimized out>, offset=<value optimized out>, size=8, buf=0xffffff9de0a0f328,
> tx=0xfffffe03707e6000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:822
> #8  0xffffffff81b3f430 in spa_history_write (spa=0xffffff8cd9acf000,
> buf=0xffffff9de0a0f328, len=8, shpp=0xfffffe004e1b4a00, tx=0xfffffe03707e6000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:169
> #9  0xffffffff81b3fdb1 in spa_history_log_sync (arg=<value optimized out>,
> tx=0xfffffe03707e6000) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:267
> #10 0xffffffff81b3ff4c in log_internal (nvl=0xfffffe067862d700,
> operation=0xffffffff81c1777c "snapshot", spa=0xffffff8cd9acf000,
> tx=0xfffffe03707e6000, fmt=0xffffffff81c1777b "", adx=<value optimized out>)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:465
> #11 0xffffffff81b401c0 in spa_history_log_internal_ds (ds=0xfffffe0115de7c00,
> operation=0xffffffff81c1777c "snapshot", tx=0xfffffe03707e6000,
> fmt=0xffffffff81c1777b "")
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c:513
> #12 0xffffffff81b11b8a in dsl_dataset_snapshot_sync_impl 
> (ds=0xfffffe0074fe8000,
> snapname=0xfffffe012e0157c2 "1391545232610_xxx", tx=0xfffffe03707e6000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:1193
> #13 0xffffffff81b140b3 in dsl_dataset_snapshot_sync (arg=<value optimized 
> out>,
> tx=0xfffffe03707e6000) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:1214
> #14 0xffffffff81b1ace6 in dsl_sync_task_sync (dst=0xffffff9de178e550,
> tx=0xfffffe03707e6000) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:170
> #15 0xffffffff81b19f6b in dsl_pool_sync (dp=0xfffffe002f6ec000, txg=7020757) 
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:597
> #16 0xffffffff81b37dc1 in spa_sync (spa=0xffffff8cd9acf000, txg=7020757) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6516
> #17 0xffffffff81b46f38 in txg_sync_thread (arg=<value optimized out>) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:518
> #18 0xffffffff808b866a in fork_exit (callout=0xffffffff81b46c00
> <txg_sync_thread>, arg=0xfffffe002f6ec000, frame=0xffffff9de0a0fb00) at
> /usr/src/sys/kern/kern_fork.c:990
> #19 0xffffffff80ccef8e in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:606
> #20 0x0000000000000000 in ?? ()
> (kgdb) fr 4
> #4  0xffffffff81ae0385 in arc_release (buf=0xfffffe01420d24c8,
> tag=0xfffffe004f4637e0) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3681
> 3681                    ASSERT(l2hdr->b_compress == ZIO_COMPRESS_OFF ||
> (kgdb) list
> 3676                        hdr->b_size, 0);
> 3677                    /*
> 3678                     * We are destroying l2hdr, so ensure that
> 3679                     * its compressed buffer, if any, is not leaked.
> 3680                     */
> 3681                    ASSERT(l2hdr->b_compress == ZIO_COMPRESS_OFF ||
> 3682                        l2hdr->b_tmp_cdata == NULL);
> 3683                    kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t));
> 3684                    arc_space_return(sizeof (l2arc_buf_hdr_t),
> ARC_SPACE_L2HDRS);
> 3685                    ARCSTAT_INCR(arcstat_l2_size, -buf_size);
> (kgdb) p *l2hdr
> $1 = {b_dev = 0xfffffe002f33e500, b_daddr = 402763776, b_compress =
> ZIO_COMPRESS_LZ4, b_asize = 12288, b_tmp_cdata = 0xffffff8cf2c50000}
> (kgdb) p *hdr
> $2 = {b_dva = {dva_word = {0, 0}}, b_birth = 0, b_cksum0 = 0, b_freeze_lock =
> {lock_object = {lo_name = 0xffffffff81c14ad4 "buf->b_freeze_lock", lo_flags =
> 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1},
>   b_freeze_cksum = 0x0, b_thawed = 0xfffffe012bd2b160, b_hash_next = 0x0, 
> b_buf
> = 0xfffffe01420d24c8, b_flags = 65632, b_datacnt = 1, b_acb = 0x0, b_cv =
> {cv_description = 0xffffffff81c14ac8 "buf->b_cv)", cv_waiters = 0},
>   b_type = ARC_BUFC_METADATA, b_size = 131072, b_spa = 1333901952715475381,
> b_state = 0xffffffff81c2da00, b_arc_node = {list_next = 0x0, list_prev = 0x0},
> b_arc_access = 0, b_refcnt = {rc_mtx = {lock_object = {
>         lo_name = 0xffffffff81c18b87 "rcmtx", lo_flags = 16973824, lo_data = 
> 0,
> lo_witness = 0x0}, mtx_lock = 4}, rc_tracked = 0, rc_list = {list_size = 40,
> list_offset = 0, list_head = {list_next = 0xfffffe007a267dc0,
>         list_prev = 0xfffffe007a267dc0}}, rc_removed = {list_size = 40,
> list_offset = 0, list_head = {list_next = 0xfffffe007a267de0, list_prev =
> 0xfffffe007a267de0}}, rc_count = 1, rc_removed_count = 0}, b_l2hdr = 0x0,
>   b_l2node = {list_next = 0x0, list_prev = 0x0}}
> (kgdb) p/x *hdr
> $3 = {b_dva = {dva_word = {0x0, 0x0}}, b_birth = 0x0, b_cksum0 = 0x0,
> b_freeze_lock = {lock_object = {lo_name = 0xffffffff81c14ad4, lo_flags =
> 0x2710000, lo_data = 0x0, lo_witness = 0x0}, sx_lock = 0x1}, b_freeze_cksum = 
> 0x0,
>   b_thawed = 0xfffffe012bd2b160, b_hash_next = 0x0, b_buf = 
> 0xfffffe01420d24c8,
> b_flags = 0x10060, b_datacnt = 0x1, b_acb = 0x0, b_cv = {cv_description =
> 0xffffffff81c14ac8, cv_waiters = 0x0}, b_type = 0x1, b_size = 0x20000,
>   b_spa = 0x1282f8c238ce49b5, b_state = 0xffffffff81c2da00, b_arc_node =
> {list_next = 0x0, list_prev = 0x0}, b_arc_access = 0x0, b_refcnt = {rc_mtx =
> {lock_object = {lo_name = 0xffffffff81c18b87, lo_flags = 0x1030000,
>         lo_data = 0x0, lo_witness = 0x0}, mtx_lock = 0x4}, rc_tracked = 0x0,
> rc_list = {list_size = 0x28, list_offset = 0x0, list_head = {list_next =
> 0xfffffe007a267dc0, list_prev = 0xfffffe007a267dc0}}, rc_removed = {
>       list_size = 0x28, list_offset = 0x0, list_head = {list_next =
> 0xfffffe007a267de0, list_prev = 0xfffffe007a267de0}}, rc_count = 0x1,
> rc_removed_count = 0x0}, b_l2hdr = 0x0, b_l2node = {list_next = 0x0, 
> list_prev =
> 0x0}}
> (kgdb) p hdr
> $4 = (arc_buf_hdr_t *) 0xfffffe007a267cd0
> (kgdb) p buf->b_hdr
> $5 = (arc_buf_hdr_t *) 0xfffffe007a267cd0
> (kgdb) info symbol hdr->b_state
> ARC_anon in section .bss
> 
> b_flags value translates to ARC_L2CACHE | ARC_L2COMPRESS | ARC_L2_WRITING if I
> am not mistaken.  b_compress is ZIO_COMPRESS_LZ4.
> To me this looks like at least arc_release() can definitely be called
> concurrently with l2arc_write_buffers() and both can be acting on the same
> buffer.  If arc_release() "wins" then l2hdr->b_tmp_cdata is leaked.
> 

Ah, now I understand what's going on. The application is dirtying a dbuf
whose backing ARC buffer is currently in flight to the L2ARC, so the
buffer needs to be discarded. Instead of going through arc_evict, it
calls arc_release directly to dispose of the buffer, but arc_release
doesn't clear the in-flight compressed data.

So the solution seems obvious: if there's an in-flight compressed
buffer, call l2arc_release_cdata_buf to clear it.

-- 
Saso
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer