Fix looks good to me.  We made this change internally at Delphix a few
months back and we'll push it to illumos within the next few months.  We
are nearing the end of a release so we haven't had as much time to push
stuff to illumos.  If someone wants to push this before we get to it,
please feel free.

(Chris Siden: Delphix bug is 28702.)

Here is my analysis:

assertion failed: zb->zb_object <= td->td_resume->zb_object (0xffffffffffffffff
<= 0xfffffffffffffffe), file: ../../common/fs/zfs/dmu_traverse.c, line: 166

ffffff0008f32440 genunix:strlog+0 ()
ffffff0008f32490 zfs:resume_skip_check+d8 ()
ffffff0008f32570 zfs:traverse_visitbp+57 ()
ffffff0008f32600 zfs:traverse_dnode+8b ()
ffffff0008f326e0 zfs:traverse_visitbp+805 ()
ffffff0008f32820 zfs:traverse_impl+1bf ()
ffffff0008f32880 zfs:traverse_dataset_destroyed+49 ()
ffffff0008f32a00 zfs:bptree_iterate+1b9 ()
ffffff0008f32a70 zfs:dsl_scan_sync+401 ()
ffffff0008f32b50 zfs:spa_sync+344 ()
ffffff0008f32c20 zfs:txg_sync_thread+260 ()
ffffff0008f32c30 unix:thread_start+8 ()

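The problem is that traverse_visitbp() visits the USERUSED and GROUPUSED
objects in the wrong order. The resume check expects object IDs to be visited
in increasing order, so the two should be reversed, because
DMU_GROUPUSED_OBJECT (-2ULL) < DMU_USERUSED_OBJECT (-1ULL).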
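To make the ordering concrete, here is a minimal standalone sketch (not the
actual dmu_traverse.c code). The two object ID values match the assertion
output above; order_is_safe(), old_order and new_order are hypothetical names
I made up for illustration, and the loop only imitates the "object <= resume
object" check rather than the full resume_skip_check() logic.

```c
#include <assert.h>
#include <stdint.h>

/* Special object IDs; as uint64_t these are 0xff..ff and 0xff..fe. */
#define	DMU_USERUSED_OBJECT	(-1ULL)
#define	DMU_GROUPUSED_OBJECT	(-2ULL)

/* Visit order before the fix and after the fix (illustrative only). */
const uint64_t old_order[2] = { DMU_USERUSED_OBJECT, DMU_GROUPUSED_OBJECT };
const uint64_t new_order[2] = { DMU_GROUPUSED_OBJECT, DMU_USERUSED_OBJECT };

/*
 * Hypothetical helper: walk the objects in the given order while resuming
 * from resume_obj.  Returns 0 if an object greater than resume_obj is seen
 * before the resume point is reached (where the real assertion would fire),
 * and 1 otherwise.
 */
int
order_is_safe(const uint64_t *order, int n, uint64_t resume_obj)
{
	for (int i = 0; i < n; i++) {
		if (order[i] > resume_obj)
			return (0);	/* visited object > resume object */
		if (order[i] == resume_obj)
			return (1);	/* found the resume point */
	}
	return (1);
}
```

With the resume point in the GROUPUSED object, order_is_safe() returns 0 for
old_order and 1 for new_order; with the resume point in the USERUSED object
both orders pass, which is why only a pause inside GROUPUSED trips the
assertion.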

Typically there is just one block in the GROUPUSED object, so the traversal
is pretty unlikely to pause on exactly this block.  But I think you'd hit this
assertion every time it does.

The problem can be reproduced by setting zfs_free_max_blocks to 1 and then
destroying a filesystem.




On Thu, Jan 16, 2014 at 4:36 AM, Andriy Gapon <[email protected]> wrote:

>
> How does the following change look to you?
> If it's good, is anyone up to integrating it into illumos?
>
> Thanks!
>
> commit 380a6d6c42ff89f623d44fc78e67370012eb960e
> Author: Andriy Gapon <[email protected]>
> Date:   Mon Oct 14 09:39:15 2013 +0300
>
>     traverse_visitbp: visit DMU_GROUPUSED_OBJECT before DMU_USERUSED_OBJECT
>
>     This is done to ensure that visited object IDs are always increasing.
>     Also, pass correct object ID to prefetch_dnode_metadata for
>     os_groupused_dnode.
>
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
> index 1ff47c8..a0ca63f 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c
> @@ -329,9 +329,9 @@ traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp,
>                 prefetch_dnode_metadata(td, dnp, zb->zb_objset,
>                     DMU_META_DNODE_OBJECT);
>                 if (arc_buf_size(buf) >= sizeof (objset_phys_t)) {
> -                       prefetch_dnode_metadata(td, &osp->os_userused_dnode,
> -                           zb->zb_objset, DMU_USERUSED_OBJECT);
>                         prefetch_dnode_metadata(td, &osp->os_groupused_dnode,
> +                           zb->zb_objset, DMU_GROUPUSED_OBJECT);
> +                       prefetch_dnode_metadata(td, &osp->os_userused_dnode,
>                             zb->zb_objset, DMU_USERUSED_OBJECT);
>                 }
>
> @@ -342,18 +342,18 @@ traverse_visitbp(traverse_data_t *td, const dnode_phys_t *dnp,
>                         err = 0;
>                 }
>                 if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) {
> -                       dnp = &osp->os_userused_dnode;
> +                       dnp = &osp->os_groupused_dnode;
>                         err = traverse_dnode(td, dnp, zb->zb_objset,
> -                           DMU_USERUSED_OBJECT);
> +                           DMU_GROUPUSED_OBJECT);
>                 }
>                 if (err && hard) {
>                         lasterr = err;
>                         err = 0;
>                 }
>                 if (err == 0 && arc_buf_size(buf) >= sizeof (objset_phys_t)) {
> -                       dnp = &osp->os_groupused_dnode;
> +                       dnp = &osp->os_userused_dnode;
>                         err = traverse_dnode(td, dnp, zb->zb_objset,
> -                           DMU_GROUPUSED_OBJECT);
> +                           DMU_USERUSED_OBJECT);
>                 }
>         }
>
>
> --
> Andriy Gapon
>
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>
>