I took a look at the dumps.  There are 2 problems here.  First is that
zpool on zpool on the same machine is not really supported.  In the
pool_create dump, you are hitting a deadlock on the spa_namespace_lock,
between the 2 threads listed below.  In the zfs_send dump, it looks like
there is a deadlock on the zfsdev_state_lock.

This is causing the system to hang, which leads to a timeout in the iscsi
code.  The timeout then hits a different bug in iscsi, which is causing the
panic.  It looks like some thread has forgotten to release the ic_state_mutex.

I imagine you could get iscsi out of the picture by using the LUN directly
as /dev/zvol/dsk/..., but you'd still hit the problem of pool on pool on
the same machine.  I think some development work would be required to make
this use case work correctly.

--matt

stack pointer for thread ffffff0007b07c40: ffffff0007b07380
[ ffffff0007b07380 _resume_from_idle+0xf4() ]
  ffffff0007b073b0 swtch+0x141()
  ffffff0007b07460 turnstile_block+0x262(0, 0, fffffffffbd33058,
fffffffffbc07980, 0, 0)
  ffffff0007b074d0 mutex_vector_enter+0x3c5(fffffffffbd33058)
  ffffff0007b07570 spa_open_common+0x79()
  ffffff0007b075a0 spa_open+0x1e(ffffff01eb4e8000, ffffff0007b075b8,
fffffffff7a58fb0)
  ffffff0007b075f0 pool_status_check+0x48(ffffff01eb4e8000, 2, 6)
  ffffff0007b076a0 zfsdev_ioctl+0x3f6(ac00000000, 5a16, ffffff01ebf7f000,
80200000, ffffff01cd041db0, ffffff0007b0777c)
  ffffff0007b076e0 cdev_ioctl+0x39(ac00000000, 5a16, ffffff01ebf7f000,
80200000, ffffff01cd041db0, ffffff0007b0777c)
  ffffff0007b07740 ldi_ioctl+0x88(ffffff01cfc8d840, 5a16, ffffff01ebf7f000,
80000000, ffffff01cd041db0, ffffff0007b0777c)
  ffffff0007b077b0 sbd_zvolset+0x136(ffffff01ec2ce878, ffffff01dcd6a000)
  ffffff0007b077f0 sbd_update_zfs_prop+0xc0(ffffff01ebc06418)
  ffffff0007b07850 sbd_write_zfs_meta+0xd9()
  ffffff0007b07900 sbd_write_meta+0x1d2(ffffff01ebc06418, 30, 19d,
ffffff01deae5100)
  ffffff0007b079b0 sbd_write_meta_section+0x30e(ffffff01ebc06418,
ffffff01deae5100)
  ffffff0007b07a20 sbd_write_lu_info+0x201(ffffff01ebc06418)
  ffffff0007b07a70 sbd_handle_mode_select_xfer+0x180(ffffff01ebfe0000,
ffffff01edae2e80, 18)
  ffffff0007b07aa0
sbd_handle_short_write_xfer_completion+0x10f(ffffff01ebfe0000,
ffffff01e2b96428)
  ffffff0007b07b00 sbd_handle_short_write_transfers+0x7f(ffffff01ebfe0000,
ffffff01e2b96428, 18)
  ffffff0007b07b20 sbd_handle_mode_select+0x3d(ffffff01ebfe0000,
ffffff01e2b96428)
  ffffff0007b07b80 sbd_new_task+0x889(ffffff01ebfe0000, ffffff01e2b96428)
  ffffff0007b07c20 stmf_worker_task+0x33a(ffffff01cf4530f0)
  ffffff0007b07c30 thread_start+8()


stack pointer for thread ffffff01df5b7480: ffffff0009c14830
[ ffffff0009c14830 _resume_from_idle+0xf4() ]
  ffffff0009c14860 swtch+0x141()
  ffffff0009c148a0 cv_wait+0x70(ffffff01ec2c62ba, ffffff01ec2c62a8)
  ffffff0009c148e0 taskq_wait+0x43(ffffff01ec2c6288)
  ffffff0009c14930 taskq_destroy+0x6c(ffffff01ec2c6288)
  ffffff0009c14980 vdev_open_children+0x119(ffffff01dfb2ca80)
  ffffff0009c149e0 vdev_root_open+0x7d(ffffff01dfb2ca80, ffffff0009c14a08,
ffffff0009c14a00, ffffff0009c149f8)
  ffffff0009c14a50 vdev_open+0xed(ffffff01dfb2ca80)
  ffffff0009c14ab0 vdev_create+0x2e(ffffff01dfb2ca80, 4, 0)
  ffffff0009c14b70 spa_create+0x238()
  ffffff0009c14bd0 zfs_ioc_pool_create+0x181(ffffff01eb4ec000)
  ffffff0009c14c80 zfsdev_ioctl+0x4a7(ac00000000, 5a00, 80426d0, 100003,
ffffff01d726ee78, ffffff0009c14e68)
  ffffff0009c14cc0 cdev_ioctl+0x39(ac00000000, 5a00, 80426d0, 100003,
ffffff01d726ee78, ffffff0009c14e68)
  ffffff0009c14d10 spec_ioctl+0x60(ffffff01cfc00480, 5a00, 80426d0, 100003,
ffffff01d726ee78, ffffff0009c14e68, 0)
  ffffff0009c14da0 fop_ioctl+0x55(ffffff01cfc00480, 5a00, 80426d0, 100003,
ffffff01d726ee78, ffffff0009c14e68, 0)
  ffffff0009c14ec0 ioctl+0x9b(3, 5a00, 80426d0)
  ffffff0009c14f10 _sys_sysenter_post_swapgs+0x149()



from the zfs_send dump:

stack pointer for thread ffffff014d37fae0: ffffff0004766810
[ ffffff0004766810 _resume_from_idle+0xf1() ]
  ffffff0004766840 swtch+0x141()
  ffffff0004766880 cv_wait+0x70(ffffff0153865b92, ffffff0153865b58)
  ffffff00047668d0 txg_wait_synced+0x83(ffffff01538659c0, e11)
  ffffff00047669e0 dsl_sync_task+0x187()
  ffffff0004766b50 dsl_dataset_user_release_tmp+0xa5()
  ffffff0004766b90 dsl_dataset_user_release_onexit+0xa2(ffffff014d3bca40)
  ffffff0004766bd0 zfs_onexit_destroy+0x43(ffffff0156b884c8)
  ffffff0004766c00 zfs_ctldev_destroy+0x18(ffffff0156b884c8, 4)
  ffffff0004766c60 zfsdev_close+0x89(ac00000004, 403, 2, ffffff01484d1178)
  ffffff0004766c90 dev_close+0x31(ac00000004, 403, 2, ffffff01484d1178)
  ffffff0004766ce0 device_close+0xd8(ffffff014b936c40, 403,
ffffff01484d1178)
  ffffff0004766d70 spec_close+0x17b(ffffff014b936c40, 403, 1, 0,
ffffff01484d1178, 0)
  ffffff0004766df0 fop_close+0x61(ffffff014b936c40, 403, 1, 0,
ffffff01484d1178, 0)
  ffffff0004766e30 closef+0x5e(ffffff014afc8750)
  ffffff0004766ea0 closeandsetf+0x398(8, 0)
  ffffff0004766ec0 close+0x13(8)
  ffffff0004766f10 _sys_sysenter_post_swapgs+0x149()

stack pointer for thread ffffff0006367c40: ffffff0006367680
[ ffffff0006367680 _resume_from_idle+0xf1() ]
  ffffff00063676b0 swtch+0x141()
  ffffff0006367760 turnstile_block+0x262(0, 0, fffffffffbd32b00,
fffffffffbc07980, 0, 0)
  ffffff00063677d0 mutex_vector_enter+0x3c5(fffffffffbd32b00)
  ffffff00063678d0 zvol_ioctl+0x4f()
  ffffff0006367980 zfsdev_ioctl+0x2ec(ac00000003, 422, 0, 80000000,
ffffff01484d1db0, ffffff0006367acc)
  ffffff00063679c0 cdev_ioctl+0x39(ac00000003, 422, 0, 80000000,
ffffff01484d1db0, ffffff0006367acc)
  ffffff0006367a10 spec_ioctl+0x60(ffffff014b936e40, 422, 0, 80000000,
ffffff01484d1db0, ffffff0006367acc, 0)
  ffffff0006367aa0 fop_ioctl+0x55(ffffff014b936e40, 422, 0, 80000000,
ffffff01484d1db0, ffffff0006367acc, 0)
  ffffff0006367af0 sbd_flush_data_cache+0xa1(ffffff0153127298, 0)
  ffffff0006367b20 sbd_handle_sync_cache+0xf1(ffffff014b056000, 0)
  ffffff0006367b80 sbd_new_task+0x8f5(ffffff014b056000, 0)
  ffffff0006367c20 stmf_worker_task+0x33a(ffffff01497580a0)
  ffffff0006367c30 thread_start+8()



On Thu, Feb 13, 2014 at 2:31 AM, Franz Schober <[email protected]> wrote:

>
>  Can you make the crash dumps available?  I only see text output attached
>> to the bug.
>>
> You find it under:
>
> http://images.firmos.at/download/crash_zfs_send/vmdump.0
> http://images.firmos.at/download/crash_zpool_create/vmdump.0
>
> We simulated the bug using VMware Fusion on a Mac,
> but it is the same as on the datacenter hardware.
>
>
>> Any particular reason that you're creating a pool on top of an iscsi
>> device which is backed by a zvol on a pool on the same machine?  Have you
>> tried it using 2 different machines?
>>
> Yes, it has an economic reason: we have a setup with two rather
> expensive servers.
> (Dual E5-2680, 256GB RAM, 6 Disk Chassis ZEUS RAM Drive for ZIL)
>
> The SAS disks are arranged in a zpool with RAIDZ2 vdevs. This pool exports
> a large ZVOL as a LUN via FC.
> A mirror of two LUNs forms a synchronous mirrored pool; one LUN is local on
> the same server and one is remote, at the other location.
>
> The pool is imported on exclusively one machine and can be transparently
> failed over for a VMware host that imports via NFS doing forced sync writes.
>
> Everything works very well, performance is very good (not comparable to
> async replication alone, but with high availability)
> Only when we "zfs send" a snapshot to a backup system does the problem occur.
>
> In the mirrored pool in the datacenter, the "zfs send" also hangs until
> timeouts occur on the locally provided LUN
> and the mirrored pool takes the local LUN offline.
>
> A design with 2 machines on each location(one with the diskpool, and one
> with the mirrored pool) is tested and works -
> but if the mentioned problem would not occur we could save power, costs
> and rackspace.
>
> Is there a reason why this setup would be a bad idea, or are we just
> hitting something no one has tried before?
>
> Thx,
> Franz
>
>
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to