Re: [zfs-discuss] [storage-discuss] iscsi target problems on snv_97
Moore, Joe wrote:
>> I believe the problem you're seeing might be related to a deadlock
>> condition (CR 6745310). If you run pstack on the iSCSI target daemon,
>> you might find a bunch of zombie threads. The fix was putback in
>> snv_99; give snv_99 a try.
>
> Yes, a pstack of the core I've generated from iscsitgtd does have a
> number of zombie threads. I'm afraid I can't make heads or tails of the
> bug report at http://bugs.opensolaris.org/view_bug.do?bug_id=6658836,
> nor of its duplicate 6745310, nor of any of the related bugs (all are
> unavailable except 6676298, and the stack trace reported in that bug
> doesn't look anything like mine).
>
> As far as I can tell, snv_98 is the latest build, from Sep 10 according
> to http://dlc.sun.com/osol/on/downloads/. So snv_99 should be out next
> week, correct?

snv_99 should be out next week.

> Anything I can do in the meantime? Do I need to BFU to the latest
> nightly build, or would just taking the iscsitgtd from that build
> suffice?
>
> --Joe

You could try snv_98. You don't need to BFU; just get the latest
iscsitgtd.

-Tim
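For anyone else hitting this before snv_99 is out, the zombie-thread
check Tim describes can be run along these lines (a rough sketch; it
assumes the daemon is named iscsitgtd and runs under the iscsitgt SMF
service - adjust if your setup differs):

   # dump the daemon's thread stacks and count zombie threads
   pstack `pgrep -x iscsitgtd` | grep -c zombie

   # if the count keeps growing, bounce just the target daemon
   # instead of rebooting the whole box
   svcadm restart svc:/system/iscsitgt:default

Restarting through SMF should amount to the same thing as the pkill
workaround mentioned elsewhere in this thread, but lets SMF bring the
daemon back up cleanly.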
Re: [zfs-discuss] [storage-discuss] iscsi target problems on snv_97
Moore, Joe wrote:
> I've recently upgraded my x4500 to Nevada build 97, and am having
> problems with the iSCSI target.
>
> Background: this box serves NFS (zfs filesystem-type datasets)
> underlying a VMware ESX environment, and presents iSCSI targets (zfs
> zvol datasets) for a Windows host and as zoneroots for Solaris 10
> hosts. For optimal random-read performance, I've configured a single
> zfs pool of mirrored vdevs across all 44 data disks (plus 2 boot
> disks and 2 spares = 48).
>
> Before the upgrade, the box was flaky under load: all I/O to the ZFS
> pool would occasionally stop. Since the upgrade, that hasn't
> happened, and the NFS clients are quite happy. The iSCSI initiators
> are not.
>
> The Windows initiator is running the Microsoft iSCSI initiator v2.0.6
> on Windows 2003 SP2 x64 Enterprise Edition. When the system reboots,
> it is not able to connect to its iSCSI targets: no devices are found
> until I restart the iscsitgt process on the x4500, at which point the
> initiator reconnects and finds everything. I notice that the x4500
> maintains an active TCP connection to the Windows box (according to
> netstat -an | grep 3260) through the reboot and for a long time
> afterwards. The initiator starts a second connection, but it seems
> the target doesn't let go of the old one. Or something. At this
> point, every time I reboot the Windows system I have to
> `pkill iscsitgtd`.
>
> The Solaris system is running S10 Update 4. Every once in a while
> (twice today, and not correlated with the pkills above) the system
> reports that all of the iSCSI disks are unavailable. Nothing I've
> tried short of a reboot of the whole host brings them back. All of
> the zones on the system remount their zoneroots read-only (and give
> I/O errors when read or zlogin'd to). There is a set of TCP
> connections from the zonehost to the x4500 that remains even after
> disabling the iscsi_initiator service, and no process is holding
> them as far as pfiles can tell.
>
> Does this sound familiar to anyone? Any suggestions on what I can do
> to troubleshoot further? I have a kernel dump from the zonehost and a
> snoop capture of the wire for the Windows host (but it's big).

I believe the problem you're seeing might be related to a deadlock
condition (CR 6745310). If you run pstack on the iSCSI target daemon,
you might find a bunch of zombie threads. The fix was putback in
snv_99; give snv_99 a try.

-Tim

I'll be opening a bug too.

Thanks,
--Joe
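For reference, the layout described above would have been set up
roughly like this (a sketch only - the pool name, device names, and the
zvol size are illustrative, not taken from the actual box):

   # 22 two-way mirrors across the 44 data disks
   # (only the first two pairs shown here)
   zpool create tank \
       mirror c0t0d0 c1t0d0 \
       mirror c0t1d0 c1t1d0

   # filesystem dataset served over NFS to the VMware ESX hosts
   zfs create tank/vmware
   zfs set sharenfs=on tank/vmware

   # zvol exported as an iSCSI target for the Windows host
   zfs create -V 100g tank/winlun
   zfs set shareiscsi=on tank/winlun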
Re: iSCSI target not coming back up (was Fwd: [zfs-discuss] Re: snv63: kernel panic on import)
Nigel,

Was the iSCSI target daemon still running with the targets gone, or did
the daemon core repeatedly? How did you create the targets?

-tim

eric kustarz wrote:
> Hi Tim,
>
> Is the iSCSI target not coming back up after a reboot a known problem?
> Can you take a look?
>
> eric
>
> Begin forwarded message:
>
> From: eric kustarz [EMAIL PROTECTED]
> Date: May 16, 2007 8:56:44 AM PDT
> To: Nigel Smith [EMAIL PROTECTED]
> Cc: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] Re: snv63: kernel panic on import
>
> On May 15, 2007, at 4:49 PM, Nigel Smith wrote:
>> I seem to have got the same core dump, in a different way. I had a
>> zpool set up on an iSCSI 'disk'. For details see:
>> http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
>> But after a reboot the iSCSI target was no longer available, so the
>> iSCSI initiator could not provide the disk that the zpool was based
>> on. I did a 'zpool status', but the PC just rebooted rather than
>> handling it gracefully. After the reboot I discovered a crash dump
>> had been created - details below.
>
> ZFS panicking on a failed write in a non-redundant pool is known and
> is being worked on. Why the iSCSI device didn't come up is also a
> bug. I'll ask the iSCSI people to take a look...
>
> eric
>
>> # cat /etc/release
>>                     Solaris Nevada snv_60 X86
>>          Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
>>                   Use is subject to license terms.
>>                      Assembled 12 March 2007
>> #
>> # cd /var/crash/solaris
>> # mdb -k 1
>> Loading modules: [ unix genunix specfs dtrace uppc pcplusmp scsi_vhci
>> ufs ip hook neti sctp arp usba uhci qlc fctl nca lofs zfs random md
>> cpc crypto fcip fcp logindmux ptm sppp emlxs ipc ]
>> ::status
>> debugging crash dump vmcore.1 (64-bit) from solaris
>> operating system: 5.11 snv_60 (i86pc)
>> panic message:
>> ZFS: I/O failure (write on unknown off 0: zio fffec38cf340
>> [L0 packed nvlist] 4000L/600P DVA[0]=0:160225800:600
>> DVA[1]=0:9800:600 fletcher4 lzjb LE contiguous birth=192896 fill=1
>> cksum=6b28
>> dump content: kernel pages only
>> *panic_thread::findstack -v
>> stack pointer for thread ff00025b2c80: ff00025b28f0
>>   ff00025b29e0 panic+0x9c()
>>   ff00025b2a40 zio_done+0x17c(fffec38cf340)
>>   ff00025b2a60 zio_next_stage+0xb3(fffec38cf340)
>>   ff00025b2ab0 zio_wait_for_children+0x5d(fffec38cf340, 11, fffec38cf598)
>>   ff00025b2ad0 zio_wait_children_done+0x20(fffec38cf340)
>>   ff00025b2af0 zio_next_stage+0xb3(fffec38cf340)
>>   ff00025b2b40 zio_vdev_io_assess+0x129(fffec38cf340)
>>   ff00025b2b60 zio_next_stage+0xb3(fffec38cf340)
>>   ff00025b2bb0 vdev_mirror_io_done+0x2af(fffec38cf340)
>>   ff00025b2bd0 zio_vdev_io_done+0x26(fffec38cf340)
>>   ff00025b2c60 taskq_thread+0x1a7(fffec154f018)
>>   ff00025b2c70 thread_start+8()
>> ::cpuinfo -v
>>  ID ADDR         FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD       PROC
>>   0 fbc31f80      1b    2    0  99   no    no t-0    ff00025b2c80 sched
>>                       |    |
>>            RUNNING <--+    +--> PRI THREAD       PROC
>>              READY               60 ff00022c9c80 sched
>>             EXISTS               60 ff00020e9c80 sched
>>             ENABLE
>>
>>  ID ADDR         FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD       PROC
>>   1 fffec11ad000  1f    3    0  59  yes    no t-0    fffec3dcbbc0 syslogd
>>                       |    |
>>            RUNNING <--+    +--> PRI THREAD       PROC
>>              READY               60 ff000212bc80 sched
>>           QUIESCED               59 fffec1e51360 syslogd
>>             EXISTS               59 fffec1ec2180 syslogd
>>             ENABLE
>> ::quit
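To answer questions like Tim's above, a quick way to establish whether
the daemon is up and whether the targets survived the reboot is
something along these lines (a rough sketch, assuming the stock Solaris
initiator and the iscsitgt SMF service; adjust the FMRI if your setup
differs):

   # on the target box: is the daemon running, or stuck in maintenance?
   svcs -l svc:/system/iscsitgt:default

   # do the targets still exist as far as the daemon is concerned?
   iscsitadm list target -v

   # on the initiator box: can the target still be discovered?
   iscsiadm list target -v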