Moore, Joe wrote:
> I've recently upgraded my x4500 to Nevada build 97, and am having problems 
> with the iscsi target.
>
> Background: this box is used to serve NFS underlying a VMware ESX environment 
> (zfs filesystem-type datasets) and presents iSCSI targets (zfs zvol datasets) 
> for a Windows host and to act as zoneroots for Solaris 10 hosts.  For optimal 
> random-read performance, I've configured a single zfs pool of mirrored VDEVs 
> of all 44 disks (+2 boot disks, +2 spares = 48)
>
> Before the upgrade, the box was flaky under load: all I/Os to the ZFS pool 
> would stop occasionally.
>
> Since the upgrade, that hasn't happened, and the NFS clients are quite happy. 
> The iSCSI initiators are not.
>
> The Windows host is running the Microsoft iSCSI initiator v2.0.6 on 
> Windows 2003 SP2 x64 Enterprise Edition.  When the system reboots, it is not 
> able to connect to its iSCSI targets.  No devices are found until I restart 
> the iscsitgtd process on the x4500, at which point the initiator will 
> reconnect and find everything.  I notice that the x4500 maintains an 
> active TCP connection (according to netstat -an | grep 3260) to the Windows 
> box through the reboot and for a long time afterwards.  The initiator starts 
> a second connection, but it seems that the target doesn't let go of the old 
> one.  Or something.  At this point, every time I reboot the Windows system I 
> have to `pkill iscsitgtd`.
>   
> The Solaris system is running S10 Update 4.  Every once in a while (twice 
> today, and not correlated with the pkills above) the system reports that all 
> of the iSCSI disks are unavailable.  Nothing I've tried short of a reboot of 
> the whole host brings them back.  All of the zones on the system remount 
> their zoneroots read-only (and give I/O errors when read or zlogin'd to).
>
> There is a set of TCP connections from the zonehost to the x4500 that remain 
> even after disabling the iscsi_initiator service.  There's no process 
> holding them as far as pfiles can tell.
>
> Does this sound familiar to anyone?  Any suggestions on what I can do to 
> troubleshoot further?  I have a kernel dump from the zonehost and a snoop 
> capture of the wire for the Windows host (but it's big).
>   
I believe the problem you're seeing might be related to a deadlock 
condition (CR 6745310).  If you run pstack on the iSCSI target daemon, 
you might find a bunch of zombie threads.  The fix was put back in 
snv-99, so give snv-99 a try.

-Tim
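
The stale-connection check mentioned in the quoted message (netstat -an | 
grep 3260) can be sketched as a small shell helper that groups connections 
on the iSCSI port by remote host and state.  This is only a rough sketch: 
the function name iscsi_conns is made up for illustration, and it assumes 
Solaris-style netstat output where addresses are written as address.port 
and the state is the last field on the line.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Solaris tool): summarize TCP
# connections whose local port is the iSCSI port, reading `netstat -an`
# output on stdin.
# Assumption: Solaris-style lines of the form
#   local_addr.port  remote_addr.port  ...  STATE
iscsi_conns() {
    port="${1:-3260}"
    awk -v port="$port" '
        # Keep lines whose local address ends in ".<port>", skipping the
        # listening socket itself.
        $1 ~ ("\\." port "$") && $NF != "LISTEN" {
            remote = $2
            sub(/\.[0-9]+$/, "", remote)   # drop the remote port number
            count[remote " " $NF]++
        }
        END {
            # One line per (remote host, state) pair: count host state
            for (k in count) print count[k], k
        }
    ' | sort -k2
}
```

On the x4500 this would be run as `netstat -an | iscsi_conns`.  More 
connections from one initiator than it has sessions configured, especially 
right after that initiator has rebooted, would be consistent with the target 
holding on to an old connection.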

> I'll be opening a bug too.
>
> Thanks,
> --Joe
> _______________________________________________
> storage-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/storage-discuss
>   

