I've witnessed several occasions on which ZFS zpools effectively "hang"
(commands don't return, etc.).  The cause is apparent when you look at
iostat output: one disk is "hung up", with its "busy" time at 100% but no
I/O passing, while all other disks are idle.
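
To make the symptom concrete, here is a rough Python sketch of how one might
spot such a disk in captured iostat output.  It assumes the usual
"iostat -xn" column layout (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t,
%w, %b, device); the function name and the sample device names are made up
for illustration, not taken from a real system.

```python
def find_hung_disks(iostat_text, busy_threshold=100.0):
    """Return devices that report >= busy_threshold %b but move no data.

    Assumes 11 whitespace-separated columns per device line, in the
    order: r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device.
    """
    hung = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # banner or blank line
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # column header line
        reads, writes, kb_read, kb_written = values[:4]
        busy = values[9]
        no_io = reads == 0 and writes == 0 and kb_read == 0 and kb_written == 0
        if busy >= busy_threshold and no_io:
            hung.append(fields[10])
    return hung


# Hypothetical sample: c1t3d0 is pegged at 100% busy with nothing passing.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0
    0.0    0.0    0.0    0.0  0.0  1.0    0.0    0.0   0 100 c1t3d0
"""

if __name__ == "__main__":
    print(find_hung_disks(SAMPLE))
```

A single interval can be misleading, so in practice you would want the same
device to show up across several consecutive samples before calling it hung.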

The only fix I've found is to use cfgadm to send a SATA disconnect to
the drive, after which things get back to business.
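
For reference, the workaround amounts to something like the sketch below.
The attachment point ID ("sata0/3") is purely hypothetical; you would take the
real one from the cfgadm listing on your own box.  The helper names are mine,
and the default is a dry run that only returns the command, since cfgadm may
prompt for confirmation and the disconnect itself is disruptive.

```python
import subprocess


def build_disconnect_cmd(ap_id):
    """Build the cfgadm command that disconnects one SATA attachment point."""
    if not ap_id or ap_id.startswith("-"):
        raise ValueError("ap_id must be an attachment point such as 'sata0/3'")
    return ["cfgadm", "-c", "disconnect", ap_id]


def disconnect_port(ap_id, dry_run=True):
    """Return the command; only execute it when dry_run is False."""
    cmd = build_disconnect_cmd(ap_id)
    if not dry_run:
        # Needs root on the affected host; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "sata0/3" is a made-up attachment point ID for illustration only.
    print(" ".join(disconnect_port("sata0/3")))
```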

The question is, why doesn't Solaris handle the situation? 

I'm currently running snv_89 (upgraded from snv_43) on a test system.  I
suspect some lock contention around this, but lockstat itself causes the
system to hang (fixed by 6697756 in snv_90), so I can't yet say exactly
which lock is involved.


This is troubling because it means one disk can go wonky and render your
storage system useless until someone can respond, and I'd imagine most
admins would "solve" the problem with a reboot, which is a very poor solution.

benr.
_______________________________________________
storage-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
