On 17.04.2010 00:34, Frank Middleton wrote:
> On 04/15/10 12:19 AM, James C.McPherson wrote:
>
>> From what I've observed, the consistent feature of this
>> problem (where no SAS expanders are involved) is the
>> use of WD disk drives.
>
> If I run Open Solaris (and ZFS) on a COMSTAR initiator, run a
> scrub, and also run a scrub on the ZFS target, the iops get
> cranked up to a sustained 130 or so and eventually something
> very similar to CR6894775 may occur. Setting /etc/system
> zfs:zfs_vdev_max_pending = 4 definitely reduces the frequency
> of it happening but eventually it does if you do it often enough.
>
> This on a simple whole-disk mirror of Seagate 7200 RPM drives
> on SPARC running SXCE snv125 with 4GB. Both disks are on the
> same controller with no expanders. So if this is the same problem
> (and I'm pretty certain it is) then it happens on (these) Seagates,
> too.
>
> It seems to require a power cycle to reset; if you can get it to
> do a warm reboot (e.g., by forcing a panic) the disks remain offline.
> What is the difference between a warm reboot and a power cycle
> in this context? Just wondering if there is some way that the mpt
> driver could detect that every disk on a given controller has suddenly
> gone offline more or less at once, then it could somehow reset the
> controller. Sorry if this is a naive question, but when this happens,
> however rarely, it is rather annoying albeit relatively harmless, and a
> brute force fix would be better than having to power cycle, which
> is always scary...
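
For reference, the tunable Frank mentions is normally set with a line like the one below in /etc/system. This is only a sketch of the usual syntax, not a copy of his configuration. The value 4 is the one quoted above, and a change to /etc/system takes effect only after a reboot.

```
* /etc/system fragment (lines starting with "*" are comments)
* Limit the number of I/Os ZFS queues to each vdev at once.
* The value 4 is taken from the message quoted above.
set zfs:zfs_vdev_max_pending = 4
```
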
I was seeing similar behavior with just a tape autoloader behind the
MPT: when a queue builds up, the controller ends up in a bus-reset loop
as seen from the devices. Could this be timing-related? Could it be as
simple as the MPT driver needing a larger timeout value before it starts
bus-resetting?

//Svein

-- 
--------+-------------------+-------------------------------
  /"\   |Svein Skogen       | sv...@d80.iso100.no
  \ /   |Solberg Østli 9    | PGP Key:  0xE5E76831
   X    |2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway             | PGP Key:  0xCE96CE13
        |                   | sv...@stillbilde.net
 ascii  |                   | PGP Key:  0x58CD33B6
 ribbon |System Admin       | svein-listm...@stillbilde.net
Campaign|stillbilde.net     | PGP Key:  0x22D494A4
        +-------------------+-------------------------------
        |msn messenger:     | Mobile Phone: +47 907 03 575
        |sv...@jernhuset.no | RIPE handle:    SS16503-RIPE
--------+-------------------+-------------------------------
If you really are in a hurry, mail me at
           svein-mob...@stillbilde.net
This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.
------------------------------------------------------------
Picture Gallery:
  https://gallery.stillbilde.net/v/svein/
------------------------------------------------------------
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss