On 17.04.2010 00:34, Frank Middleton wrote:
> On 04/15/10 12:19 AM, James C. McPherson wrote:
> 
>> From what I've observed, the consistent feature of this
>> problem (where no SAS expanders are involved) is the
>> use of WD disk drives.
> 
> If I run OpenSolaris (and ZFS) on a COMSTAR initiator, run a
> scrub, and also run a scrub on the ZFS target, the IOPS get
> cranked up to a sustained 130 or so and eventually something
> very similar to CR6894775 may occur. Putting
> "set zfs:zfs_vdev_max_pending = 4" in /etc/system definitely
> reduces the frequency of it happening, but eventually it still
> happens if you do it often enough.
> 
> This is on a simple whole-disk mirror of Seagate 7200 RPM drives
> on SPARC running SXCE snv_125 with 4GB. Both disks are on the
> same controller with no expanders. So if this is the same problem
> (and I'm pretty certain it is), then it happens on (these) Seagates,
> too.
> 
> It seems to require a power cycle to reset; if you can get it to
> do a warm reboot (e.g., by forcing a panic), the disks remain offline.
> What is the difference between a warm reboot and a power cycle
> in this context? I'm just wondering whether there is some way that
> the mpt driver could detect that every disk on a given controller has
> suddenly gone offline more or less at once, and then somehow reset
> the controller. Sorry if this is a naive question, but when this
> happens, however rarely, it is rather annoying, albeit relatively
> harmless, and a brute-force fix would be better than having to power
> cycle, which is always scary...

I was seeing similar behavior with just a tape autoloader behind
the MPT: when a queue builds up, the controller ends up in a bus-reset
loop as seen from the devices. Could this be timing-related? Could it
be as simple as the MPT driver needing a larger timeout value before
it starts bus-resetting?
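
To make the timing hypothesis concrete, here is a rough back-of-the-envelope
sketch in Python. It is not how the mpt driver actually works; it simply
assumes that queued commands complete one at a time, so the last command in
a queue waits roughly (queue depth x per-command service time). The function
names, the 2-second service time, and the 60-second timeout are all made-up
illustration values, and the depth of 35 is just an arbitrary "deep queue"
figure. Only the depth of 4 comes from the zfs_vdev_max_pending setting
mentioned in this thread.

```python
def worst_case_wait(queue_depth, service_time_s):
    """Rough wait for the last queued command, assuming strictly serial completion."""
    return queue_depth * service_time_s


def would_time_out(queue_depth, service_time_s, timeout_s):
    """True if the last queued command would exceed the (hypothetical) timeout."""
    return worst_case_wait(queue_depth, service_time_s) > timeout_s


if __name__ == "__main__":
    # Hypothetical numbers: a slow device needing 2 s per command and a
    # 60 s per-command timeout. Neither value is taken from the mpt driver.
    service_time_s = 2.0
    timeout_s = 60.0
    for depth in (35, 4):
        wait = worst_case_wait(depth, service_time_s)
        print(depth, wait, would_time_out(depth, service_time_s, timeout_s))
```

If something like this model holds, a shallower queue would only make
timeouts rarer rather than impossible, since a single slow command can still
push the wait past the limit. That would at least be consistent with the
observation that a lower zfs_vdev_max_pending reduces the frequency of the
problem without eliminating it.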

//Svein

-- 
--------+-------------------+-------------------------------
  /"\   |Svein Skogen       | sv...@d80.iso100.no
  \ /   |Solberg Østli 9    | PGP Key:  0xE5E76831
   X    |2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway             | PGP Key:  0xCE96CE13
        |                   | sv...@stillbilde.net
 ascii  |                   | PGP Key:  0x58CD33B6
 ribbon |System Admin       | svein-listm...@stillbilde.net
Campaign|stillbilde.net     | PGP Key:  0x22D494A4
        +-------------------+-------------------------------
        |msn messenger:     | Mobile Phone: +47 907 03 575
        |sv...@jernhuset.no | RIPE handle:    SS16503-RIPE
--------+-------------------+-------------------------------
         If you really are in a hurry, mail me at
               svein-mob...@stillbilde.net
 This mailbox goes directly to my cellphone and is checked
        even when I'm not in front of my computer.
------------------------------------------------------------
                     Picture Gallery:
          https://gallery.stillbilde.net/v/svein/
------------------------------------------------------------


_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
