On 17/04/10 03:33 PM, Svein Skogen wrote:
On 17.04.2010 00:34, Frank Middleton wrote:
On 04/15/10 12:19 AM, James C. McPherson wrote:
From what I've observed, the consistent feature of this
problem (where no SAS expanders are involved) is the
use of WD disk drives.
If I run OpenSolaris (and ZFS) on a COMSTAR initiator, run a scrub
there, and also run a scrub on the ZFS target, the IOPS get cranked
up to a sustained 130 or so, and eventually something very similar to
CR6894775 may occur. Setting zfs:zfs_vdev_max_pending = 4 in
/etc/system definitely reduces how often it happens, but it still does
eventually if you do it often enough. This is on a simple whole-disk
mirror of Seagate 7200 RPM drives on SPARC, running SXCE snv_125 with
4GB of RAM. Both disks are on the same controller, with no expanders.
So if this is the same problem (and I'm pretty certain it is), then it
happens on (these) Seagates, too.
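For reference, that setting goes into /etc/system as a line like the
following (it takes effect at the next boot; the value 4 is just what
I happened to use here):

    set zfs:zfs_vdev_max_pending = 4

If a reboot is inconvenient, it can also be poked into the running
kernel with mdb, e.g. "echo zfs_vdev_max_pending/W0t4 | mdb -kw".
Either way it only makes the hang less frequent here, it doesn't fix
the underlying problem.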
It seems to require a power cycle to reset; if you can get it to do a
warm reboot (e.g., by forcing a panic), the disks remain offline. What
is the difference between a warm reboot and a power cycle in this
context? I'm just wondering whether there is some way the mpt driver
could detect that every disk on a given controller has suddenly gone
offline more or less at once; if so, it could somehow reset the
controller itself. Sorry if this is a naive question, but when this
happens, however rarely, it is rather annoying, albeit relatively
harmless, and a brute-force fix would be better than having to power
cycle, which is always scary...
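Purely to make that idea concrete, here is a rough sketch of the
heuristic I mean. None of these names are real mpt(7d) interfaces;
disk_state_changed() and reset_controller() are hypothetical
stand-ins, and a real implementation would live in the driver's event
path and have to cope with timing windows and false positives:

/*
 * Hypothetical sketch: if every disk attached to one controller is
 * reported offline within a short window, assume the controller
 * itself is wedged and reset it, rather than treating it as several
 * simultaneous disk failures.
 */
#include <stdbool.h>
#include <stdio.h>

#define MAX_DISKS 16

struct controller {
    const char *name;
    int         ndisks;
    bool        offline[MAX_DISKS];   /* per-disk offline flag */
};

/* Hypothetical stand-in: ask the HBA to reset itself. */
static void
reset_controller(struct controller *c)
{
    (void) printf("all %d disks on %s offline; resetting controller\n",
        c->ndisks, c->name);
}

/* Called whenever a disk's online/offline state changes. */
static void
disk_state_changed(struct controller *c, int disk, bool offline)
{
    c->offline[disk] = offline;

    /* If even one disk is still online, this is not controller-wide. */
    for (int i = 0; i < c->ndisks; i++) {
        if (!c->offline[i])
            return;
    }
    reset_controller(c);
}

int
main(void)
{
    struct controller c = { .name = "mpt0", .ndisks = 2 };

    disk_state_changed(&c, 0, true);   /* one disk drops: no action */
    disk_state_changed(&c, 1, true);   /* both gone at once: reset  */
    return (0);
}

The point is only that "every disk on one controller vanished at the
same moment" looks more like a wedged controller than like multiple
independent disk failures.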
Since I was seeing similar behavior with just a tape autoloader behind
the MPT (when a queue builds up, the controller ends up in a bus-reset
loop as seen from the devices), could this be timing-related? Could it
be as simple as the MPT driver needing larger timeout values before it
starts bus-resetting?
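The sort of knob I have in mind is something like the sd target
driver's command timeout, which (as far as I know) can be raised via
/etc/system, e.g.:

    set sd:sd_io_time = 0x78

i.e. 120 seconds instead of the default 60. That governs the target
driver's command timeout rather than whatever mpt itself uses to
decide on a bus reset, though, so it may well be the wrong layer
entirely.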
We have had too many conflicting reports of this problem, or
of problems which appear to be closely related, for anybody
who works on the mpt(7d) driver to be able to say with any
degree of certainty exactly what the cause is.
We have constraints on how long certain timeouts can be set, which
come from other layers in SCSA, and also from client stacks such as
Cluster.
James C. McPherson
--
Senior Software Engineer, Solaris
Oracle
http://www.jmcp.homeunix.com/blog