>> I would appreciate it if anyone could confirm, or even suggest a fix for,
>> a problem we have with S10 x86 servers running MPxIO to connect to
>> multiple 6140 units. In particular, if there is more than one path
>> from a server to each 6140 controller, then a reboot of the server
>> will cause some of the volumes on the 6140 to fail over to their
>> non-preferred controller. However,
>
> I am seeing the same problem here.  When I configured my zfs pool, I
> made sure that the drives were evenly balanced across controllers
> based on the decisions made by MPxIO at that time.  The end result
> was perfect loading of the controllers and FC paths.  After upgrading
> to Solaris 10U5, I see that MPxIO is choosing different active paths
> by default.  Now one of my 2540's controllers is seeing at least 30%
> more load than the other, and all of the drives accessed by that
> controller are seeing substantially more service time.  I am not
> happy about that.
>
>> While the system heals itself, with all of the 6140 volumes
>> eventually reverting to their preferred controller, this is a
>> particularly annoying problem, since there is another bug in Solaris
>> that causes some of our servers to kernel panic under certain I/O
>> loads when this unnecessary "flapping" happens.
>
> What is the nature of the kernel panic?  I have been encountering a
> kernel panic/shutdown about once per week at the start of the nightly
> zfs scrub I have scheduled via cron.  Sometimes it happens virtually
> instantaneously at the start of 'zpool scrub', while other times it
> takes a few minutes.  Once the system was mysteriously halted with no
> messages in the logs and no apparent response to keyboard, network,
> or display.  I am not happy about that.

Bob and Stuart,

The 6140 and 2500 series are re-badged LSI products that do failover
only (they are asymmetric, active-passive devices).

In my experience, that is how the LSI products behave.  We have seen
non-Sun (IBM-branded) LSI storage arrays do the same thing.  The nice
thing about MPxIO is that it handles the failover very well (for us).
The IBM devices needed RDAC which, IMHO, sucks and didn't handle
redistribution of LUNs well.  MPxIO handles it without any problem,
but I still have to tell the array to redistribute the LUNs back to
their preferred controllers (which I agree is annoying).
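
For what it's worth, you can check which controller MPxIO currently
has active for each LUN from the host with mpathadm, which makes it
easy to spot a volume sitting on its non-preferred path after a
reboot.  The device path below is only an example, so substitute one
of your own:

    # list all multipathed logical units MPxIO knows about
    mpathadm list lu

    # show per-path details for one LUN; the "Access State" field
    # (active vs. standby) tells you which controller owns it right now
    mpathadm show lu /dev/rdsk/c4t600A0B800029D2CA0000516B47B4C1CFd0s2

mpathadm also has a failover subcommand, but on these arrays we still
end up using the array-side redistribute to get everything back onto
the preferred controllers.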

Something tells me this is inherent to active-passive arrays in
general, but I've only ever worked with LSI-based active-passive
storage arrays.
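
Bob, on the load imbalance: the per-device service times from iostat
make it fairly obvious which LUNs are stuck behind the busy
controller, and it's worth double-checking how scsi_vhci is set up.
These are just the stock settings and locations on our S10 boxes, so
verify against yours:

    # watch asvc_t and %b per device; the LUNs owned by the overloaded
    # controller should stand out
    iostat -xn 10

    # MPxIO behaviour is configured in /kernel/drv/scsi_vhci.conf;
    # ours has:
    #   load-balance="round-robin";
    #   auto-failback="enable";
    grep -v '^#' /kernel/drv/scsi_vhci.conf

On these asymmetric arrays, load-balance only spreads I/O across the
paths to the owning controller, and if auto-failback is enabled that
is most likely what eventually moves the volumes back to their
preferred controller on its own, as Stuart described.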

--Brett Monroe