Given past 6140/6540 experience (and the code is shared with the 25xx series), I would suggest you look into MPxIO's acquisition of the primary/secondary paths, specifically whether the preferred path has changed since module loading, in which case the system ping-pongs between paths.
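To make the ping-pong concrete, here is a toy model of it. This is only a sketch, not MPxIO or array firmware code: the function names, the controller labels "A"/"B", and the strict alternation of host and array "turns" are all assumptions for illustration. It assumes the host keeps the primary it classified at boot and fails back to it, while the array keeps moving the LUN to its current preferred controller.

```python
# Toy model only: this is NOT MPxIO code. All names are hypothetical.

def simulate(host_primary, array_preferred, steps=6):
    """Return which controller owns the LUN after each step.

    host_primary:    controller the host classified as primary at boot
                     (cached, never re-evaluated in this model).
    array_preferred: controller the array currently prefers.
    """
    history = []
    for step in range(steps):
        if step % 2 == 0:
            # Host turn: fail back to what it believes is the primary path.
            owner = host_primary
        else:
            # Array turn: move the LUN back to its preferred controller.
            owner = array_preferred
        history.append(owner)
    return history


def count_flaps(history):
    """Count ownership changes between consecutive steps."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)


if __name__ == "__main__":
    # Preferred path changed after boot: host and array disagree.
    stale = simulate(host_primary="A", array_preferred="B")
    # Host rebooted after the array mappings were fixed: both agree.
    fresh = simulate(host_primary="B", array_preferred="B")
    print(stale, count_flaps(stale))
    print(fresh, count_flaps(fresh))
```

In this model the LUN changes owner on every step while the two views disagree, and never once they agree, which is the behaviour a reboot after fixing the array mappings is meant to restore.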

MPxIO has traditionally only classified paths (i.e. as to their status as either a primary or secondary path) once, upon first discovery after boot; hence whichever path is discovered first as the primary is retained by the kernel module. Most A/P arrays propagate the identity bits, so MPxIO identifies the path and builds its maps accordingly and everything is fine: primary paths are accessed until all are exhausted (i.e. multiple primaries are round-robined), upon which event we switch to secondary paths (again utilising all secondaries in a round-robin fashion).

The specific difference in the LSI arrays is that we have the notion of a "preferred path", which is more correctly a mechanism for changing the identity of the primary path. If the preferred path is changed after a host has booted and acquired its path classifications, then Solaris's view of the paths is at odds with the array's, hence the flapping effect as Solaris attempts to regain its "primary" path whilst the array does otherwise.

You need to re-initialise the path classification, which I assume would take a module unload/reload, not particularly easy with most of the FC drivers. So normally a host reboot, once the correct set of "preferred" mappings has been set up in the array, is sufficient; then verify the Solaris view of the paths.

Note that this would normally be a one-time event on first config/deployment of the LSI box/LUNs. We are not talking about failure of paths here, specifically path classification, which is a one-time event.

HTH

Craig

2008/9/2 Joel Miller <[EMAIL PROTECTED]>:
> Hi Bob,
>
> Sorry I have not had a lot of spare time lately...
>
> So the LSI-sourced arrays have bits in the inquiry data that tells the host
> whether or not the path that the inquiry is being handled on is a preferred
> or secondary path...
>
> When a path is brought online (or back online after going away), MPxIO is
> supposed to use that information to know whether or not to attempt to
> failback the LUN or not..
>
> If you can collect the support data via CAM, I can take a look at your
> configuration...but basically it will likely come down to:
> 1) If the preferred owner of a LUN is not what you expected...which I assume
> you checked already...
>
> 2) A timing issue that is either causing the controller to not set the
> correct bit during re-discovery
>
> or
>
> 3) A timing or load related issue that is causing the host to "drop the ball"
> during re-discovery
>
>
> BTW, The Wide-port message you see when you reset a controller is likely the
> surviving controller noticing that it lost one of its back-end SAS channels
> until the other controller comes back...
>
> -Joel
> --
> This message posted from opensolaris.org
> _______________________________________________
> storage-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/storage-discuss
>

--
Craig Morgan
Cinnabar Solutions Ltd
t: +44 (0)791 338 3190
f: +44 (0)870 705 1726
e: [EMAIL PROTECTED]
w: www.cinnabar-solutions.com
