I found this technote, which confirms your statement, Christian: http://www.symantec.com/business/support/index?page=content&id=TECH51507
"- Storage Foundation on Solaris sparc and X64 is supported with MPxIO on Sun Storage hardware only. Storage Foundation does not support MPxIO on non-sun storage arrays. For Non-Sun storage hardware, DMP is required. If MPxIO is enabled on a host, the tunable dmp_fast_recovery must be set to off: vxdmpadm settune dmp_fast_recovery=off." Le 07/10/2010 11:12, Sebastien DAUBIGNE a écrit : > Hi, > > Thank you all for your feedback. > > I am very surprised that MPxIO+DMP is only supported on Sun storages : > as stated in my very first message, the MPxIO solution was imposed by > our SAN team, following HP recommendations. > > When we joined this SAN, I asked to go with DMP for multipathing layer > because we usually adopt this solution for all our > Solaris+VxVM+dedicated storage configuration, regardless of the > storage hardware : for instance with EMC hardware we use DMP and not > Powerpath and it works like a charm. > Unfortunately the SAN team and HP told us that for Solaris servers > incluing thoses with VxVM, we must use MPxIO otherwise they would not > support it, hence we used MPxIO. > > Now for the issue, the question is still : will 5.0 bypass the MPxIO > layer for error detection or is this functionality only implemented > starting at MP2 ? > The idea is to be sure that this is a fast recovery issue and not > anything else. > > Cheers, > > Le 06/10/2010 23:02, Christian Gerbrandt a écrit : >> We support several 3rd party multipathing solutions, like MPxIO or >> EMCs PowerPath. >> However, MPxIO is only supported on Sun branded Storages. >> DMP has also been known to outperform other solutions in certain >> configurations. >> >> When a 3rd party multipathing is in use, DMP will fail back into TPD >> mode (Third Party Driver), and let the underlaying multipathing do >> its job. >> That's when you see just a single disk in VxVM, when you know you >> have more than one path per disk. 
>>
>> I would recommend installing the 5.0 MP3 RP4 patch, and then checking
>> again whether MPxIO is still misbehaving.
>> Or, ideally, switch over to DMP.
>>
>> -----Original Message-----
>> From: veritas-vx-boun...@mailman.eng.auburn.edu
>> [mailto:veritas-vx-boun...@mailman.eng.auburn.edu] On Behalf Of Victor Engle
>> Sent: 06 October 2010 20:48
>> To: Ashish Yajnik
>> Cc: sebastien.daubi...@atosorigin.com; "undisclosed-recipients:,
>> "@mailman.eng.auburn.edu; Veritas-vx@mailman.eng.auburn.edu
>> Subject: Re: [Veritas-vx] Solaris-SFS / MPxIO / VxVM failover issue
>>
>> This is absolutely false!
>>
>> MPxIO is an excellent multipathing solution and is supported by all
>> major storage vendors, including HP. The issue discussed in this
>> thread has to do with improper behavior of DMP when multipathing is
>> managed by a native layer like MPxIO.
>>
>> Storage and OS vendors have no motivation to lock you into a Veritas
>> solution.
>>
>> Or, Ashish, are you saying that Symantec is locking Symantec
>> customers into DMP? Hitachi, EMC, NetApp, and HP all have supported
>> configurations which include VxVM and native OS multipathing stacks.
>>
>> Thanks,
>> Vic
>>
>>
>> On Wed, Oct 6, 2010 at 1:26 PM, Ashish Yajnik
>> <ashish_yaj...@symantec.com> wrote:
>>> MPxIO with VxVM is only supported with Sun storage. If you run into
>>> problems with MPxIO and SF on an XP24K, then support will not be able
>>> to help you. I would recommend using DMP with the XP24K.
>>>
>>> Ashish
>>> --------------------------
>>> Sent using BlackBerry
>>>
>>>
>>> ----- Original Message -----
>>> From: veritas-vx-boun...@mailman.eng.auburn.edu
>>> <veritas-vx-boun...@mailman.eng.auburn.edu>
>>> To: Sebastien DAUBIGNE <sebastien.daubi...@atosorigin.com>;
>>> undisclosed-recipients
>>> <"undisclosed-recipients:;"@mailman.eng.auburn.edu>
>>> Cc: Veritas-vx@mailman.eng.auburn.edu
>>> <Veritas-vx@mailman.eng.auburn.edu>
>>> Sent: Wed Oct 06 10:08:08 2010
>>> Subject: Re: [Veritas-vx] Solaris-SFS / MPxIO / VxVM failover issue
>>>
>>> Hi Sebastien,
>>>
>>> In your first mail you mentioned that you are using MPxIO to control
>>> the XP24K array. Why are you using MPxIO here?
>>>
>>> Thanks,
>>> Venkata Sreenivasarao Nagineni,
>>> Symantec
>>>
>>>> -----Original Message-----
>>>> From: veritas-vx-boun...@mailman.eng.auburn.edu
>>>> [mailto:veritas-vx-boun...@mailman.eng.auburn.edu] On Behalf Of Sebastien DAUBIGNE
>>>> Sent: Wednesday, October 06, 2010 9:32 AM
>>>> To: undisclosed-recipients
>>>> Cc: Veritas-vx@mailman.eng.auburn.edu
>>>> Subject: Re: [Veritas-vx] Solaris-SFS / MPxIO / VxVM failover issue
>>>>
>>>> Hi,
>>>>
>>>> I come back to my dmp_fast_recovery issue (VxDMP fails the path
>>>> before MPxIO gets a chance to fail over to the alternate path).
>>>> As stated previously, I am running 5.0GA, and this tunable is not
>>>> supported in that release. However, I still don't know whether VxVM
>>>> 5.0GA silently bypasses the MPxIO stack for error recovery.
>>>>
>>>> Now I am trying to determine whether upgrading to MP3 will resolve
>>>> this issue (which occurred rarely).
>>>>
>>>> Could anyone (maybe Joshua?) explain whether the behaviour of 5.0GA
>>>> without the tunable is functionally identical to dmp_fast_recovery=0
>>>> or dmp_fast_recovery=1? Maybe the mechanism was implemented in 5.0
>>>> without the option to disable it (this could explain my issue)?
>>>>
>>>> Joshua, you mentioned another tunable for 5.0, but looking at the
>>>> list I can't identify the corresponding tunable:
>>>>
>>>> > vxdmpadm gettune all
>>>>             Tunable             Current Value   Default Value
>>>> ------------------------------  -------------   -------------
>>>> dmp_failed_io_threshold             57600           57600
>>>> dmp_retry_count                         5               5
>>>> dmp_pathswitch_blks_shift              11              11
>>>> dmp_queue_depth                        32              32
>>>> dmp_cache_open                         on              on
>>>> dmp_daemon_count                       10              10
>>>> dmp_scsi_timeout                       30              30
>>>> dmp_delayq_interval                    15              15
>>>> dmp_path_age                            0             300
>>>> dmp_stat_interval                       1               1
>>>> dmp_health_time                         0              60
>>>> dmp_probe_idle_lun                     on              on
>>>> dmp_log_level                           4               1
>>>>
>>>> Cheers.
>>>>
>>>>
>>>> On 16/09/2010 16:50, Joshua Fielden wrote:
>>>>> dmp_fast_recovery is a mechanism by which we bypass the sd/scsi
>>>>> stack and send path inquiry/status CDBs directly from the HBA, in
>>>>> order to bypass long SCSI queues and recover paths faster. With a
>>>>> TPD (third-party driver) such as MPxIO, bypassing the stack means
>>>>> we bypass the TPD completely, and interactions such as this one can
>>>>> happen. The vxesd (event-source daemon) is another 5.0/MP2 backport
>>>>> addition that's moot in the presence of a TPD.
>>>>> From your modinfo, you're not actually running MP3. This technote
>>>>> (http://seer.entsupport.symantec.com/docs/327057.htm) isn't exactly
>>>>> your scenario, but looking for partially-installed pkgs is a good
>>>>> start to getting your server correctly installed; then the tunable
>>>>> should work -- very early 5.0 versions had a differently-named
>>>>> tunable I can't find in my mail archive ATM.
>>>>> Cheers,
>>>>>
>>>>> Jf
>>>>>
>>>>> -----Original Message-----
>>>>> From: veritas-vx-boun...@mailman.eng.auburn.edu
>>>>> [mailto:veritas-vx-boun...@mailman.eng.auburn.edu] On Behalf Of Sebastien DAUBIGNE
>>>>> Sent: Thursday, September 16, 2010 7:41 AM
>>>>> To: Veritas-vx@mailman.eng.auburn.edu
>>>>> Subject: Re: [Veritas-vx] Solaris-SFS / MPxIO / VxVM failover issue
>>>>>
>>>>> Thank you Victor and William, that seems to be a very good lead.
>>>>>
>>>>> Unfortunately, this tunable does not seem to be supported by the
>>>>> VxVM version installed on my system:
>>>>>
>>>>> > vxdmpadm gettune dmp_fast_recovery
>>>>> VxVM vxdmpadm ERROR V-5-1-12015 Incorrect tunable
>>>>> vxdmpadm gettune [tunable name]
>>>>> Note - Tunable name can be dmp_failed_io_threshold, dmp_retry_count,
>>>>> dmp_pathswitch_blks_shift, dmp_queue_depth, dmp_cache_open,
>>>>> dmp_daemon_count, dmp_scsi_timeout, dmp_delayq_interval,
>>>>> dmp_path_age, or dmp_stat_interval
>>>>>
>>>>> Something is odd, because my version is 5.0 MP3 Solaris SPARC, and
>>>>> according to http://seer.entsupport.symantec.com/docs/316981.htm
>>>>> this tunable should be available.
>>>>>
>>>>> > modinfo | grep -i vx
>>>>>  38 7846a000  3800e 288   1  vxdmp (VxVM 5.0-2006-05-11a: DMP Drive)
>>>>>  40 784a4000 334c40 289   1  vxio (VxVM 5.0-2006-05-11a I/O driver)
>>>>>  42 783ec71d    df8 290   1  vxspec (VxVM 5.0-2006-05-11a control/st)
>>>>> 296 78cfb0a2    c6b 291   1  vxportal (VxFS 5.0_REV-5.0A55_sol portal )
>>>>> 297 78d6c000 1b9d4f   8   1  vxfs (VxFS 5.0_REV-5.0A55_sol SunOS 5)
>>>>> 298 78f18000   a270 292   1  fdd (VxQIO 5.0_REV-5.0A55_sol Quick )
>>>>>
>>>>>
>>>>> On 16/09/2010 12:15, Victor Engle wrote:
>>>>>> Which version of Veritas? Version 4/2MP2 and version 5.x introduced
>>>>>> a feature called DMP fast recovery. It was probably supposed to be
>>>>>> called DMP fast fail, but "recovery" sounds better.
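Joshua's observation, that the modinfo string betrays a 5.0GA install rather than MP3, can be checked mechanically. The sketch below parses the vxdmp line quoted above; treating a "5.0-2006-*" build date as GA is an assumption based on Joshua's remark, not an official version test:

```shell
#!/bin/sh
# Sketch: pull the VxVM build string out of a captured modinfo line.
# The sample line is copied from the thread.
line=' 38 7846a000  3800e 288   1  vxdmp (VxVM 5.0-2006-05-11a: DMP Drive)'

# Capture everything between "(VxVM " and the following space/colon.
build=$(printf '%s\n' "$line" | sed -n 's/.*(VxVM \([^ :]*\).*/\1/p')
echo "vxdmp build: $build"

case $build in
  5.0-2006-*) echo "looks like a 5.0 GA build, not MP3" ;;
  *)          echo "not a 5.0 GA build string" ;;
esac
```

This would explain the V-5-1-12015 error above: on a GA build the dmp_fast_recovery tunable simply is not there to query.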
>>>>>> It is supposed to fail suspect paths more aggressively to speed up
>>>>>> failover. But when you only have one VxVM DMP path, as is the case
>>>>>> with MPxIO, and fast recovery fails that path, then you're in
>>>>>> trouble. In version 5.x it is possible to disable this feature.
>>>>>>
>>>>>> Google "DMP fast recovery".
>>>>>>
>>>>>> http://seer.entsupport.symantec.com/docs/307959.htm
>>>>>>
>>>>>> I can imagine there must have been some internal fights at
>>>>>> Symantec between product management and QA to get that feature
>>>>>> released.
>>>>>>
>>>>>> Vic
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 16, 2010 at 6:03 AM, Sebastien DAUBIGNE
>>>>>> <sebastien.daubi...@atosorigin.com> wrote:
>>>>>>> Dear Vx-addicts,
>>>>>>>
>>>>>>> We encountered a failover issue on this configuration:
>>>>>>>
>>>>>>> - Solaris 9 HW 9/05
>>>>>>> - SUN SAN (SFS) 4.4.15
>>>>>>> - Emulex HBA with the SUN generic driver (emlx)
>>>>>>> - VxVM 5.0-2006-05-11a
>>>>>>> - storage on an HP SAN (XP24K)
>>>>>>>
>>>>>>> Multipathing is managed by MPxIO (not VxDMP), because the SAN team
>>>>>>> and HP support imposed the Solaris native multipathing solution:
>>>>>>>
>>>>>>> VxVM ==> VxDMP ==> MPxIO ==> FCP ...
>>>>>>>
>>>>>>> We have 2 paths to the switch, linked to 2 paths to the storage,
>>>>>>> so the LUNs have 4 paths, with active/active support.
>>>>>>> Failover has been tested successfully by offlining each port
>>>>>>> successively on the SAN.
>>>>>>>
>>>>>>> We regularly have transient I/O errors (SCSI timeouts, I/O error
>>>>>>> retries with "Unit attention") due to SAN-side issues. Usually
>>>>>>> these errors are transparently handled by MPxIO/VxVM without
>>>>>>> impact on the applications.
>>>>>>>
>>>>>>> Now for the incident we encountered:
>>>>>>>
>>>>>>> One of the SAN ports was reset; consequently there were some
>>>>>>> transient I/O errors.
>>>>>>> The other SAN port was OK, so the MPxIO multipathing layer should
>>>>>>> have failed the I/O over to the other path, without transmitting
>>>>>>> the error to the VxDMP layer.
>>>>>>> For some reason it did not fail over the I/O before VxVM caught it
>>>>>>> as an unrecoverable I/O error, disabling the subdisk and
>>>>>>> consequently the filesystem.
>>>>>>>
>>>>>>> Note the "giving up" message from the SCSI layer at 06:23:03:
>>>>>>>
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x558 belonging to the dmpnode 288/0x60
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 288/0x60
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x538 belonging to the dmpnode 288/0x20
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x550 belonging to the dmpnode 288/0x18
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 288/0x20
>>>>>>> Sep  1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 288/0x18
>>>>>>> Sep  1 06:18:54 myserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g60060e80152777000001277700003794 (ssd165):
>>>>>>> Sep  1 06:18:54 myserver SCSI transport failed: reason 'tran_err': retrying command
>>>>>>> Sep  1 06:19:05 myserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g60060e80152777000001277700003794 (ssd165):
>>>>>>> Sep  1 06:19:05 myserver SCSI transport failed: reason 'timeout': retrying command
>>>>>>> Sep  1 06:21:57 myserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g60060e8015277700000127770000376d (ssd168):
>>>>>>> Sep  1 06:21:57 myserver SCSI transport failed: reason 'tran_err': retrying command
>>>>>>> Sep  1 06:22:45 myserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g60060e8015277700000127770000376d (ssd168):
>>>>>>> Sep  1 06:22:45 myserver SCSI transport failed: reason 'timeout': retrying command
>>>>>>> Sep  1 06:23:03 myserver scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g60060e80152777000001277700003787 (ssd166):
>>>>>>> Sep  1 06:23:03 myserver SCSI transport failed: reason 'timeout': giving up
>>>>>>> Sep  1 06:23:03 myserver vxio: [ID 539309 kern.warning] WARNING: VxVM vxio V-5-3-0 voldmp_errbuf_sio_start: Failed to flush the error buffer 300ce41c340 on device 0x1200000003a to DMP
>>>>>>> Sep  1 06:23:03 myserver vxio: [ID 771159 kern.warning] WARNING: VxVM vxio V-5-0-2 Subdisk mydisk_2-02 block 5935: Uncorrectable write error
>>>>>>> Sep  1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt 1 mesg 037: V-2-37: vx_metaioerr - vx_logbuf_clean - /dev/vx/dsk/mydg/vol1 file system meta data write error in dev/block 0/5935
>>>>>>> Sep  1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt 2 mesg 031: V-2-31: vx_disable - /dev/vx/dsk/mydg/vol1 file system disabled
>>>>>>> Sep  1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt 3 mesg 037: V-2-37: vx_metaioerr - vx_inode_iodone - /dev/vx/dsk/mydg/vol1 file system meta data write error in dev/block 0/265984
>>>>>>>
>>>>>>> It seems VxDMP got the I/O error at the same time as MPxIO: I
>>>>>>> thought MPxIO would have concealed the I/O error until failover had
>>>>>>> occurred, which is not the case.
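The ordering in that log is the telling part: VxDMP disabled its paths at 06:18:54, more than four minutes before the sd driver's final "giving up" at 06:23:03, which supports the reading that DMP acted on the transient errors instead of waiting behind MPxIO. A small awk sketch can condense such a log into a timeline; the two sample lines below are abridged copies of the messages above, not live output:

```shell
#!/bin/sh
# Sketch: condense /var/adm/messages-style output into a failure timeline,
# keeping the VxDMP path-disable events (V-5-0-112) and the final sd giveup.
log='Sep  1 06:18:54 myserver vxdmp: NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x558 belonging to the dmpnode 288/0x60
Sep  1 06:23:03 myserver scsi: WARNING: SCSI transport failed: reason timeout: giving up'

# Field 3 of each syslog line is the timestamp.
printf '%s\n' "$log" | awk '
    /V-5-0-112/ { print $3, "vxdmp disabled a path" }
    /giving up/ { print $3, "sd gave up retrying" }'
```

Run against the full log, the same filter makes the DMP-before-MPxIO sequencing visible at a glance.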
>>>>>>>
>>>>>>> As a workaround, I increased the VxDMP recoveryoption=fixedretry
>>>>>>> retrycount tunable from 5 to 20, to give MPxIO a chance to fail
>>>>>>> over before VxDMP fails the path, but I still don't understand
>>>>>>> why VxVM catches the SCSI errors.
>>>>>>>
>>>>>>> Any advice?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> Sebastien DAUBIGNE
>>>>>>> sebastien.daubi...@atosorigin.com - +33(0)5.57.89.31.09
>>>>>>> AtosOrigin Infogerance - AIS/D1/SudOuest/Bordeaux/IS-Unix
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Veritas-vx maillist - Veritas-vx@mailman.eng.auburn.edu
>>>>>>> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-vx
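The retrycount workaround trades failure-detection speed for failover headroom. As a rough back-of-the-envelope, assuming each fixed retry can wait up to dmp_scsi_timeout seconds (30 by default in the gettune output earlier in the thread; DMP's real error-recovery path is more involved, so treat these as loose upper bounds, not exact failover windows):

```shell
#!/bin/sh
# Sketch: worst-case window before DMP fails the path, under the
# simplifying assumption of retrycount * dmp_scsi_timeout.
dmp_scsi_timeout=30
for retrycount in 5 20; do
    window=$((retrycount * dmp_scsi_timeout))
    echo "retrycount=$retrycount -> up to ${window}s before DMP fails the path"
done
```

Going from 5 to 20 retries thus roughly quadruples the window MPxIO has to complete its own failover underneath DMP.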