Hi James,
also inline.

James C. McPherson wrote:

Hi Ethan,
responses inline below

Ethan Erchinger wrote:
Hello,
We have a backup strategy that involves mapping LUNs between a given pair of hosts and copying data from one LUN (src) to another LUN (dest). The src LUNs sit on a SAN device, sometimes multiple devices (zpool mirror). The src LUN is running a MySQL database and will typically run for weeks without issue.

I'm sorry, I don't quite understand how this can be a serious
"backup strategy" - how on earth did you get to thinking that
it was going to work reliably?
Well, this is not the final destination of the backup. This is a method of taking periodic snapshots between the primary and secondary hosts of a replication pair. We take the completed ibbackup and stream it to tape afterwards, then do the equivalent of a restore on the dest LUN. It actually is pretty reliable. We do this weekly, mainly because we don't trust MySQL replication; it's somewhat error-prone. While we may have issues with our implementation, I don't believe the strategy is faulty at its core.


When we start the backup sequence, we map a previously unmapped LUN to the DB host and issue the following commands:

root# cfgadm -al
(sleep 10)
root# luxadm probe
(sleep 10)
root# zpool import <pool_name>

You're kidding, right? Have you RTFMd the cfgadm_fp(1M) manpage?
Ever thought about running something similar to


# cfgadm -c configure c$X::$target-pwwn
Well, no, not kidding. I believe that, yes, this may be one of the main issues with our system. We have read quite a bit of documentation, and normally this works pretty darn well. Doing a configure, and I can only assume an unconfigure prior to remapping the LUN, is the proper procedure? We believed that doing a configure was having little to no effect, because in the cfgadm -al output the condition (on a configured LUN) is "unknown". According to the manpage, that can very well mean that the configure command will have zero effect.
"""
        configure       Configure  a  connected  Fibre   Channel
                        Fabric  device  to  a host. When a Fibre
                        Channel device is listed as  an  unknown
                        type in the output of the list operation
                        the device might not be configurable. No
                        attempt  is  made  to  configure devices
                        with unknown types.
"""


After importing we'll perform some minor IO on the dest LUN, such as adding a symlink and removing some old configuration files. Then we'll start an ibbackup of that database from the src LUN to the dest LUN, and things go bad.

Frankly, I'm surprised it takes this long for you to get to the
"things go bad" stage.
I think it's possible that the improper "configure" stage from above is causing things to go bad.


It's not completely consistent, but sometimes the DB host will crash, sometimes we'll get chksum/read/write errors on the src LUN. Looking at dmesg (when the host doesn't crash), we see the LUNs paths all disappear and then reappear usually around 20 seconds later. Example output below. Each LUN has 2 paths out of the DB host and 4 paths on each storage device, across two separate SANs.

You're yanking drives in- and out-of-view of your host, you're
doing so with zpool importing (and exporting?) and yet you still
want your database to be reliable.
We are not attempting to yank the src LUN in and out. Yes, removing the dest LUN from a host's view may inexplicably be causing other MPxIO inconsistencies, though. We typically import/export zpools between hosts because ibbackup from Innobase (the recommended hot-backup solution for InnoDB) cannot write over the network and must copy to a local directory. Mapping LUNs between hosts is one method; NFS is another, and so on. The typical 'Enterprise' backup solution didn't support hot backups of InnoDB until more recently.
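For context, the local-write constraint means each run looks roughly like the following (pool name and config file paths are illustrative, not our actual ones):

root# zpool import backup_pool
root# ibbackup /etc/mysql/my.cnf /etc/mysql/backup-my.cnf
root# zpool export backup_pool

where backup-my.cnf points the backup datadir at the imported pool's mountpoint, and the export hands the pool off for mapping to the other host.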

Usually the host will crash when not running with a zpool mirror, which, apparently, is expected behavior in Sol10u4.

Sorry, but no. What you're doing is creating inconsistencies in the
host's view of it storage. Don't blame Solaris for this, it's actually
trying to keep your data consistent.
As mentioned, we _never_ remove (via WWN host masking) LUNs that are active from a ZFS perspective, so consistency should not be compromised.

These hosts are x86_64 servers, running Sol10u4, unpatched. They use QLogic qla2342 HBAs and the stock qlc driver. They are using MPxIO, from what I can tell.

Yes, they're using MPxIO. You can tell that from the pathnames
such as /scsi_vhci/[EMAIL PROTECTED] - that's a dead giveaway.
Good, that's what I thought. I wasn't positive only because runs of mpathadm --cannot-remember didn't list our storage device as a supported MPxIO device; at least that's what I took the output to mean.
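From memory, what I ran was along these lines (the device path here is a made-up example, not our actual LUN):

root# mpathadm list lu
root# mpathadm show lu /dev/rdsk/c4t60003BA00000000000000000000000A1d0s2

and nothing in the output obviously named our array as a supported device. If the LUNs show up under mpathadm list lu at all, is that sufficient to confirm MPxIO is managing them?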

So ... _unpatched_ you say? _Why_ ? I know organisations generally
have rigorous patching methodologies and schedules, but fer cryin'
out loud, S10 Update 4 has been available since the middle of 2007.
That's very nearly 12 months old.
As mentioned by Bob in another email, we had not been running into issues with u4 that we knew of. We have begun upgrading to U5 and applying the latest patches; in fact, all of our secondary hosts have this completed, we just haven't had a downtime window available to get the primary systems upgraded. We are aware that we should be upgrading, and we are working towards that goal. We have also seen enough instability with software releases (in software in general) that we like some bake time on new releases, so we've been waiting a little bit on the U5 release. That turns out to have been a good idea, given the memory leak in the qlc driver, fixed by patch 125165-10.


If anyone has any tips on troubleshooting, or knows of things we are doing wrong, help would be appreciated.

Two major recommendations. Firstly, PATCH YOUR SYSTEM.
Secondly, design a backup methodology which doesn't rely
on playing the fool with your storage.

Assuming that you're posting from your work email address,
_surely_ you could convince your management to implement
a backup strategy based around an enterprise-class backup
package such as NetBackup or Networker.

You should also seriously consider getting a professional
services organisation (such as Sun's) to come in and help
you get your systems setup properly.

Thanks for your recommendations.
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
