Those errors you reportedly see when reading a zfs snapshot with
ibbackup are peculiar. zfs checksums data on write and verifies it
when read back, so either ibbackup has corruption issues of its own or
you are reading the data back in some way that bypasses the zfs
verification path. Alternatively, you may be hitting a zfs issue as
well, in which case I suggest you submit a report to zfs-code.
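
One quick way to tell which side the corruption is on is to scrub the
pool and see whether zfs itself reports checksum errors. A rough
sketch (the pool name "tank" is a placeholder for yours):

```shell
# A scrub re-reads every allocated block in the pool and verifies its
# checksum -- the same check zfs applies on a normal read.
zpool scrub tank

# Check the CKSUM column and the error summary; non-zero counts mean
# zfs itself saw bad data coming back from the devices.
zpool status -v tank
```

If the scrub comes back clean, the bad data ibbackup sees is more
likely being introduced somewhere after zfs hands it off.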

On 6/5/08, Ethan Erchinger <[EMAIL PROTECTED]> wrote:
> Hi James,
> also inline.
>
> James C. McPherson wrote:
>>
>> Hi Ethan,
>> responses inline below
>>
>> Ethan Erchinger wrote:
>>> Hello,
>>> We have a backup strategy that involves mapping LUNs between a given
>>> pair of hosts, and copying data from one of the LUNs (src) to
>>> another LUN (dest).  The src LUNs sit on a SAN device, sometimes
>>> multiple devices (zpool mirror).  The src LUN is running a MySQL
>>> database and typically will be running for weeks without issue.
>>
>> I'm sorry, I don't quite understand how this can be a serious
>> "backup strategy" - how on earth did you get to thinking that
>> it was going to work reliably?
> Well, this is not the final destination of the backup.  This is a
> method of taking periodic snapshots between the primary and secondary
> hosts of a replication pair.  We take the completed ibbackup and
> stream it to tape afterwards, then do the equivalent of a restore on
> the dest LUN.  It actually is pretty reliable.  We do this weekly,
> mainly because we don't trust MySQL replication; it's somewhat error
> prone.  While we may have issues with our implementation, I don't
> believe the strategy to be faulty at its core.
>>
>>
>>> When we start the backup sequence, we map a previously unmapped LUN
>>> to the DB host and issue the following commands:
>>>
>>> root# cfgadm -al
>>> (sleep 10)
>>> root# luxadm probe
>>> (sleep 10)
>>> root# zpool import <pool_name>
>>
>> You're kidding, right? Have you RTFMd the cfgadm_fp(1M) manpage?
>> Ever thought about running something similar to
>>
>>
>> # cfgadm -c configure c$X::$target-pwwn
> Well, no, not kidding.  I believe that, yes, this may be one of the
> main issues with our system.  We have read quite a bit of
> documentation, and normally this works pretty darn well.  Doing a
> configure, and I can only assume an unconfigure prior to remapping
> the LUN, is the proper procedure?  We believed that doing a configure
> was having little to no effect, because in the cfgadm -al output, the
> condition (on a configured LUN) is "unknown".  According to the
> manpage, that can very well mean that the configure command will have
> zero effect:
> """
>          configure       Configure  a  connected  Fibre   Channel
>                          Fabric  device  to  a host. When a Fibre
>                          Channel device is listed as  an  unknown
>                          type in the output of the list operation
>                          the device might not be configurable. No
>                          attempt  is  made  to  configure devices
>                          with unknown types.
> """
>>
>>
>>> After importing we'll perform some minor IO on the dest LUN, such as
>>> adding a symlink, removing some old configuration files.  Then we'll
>>> start an ibbackup of that database from the src LUN to the dest LUN,
>>> and things go bad.
>>
>> Frankly, I'm surprised it takes this long for you to get to the
>> "things go bad" stage.
> I think that it's possible that having an improper "configure" stage
> from above is causing things to go bad.
>>
>>
>>> It's not completely consistent, but sometimes the DB host will crash,
>>> sometimes we'll get chksum/read/write errors on the src LUN.  Looking
>>> at dmesg (when the host doesn't crash), we see the LUNs paths all
>>> disappear and then reappear usually around 20 seconds later.  Example
>>> output below.  Each LUN has 2 paths out of the DB host and 4 paths on
>>> each storage device, across two separate SANs.
>>
>> You're yanking drives in- and out-of-view of your host, you're
>> doing so with zpool importing (and exporting?) and yet you still
>> want your database to be reliable.
> We are not attempting to yank the src LUN in and out.  Yes, removing
> the dest LUN from a host's view may inexplicably be causing other
> MPxIO inconsistencies, though.  We typically import/export zpools
> between hosts because ibbackup from Innobase (the recommended hot
> backup solution for InnoDB) cannot write over the network and must
> copy to a local directory.  Mapping LUNs between hosts is one method;
> NFS is another, and so on.  The typical 'Enterprise' backup solution
> didn't support hot backups on InnoDB until more recently.
>>
>>> Usually the host will crash when not running with a zpool mirror,
>>> which apparently in Sol10u4, it's expected behavior.
>>
>> Sorry, but no. What you're doing is creating inconsistencies in the
>> host's view of its storage. Don't blame Solaris for this, it's
>> actually trying to keep your data consistent.
> As mentioned, we _never_ remove (via wwn host masking) LUNs that are
> active from a ZFS perspective.  So consistency should not be compromised.
>>
>>> These hosts are x86_64 servers, running Sol10u4, unpatched.  They use
>>> qlogic qla2342 HBAs, and the stock qlc driver.  They are using MPXIO,
>>> from what I can tell.
>>
>> Yes, they're using MPxIO. You can tell that from the pathnames
>> such as /scsi_vhci/[EMAIL PROTECTED] - that's a dead giveaway.
> Good, that's what I thought.  I was not positive only because runs of
> mpathadm --cannot-remember didn't list our storage device as a
> supported MPxIO device; at least that's what I took the output to
> mean.
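> For reference, what I had run was along these lines (I can't recall
> the exact invocation above; these are the standard forms as I
> understand them):
>
> ```shell
> # List the multipathing plugins mpathadm knows about; on S10 this is
> # normally libmpscsi_vhci.so:
> mpathadm list mpath-support
>
> # Show that plugin's details, including the vendor/product strings
> # it claims to support -- our array did not appear here:
> mpathadm show mpath-support libmpscsi_vhci.so
>
> # Independently, list the logical units MPxIO is actually managing:
> mpathadm list lu
> ```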
>>
>> So ... _unpatched_ you say? _Why_ ? I know organisations generally
>> have rigorous patching methodologies and schedules, but fer cryin'
>> out loud, S10 Update 4 has been available since the middle of 2007.
>> That's very nearly 12 months old.
> As mentioned by Bob in another email, we have not been running into
> issues with u4, that we knew of.  We have begun upgrading to U5 and
> applying the latest patches; in fact, all of our secondary hosts have
> this completed, we just haven't had a downtime window available to
> get the primary systems upgraded.  We are aware that we should be
> upgrading, and we are working towards that goal.  We have also seen
> enough instability with software releases (in software in general)
> that we like some bake time in new releases, so we've been waiting a
> little bit on the U5 release.  Which, it turns out, was a good idea,
> given the memory leak in the qlc driver, fixed by patch 125165-10.
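> For anyone wanting to check their own systems, the installed patch
> revision should be visible with showrev, e.g.:
>
> ```shell
> # Lists all installed patches; grep for the qlc fix mentioned above.
> showrev -p | grep 125165
> ```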
>>
>>
>>> If anyone has any tips on troubleshooting, or knows of things we are
>>> doing wrong, help would be appreciated.
>>
>> Two major recommendations. Firstly, PATCH YOUR SYSTEM.
>> Secondly, design a backup methodology which doesn't rely
>> on playing the fool with your storage.
>>
>> Assuming that you're posting from your work email address,
>> _surely_ you could convince your management to implement
>> a backup strategy based around an enterprise-class backup
>> package such as NetBackup or Networker.
>>
>> You should also seriously consider getting a professional
>> services organisation (such as Sun's) to come in and help
>> you get your systems setup properly.
>>
> Thanks for your recommendations.
>


-- 
Regards,
Andrey
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
