ZFS snapshots are worth a look as an alternative hot backup approach.
A snapshot gives you the same recovery guarantees, provided your db log
shares the dataset with the data. It is also lightweight and can be used
for fairly frequent incremental backups, the modern way :-).
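
A rough sketch of what such a cycle could look like, assuming the data
files and the db log live in one dataset. The names dbpool/mysql and
backuppool/mysql and the snapshot names are made up for illustration, so
substitute your own. The script only prints the zfs commands and does not
run them:

```shell
#!/bin/sh
# Sketch only: prints the zfs commands for one snapshot-based backup cycle.
# DATASET and BACKUP are hypothetical names; substitute your own.
DATASET="${DATASET:-dbpool/mysql}"     # dataset holding both data files and db log
BACKUP="${BACKUP:-backuppool/mysql}"   # dataset that receives the backup stream

# backup_plan PREV NEW
#   PREV: name of the last snapshot already present on the backup side
#         (pass "" for the very first, full send)
#   NEW:  name of the snapshot to take now
backup_plan() {
    prev="$1"
    new="$2"
    # A snapshot is atomic for the dataset, so data and log are captured
    # at the same instant as long as they live in the same dataset.
    printf 'zfs snapshot %s@%s\n' "$DATASET" "$new"
    if [ -n "$prev" ]; then
        # Incremental stream: only blocks changed between PREV and NEW.
        printf 'zfs send -i %s@%s %s@%s | zfs receive %s\n' \
            "$DATASET" "$prev" "$DATASET" "$new" "$BACKUP"
    else
        # Full stream for the initial copy.
        printf 'zfs send %s@%s | zfs receive %s\n' "$DATASET" "$new" "$BACKUP"
    fi
}

# Print the plan when called with two arguments, e.g.: ./plan.sh monday tuesday
if [ "$#" -eq 2 ]; then
    backup_plan "$1" "$2"
fi
```

Look over the printed lines first; once they match your layout they can be
piped to sh on the db host. Restoring from such a snapshot is like
restarting the database after a crash, so it relies on the db's own log
recovery.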

On 6/5/08, Ethan Erchinger <[EMAIL PROTECTED]> wrote:
> Hello,
> We have a backup strategy that involves mapping LUNs between a given
> pair of hosts, and copying data from one of the LUNs (src) to another
> LUN (dest).  The src LUNs sit on a SAN device, sometimes multiple devices
> (zpool mirror).  The src LUN is running a MySQL database and typically
> will be running for weeks without issue.
>
> When we start the backup sequence, we map a previously unmapped LUN to
> the DB host and issue the following commands:
>
> root# cfgadm -al
> (sleep 10)
> root# luxadm probe
> (sleep 10)
> root# zpool import <pool_name>
>
> After importing we'll perform some minor IO on the dest LUN, such as
> adding a symlink, removing some old configuration files.  Then we'll
> start an ibbackup of that database from the src LUN to the dest LUN, and
> things go bad.
>
> It's not completely consistent, but sometimes the DB host will crash,
> sometimes we'll get chksum/read/write errors on the src LUN.  Looking at
> dmesg (when the host doesn't crash), we see the LUN paths all disappear
> and then reappear usually around 20 seconds later.  Example output
> below.  Each LUN has 2 paths out of the DB host and 4 paths on each
> storage device, across two separate SANs.
>
> Usually the host will crash when not running with a zpool mirror, which
> is apparently expected behavior in Sol10u4.
>
> These hosts are x86_64 servers, running Sol10u4, unpatched.  They use
> qlogic qla2342 HBAs, and the stock qlc driver.  They are using MPXIO,
> from what I can tell.
>
> If anyone has any tips on troubleshooting, or knows of things we are
> doing wrong, help would be appreciated.
>
> Thanks,
> Ethan
>
> =======================================================
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.warning] WARNING:
> /scsi_vhci/[EMAIL PROTECTED] (sd3):
> Jun  3 15:00:33 dbhost   Error for Command: write(10)
> Error Level: Retryable
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Requested
> Block: 186020890                 Error Block: 186020890
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Vendor:
> Pillar                             Serial Number:
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Sense Key: Unit
> Attention
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     ASC: 0x3f
> (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0
> Jun  3 15:00:33 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun  3 15:00:33 dbhost   Lun=2 for target=21f00 reappeared
> Jun  3 15:00:33 dbhost scsi: [ID 243001 kern.info]       Target 0x21f00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.warning] WARNING:
> /scsi_vhci/[EMAIL PROTECTED] (sd3):
> Jun  3 15:00:33 dbhost   Error for Command: write(10)
> Error Level: Retryable
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Requested
> Block: 186020890                 Error Block: 186020890
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Vendor:
> Pillar                             Serial Number:
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     Sense Key: Unit
> Attention
> Jun  3 15:00:33 dbhost scsi: [ID 107833 kern.notice]     ASC: 0x3f
> (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0
> Jun  3 15:00:33 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun  3 15:00:33 dbhost   Lun=2 for target=11f00 reappeared
> Jun  3 15:00:33 dbhost scsi: [ID 243001 kern.info]       Target 0x11f00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:33 dbhost scsi: [ID 799468 kern.info] sd6 at scsi_vhci0:
> name g000b080084001453, bus address g000b080084001453
> Jun  3 15:00:33 dbhost genunix: [ID 936769 kern.info] sd6 is
> /scsi_vhci/[EMAIL PROTECTED]
> Jun  3 15:00:33 dbhost genunix: [ID 408114 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) online
> Jun  3 15:00:33 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: degraded, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2300000b08040e40,2 is online Load balancing: round-robin
> Jun  3 15:00:34 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2100000b08040e40,2 is online Load balancing: round-robin
> Jun  3 15:00:37 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun  3 15:00:37 dbhost   Lun=2 for target=11e00 reappeared
> Jun  3 15:00:37 dbhost scsi: [ID 243001 kern.info]       Target 0x11e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:37 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2200000b08040e40,2 is online Load balancing: round-robin
> Jun  3 15:00:42 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun  3 15:00:42 dbhost   Lun=3 for target=10e00 disappeared
> Jun  3 15:00:42 dbhost scsi: [ID 243001 kern.info]       Target 0x10e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:42 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun  3 15:00:42 dbhost   offlining lun=3 (trace=0), target=10e00
> (trace=b10101)
> Jun  3 15:00:42 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) multipath status: degraded, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2200000b08040e20,3 is offline Load balancing: round-robin
> Jun  3 15:00:47 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun  3 15:00:47 dbhost   Lun=2 for target=21e00 reappeared
> Jun  3 15:00:47 dbhost scsi: [ID 243001 kern.info]       Target 0x21e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:47 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2400000b08040e40,2 is online Load balancing: round-robin
> Jun  3 15:00:52 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun  3 15:00:52 dbhost   Lun=3 for target=20e00 disappeared
> Jun  3 15:00:52 dbhost scsi: [ID 243001 kern.info]       Target 0x20e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun  3 15:00:52 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun  3 15:00:52 dbhost   offlining lun=3 (trace=0), target=20e00
> (trace=b10101)
> Jun  3 15:00:52 dbhost genunix: [ID 408114 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) offline
> Jun  3 15:00:52 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) multipath status: failed, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL 
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2400000b08040e20,3 is offline Load balancing: round-robin
>
>


-- 
Regards,
Andrey
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
