ZFS snapshots are worth a look as an alternative hot backup approach. A snapshot gives you the same recovery guarantees, provided your DB log shares a dataset with the data. It is also lightweight and can be used for fairly frequent incremental backups, the modern way :-).
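For illustration, here is a rough sketch of what such a snapshot-based backup could look like. The dataset names (tank/mysql, backup/mysql) and the snapshot tags are made up for the example and are not from your setup; it assumes the MySQL data files and InnoDB logs live in the one dataset being snapshotted. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: snapshot-based hot backup of a MySQL dataset.
# All names below are hypothetical placeholders.
DATASET="${DATASET:-tank/mysql}"   # dataset holding BOTH data files and DB logs
DEST="${DEST:-backup/mysql}"       # dataset on the backup pool
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands, 0 = execute them

# Print the command line in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $1"
    else
        sh -c "$1"
    fi
}

# Take a point-in-time snapshot of the dataset. $1 = snapshot tag.
take_snapshot() {
    run "zfs snapshot $DATASET@$1"
}

# Ship only the changes between two snapshots to the backup dataset.
# $1 = previous tag, $2 = new tag.
send_incremental() {
    run "zfs send -i $DATASET@$1 $DATASET@$2 | zfs receive $DEST"
}

take_snapshot "20080605"
send_incremental "20080604" "20080605"
```

Run it with DRY_RUN=0 to actually execute the commands. The incremental step assumes the backup dataset already holds the earlier snapshot from a previous full send; restoring from such a snapshot relies on the database's normal crash recovery when it starts up.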

On 6/5/08, Ethan Erchinger <[EMAIL PROTECTED]> wrote:
> Hello,
> We have a backup strategy that involves mapping LUNs between a given
> pair of hosts, and copying data from one of the LUNs (src) and another
> LUN (dest). The src LUNs sit a SAN device, sometimes multiple devices
> (zpool mirror). The src LUN is running a MySQL database and typically
> will be running for weeks without issue.
>
> When we start the backup sequence, we map a previously unmapped LUN to
> the DB host and issue the following commands:
>
> root# cfgadm -al
> (sleep 10)
> root# luxadm probe
> (sleep 10)
> root# zpool import <pool_name>
>
> After importing we'll perform some minor IO on the dest LUN, such as
> adding a symlink, removing some old configuration files. Then we'll
> start an ibbackup of that database from the src LUN to the dest LUN, and
> things go bad.
>
> It's not completely consistent, but sometimes the DB host will crash,
> sometimes we'll get chksum/read/write errors on the src LUN. Looking at
> dmesg (when the host doesn't crash), we see the LUNs paths all disappear
> and then reappear usually around 20 seconds later. Example output
> below. Each LUN has 2 paths out of the DB host and 4 paths on each
> storage device, across two separate SANs.
>
> Usually the host will crash when not running with a zpool mirror, which
> apparently in Sol10u4, it's expected behavior.
>
> These hosts are x86_64 servers, running Sol10u4, unpatched. They use
> qlogic qla2342 HBAs, and the stock qlc driver. They are using MPXIO,
> from what I can tell.
>
> If anyone has any tips on troubleshooting, or knows of things we are
> doing wrong, help would be appreciated.
>
> Thanks,
> Ethan
>
> =======================================================
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.warning] WARNING:
> /scsi_vhci/[EMAIL PROTECTED] (sd3):
> Jun 3 15:00:33 dbhost Error for Command: write(10)
> Error Level: Retryable
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Requested
> Block: 186020890 Error Block: 186020890
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Vendor:
> Pillar Serial Number:
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Sense Key: Unit
> Attention
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] ASC: 0x3f
> (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0
> Jun 3 15:00:33 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun 3 15:00:33 dbhost Lun=2 for target=21f00 reappeared
> Jun 3 15:00:33 dbhost scsi: [ID 243001 kern.info] Target 0x21f00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.warning] WARNING:
> /scsi_vhci/[EMAIL PROTECTED] (sd3):
> Jun 3 15:00:33 dbhost Error for Command: write(10)
> Error Level: Retryable
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Requested
> Block: 186020890 Error Block: 186020890
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Vendor:
> Pillar Serial Number:
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] Sense Key: Unit
> Attention
> Jun 3 15:00:33 dbhost scsi: [ID 107833 kern.notice] ASC: 0x3f
> (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0
> Jun 3 15:00:33 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun 3 15:00:33 dbhost Lun=2 for target=11f00 reappeared
> Jun 3 15:00:33 dbhost scsi: [ID 243001 kern.info] Target 0x11f00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:33 dbhost scsi: [ID 799468 kern.info] sd6 at scsi_vhci0:
> name g000b080084001453, bus address g000b080084001453
> Jun 3 15:00:33 dbhost genunix: [ID 936769 kern.info] sd6 is
> /scsi_vhci/[EMAIL PROTECTED]
> Jun 3 15:00:33 dbhost genunix: [ID 408114 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) online
> Jun 3 15:00:33 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: degraded, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2300000b08040e40,2 is online Load balancing: round-robin
> Jun 3 15:00:34 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2100000b08040e40,2 is online Load balancing: round-robin
> Jun 3 15:00:37 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun 3 15:00:37 dbhost Lun=2 for target=11e00 reappeared
> Jun 3 15:00:37 dbhost scsi: [ID 243001 kern.info] Target 0x11e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:37 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2200000b08040e40,2 is online Load balancing: round-robin
> Jun 3 15:00:42 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun 3 15:00:42 dbhost Lun=3 for target=10e00 disappeared
> Jun 3 15:00:42 dbhost scsi: [ID 243001 kern.info] Target 0x10e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:42 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fcp0):
> Jun 3 15:00:42 dbhost offlining lun=3 (trace=0), target=10e00
> (trace=b10101)
> Jun 3 15:00:42 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) multipath status: degraded, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED]/[EMAIL PROTECTED],0 (fp0) to target
> address: w2200000b08040e20,3 is offline Load balancing: round-robin
> Jun 3 15:00:47 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun 3 15:00:47 dbhost Lun=2 for target=21e00 reappeared
> Jun 3 15:00:47 dbhost scsi: [ID 243001 kern.info] Target 0x21e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:47 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd6) multipath status: optimal, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2400000b08040e40,2 is online Load balancing: round-robin
> Jun 3 15:00:52 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun 3 15:00:52 dbhost Lun=3 for target=20e00 disappeared
> Jun 3 15:00:52 dbhost scsi: [ID 243001 kern.info] Target 0x20e00:
> Nonzero peripheral qualifier: Device type=0x0 Peripheral qual=0x1
> Jun 3 15:00:52 dbhost scsi: [ID 243001 kern.info]
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fcp1):
> Jun 3 15:00:52 dbhost offlining lun=3 (trace=0), target=20e00
> (trace=b10101)
> Jun 3 15:00:52 dbhost genunix: [ID 408114 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) offline
> Jun 3 15:00:52 dbhost genunix: [ID 834635 kern.info]
> /scsi_vhci/[EMAIL PROTECTED] (sd5) multipath status: failed, path
> /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1077,[EMAIL
> PROTECTED],1/[EMAIL PROTECTED],0 (fp1) to target
> address: w2400000b08040e20,3 is offline Load balancing: round-robin
>

--
Regards,
Andrey
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
