Re: [zfs-discuss] Help! ZFS pool is UNAVAILABLE

2008-01-02 Thread Moore, Joe
I AM NOT A ZFS DEVELOPER.  These suggestions should work, but there
may be other people who have better ideas.

Aaron Berland wrote:
 Basically, I have a 3-drive raidz array on internal Seagate 
 drives, running Nevada build 64. I purchased 3 additional USB drives 
 with the intention of mirroring and then migrating the data 
 to the new USB drives.
 (snip)
 Below is my current zpool status.  Note the USB drives are 
 showing up as the same device.  They are plugged into 3 
 different ports, and they used to show up as different controllers??
 
 This whole thing was supposed to duplicate my data and give me 
 more redundancy, but now it looks like I could be losing it 
 all?!  I have some data backed up on other devices, but not all.
 
 NAME        STATE     READ WRITE CKSUM
 zbk         UNAVAIL      0     0     0  insufficient replicas
   raidz1    ONLINE       0     0     0
     c2d0p2  ONLINE       0     0     0
     c1d0    ONLINE       0     0     0
     c1d1    ONLINE       0     0     0
   raidz1    UNAVAIL      0     0     0  insufficient replicas
     c5t0d0  ONLINE       0     0     0
     c5t0d0  FAULTED      0     0     0  corrupted data
     c5t0d0  FAULTED      0     0     0  corrupted data

Ok, from here, we can see that you have a single pool, with two striped
components: a raidz set from c1 and c2 disks, and the (presumably new)
raidz set from c5 -- I'm guessing this is where the USB disks show up.

Unfortunately, it is not possible to remove a top-level vdev from a ZFS
pool.

On the bright side, it might be possible to trick it, at least for long
enough to get the data back.

First, we'll want to get the system booted.  You'll connect the USB
devices, but DON'T try to do anything with your pool (especially don't
put more data on it).

You should then be able to get a consistent pool up and running -- the
devices will be scanned, detected, and automatically re-enabled.  You
might have to do a zpool import to search all of the /dev/dsk/
devices.
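Sketched as commands, the boot-and-import step might look like this
(the pool name zbk comes from the status output above; everything else
here is the standard zpool workflow, not something Aaron posted):

```shell
# After booting with the USB drives attached, see whether the pool
# came back on its own:
zpool status zbk

# If not, ask ZFS to re-scan the devices under /dev/dsk for pool
# labels; with no pool name this just lists what is importable:
zpool import

# If zbk shows up as importable, bring it in (-f forces the import
# if the pool looks like it was last used by another system):
zpool import -f zbk
```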

From there, pull out one of the USB drives and do a zpool scrub to
resilver the degraded raidz group.  Now, wipe off the removed USB disk
(format it with ufs or something... it just needs to lose its ZFS
labels.  And while we're at it, ufs is probably a good choice anyway,
given the next step(s)).  One of the disks in the pool will show
FAULTED at this point; I'll call it c5t2d0.
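The scrub-and-wipe step as commands -- note the device and slice names
below are guesses for illustration (a re-attached USB disk may
enumerate anywhere), and the mount point is the one used later in this
message:

```shell
# With one USB drive unplugged, re-sync the degraded raidz group:
zpool scrub zbk

# Put a UFS filesystem on the removed disk to destroy its ZFS labels
# (assuming it re-appears as c5t1d0 and you use slice 0 -- adjust to
# your disk's actual layout):
newfs /dev/rdsk/c5t1d0s0

# Mount it; this is where the image files and backup will live:
mkdir -p /mnt/theUSBdisk
mount /dev/dsk/c5t1d0s0 /mnt/theUSBdisk
```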

Now, mount up that extra disk, and run mkfile -n 500g
/mnt/theUSBdisk/disk1.img.  (The -n flag makes mkfile create a sparse
file rather than writing out 500GB of zeros.)
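The sparse-file trick works because the file's apparent size and its
allocated size differ.  A small, safe-to-run demonstration of the idea
(truncate is the portable equivalent of Solaris mkfile -n; the /tmp
path and sizes are mine, not from the thread):

```shell
# Create a sparse file: large apparent size, almost no blocks allocated.
truncate -s 500M /tmp/disk1.img

# Apparent size in bytes (what ZFS would see as the vdev size):
ls -l /tmp/disk1.img

# Blocks actually allocated -- near zero until data is written:
du -k /tmp/disk1.img

rm /tmp/disk1.img
```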

Then do a zpool replace zbk c5t2d0 /mnt/theUSBdisk/disk1.img

Then you can also replace the other 2 USB disks with other .img files
too... as long as the total data written to this raidz group doesn't
exceed the actual size of the disk holding the images, you'll be OK.
At that point, back up your data -- note zfs send operates on a
snapshot (zfs send zbk@backup | bzip2 -9 > /mnt/theUSBdisk/backup.dat).
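Putting the replace-and-backup steps together (pool, device, and path
names are taken from above; the snapshot name is my invention, since
zfs send needs a snapshot to operate on):

```shell
# Replace the faulted USB disk with the sparse image file; ZFS
# accepts a plain file as a vdev when given an absolute path:
zpool replace zbk c5t2d0 /mnt/theUSBdisk/disk1.img

# Repeat for the other USB disks with disk2.img, disk3.img, then
# snapshot everything and stream it out while the pool is healthy:
zfs snapshot -r zbk@rescue
zfs send zbk@rescue | bzip2 -9 > /mnt/theUSBdisk/backup.dat
```

If the pool has child filesystems, each one needs its own zfs send of
its @rescue snapshot (or an incremental/recursive scheme); the single
send above covers only the top-level filesystem.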

--Joe
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! ZFS pool is UNAVAILABLE

2008-01-02 Thread Richard Elling
Moore, Joe wrote:
 (snip)

  Now, mount up that extra disk, and run mkfile -n 500g
  /mnt/theUSBdisk/disk1.img.  (The -n flag makes mkfile create a
  sparse file.)

Be careful here: if your USB disk is smaller than 500g (likely), then
you won't be able to later replace this disk1.img file with the smaller
USB disk.  You will need to make sure the disk1.img file is the same
size as the USB disk.  Since USB disks are often slightly different
sizes (!), this might get tricky.
[Yes, this would be fixed by the notorious shrink RFE.]
 -- richard




Re: [zfs-discuss] Help! ZFS pool is UNAVAILABLE

2008-01-02 Thread Aaron Berland
Hi Joe,

Thanks for trying.  I can't even get the pool online, because zpool
status shows 2 corrupted drives.  Yours and the other gentlemen's
insights have been very helpful, however!

I lucked out and realized that I did have copies of 90% of my data, so I am 
just going to destroy the pool and start over.  

I will have more questions in the future on how to best safeguard my pools.  

Thanks again for your help!

Aaron
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] Help! ZFS pool is UNAVAILABLE

2008-01-01 Thread Nigel Smith
It would be interesting to see the output from:

# zdb -v zbk

You can also use zdb to examine the labels on each of the disks.
Each disk has four copies of its label, for redundancy:
two at the start and two at the end of the disk.
Use a command similar to this:

# zdb -l /dev/dsk/c2d0p2

Presumably the labels have somehow become confused,
especially on your USB drives :-(
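To compare labels across all of the suspect devices in one pass,
a loop like this (device names copied from the zpool status output
earlier in the thread) should show whether the GUIDs and pool
configuration in each label agree:

```shell
# Dump the ZFS labels from each device; mismatched pool GUIDs or
# missing labels point at the confused disk(s):
for dev in c2d0p2 c1d0 c1d1 c5t0d0; do
    echo "=== $dev ==="
    zdb -l /dev/dsk/$dev
done
```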
Regards
Nigel Smith
 
 