Re: [zfs-discuss] after controller crash,

2008-10-30 Thread Oleg Muravskiy
I had a problem with one of the labels on a disk being unavailable. I was able to 
recover the label (according to zdb -l) by doing an export/import, but the disk was 
still unavailable. Only a scrub removed the UNAVAILABLE status from the disk.
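
For anyone searching the archives later, this is roughly how I checked the labels
(the device path below is just an example; use the one shown by 'zpool status'):

  zdb -l /dev/rdsk/c1t1d0s0   # dumps the four ZFS labels; after export/import all four unpacked again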


[zfs-discuss] Solution: Recover after disk labels failure

2008-10-21 Thread Oleg Muravskiy
I recovered the pool by doing an export, an import, and a scrub.

Apparently you can export a pool even with a FAILED device, and the import will restore 
the labels from their backup copies. The data errors are still there after the import, so 
you then need to scrub the pool. After all that, the filesystem is back with no 
errors or problems.
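
For the record, the whole sequence was essentially this ('tank' stands in for my
actual pool name):

  zpool export tank        # export works even with the FAILED device
  zpool import tank        # import rewrites the damaged labels from the backup copies
  zpool scrub tank         # the scrub then repairs the data errors reported after import
  zpool status -v tank     # verify the scrub finished and no errors remain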

It would be nice if the documentation mentioned this, namely that before trying to 
replace disks or restore from backups, you could try an export/import.

Also, it is not clear what 'zpool clear' actually clears (what a nice use of the 
word clear!). It does not clear the data errors recorded within the pool. In my 
case they were registered when I tried to read data from the pool while one device 
was marked as FAILED (when in fact only the label was corrupted; the data itself was 
OK), and they disappeared only after the scrub.
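
To illustrate the difference as I understand it (pool name is again a placeholder):

  zpool clear tank         # resets device error counters and fault states...
  zpool status -v tank     # ...but in my case the list of files with data errors was still shown
  zpool scrub tank         # only after a full scrub did the data errors disappear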

So my thanks go to the people on the Internet who share their findings about ZFS, and 
to the ZFS developers who made such a robust system (I still think it's the best of 
all the [free] systems I have used).


Re: [zfs-discuss] Best practice recommendations for backing up to ZFS

2008-10-20 Thread Oleg Muravskiy
If you are already using rsync, I would run it on the server in daemon mode. There 
are also Windows clients that support the rsync protocol.
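
A minimal sketch of what I mean, with an example module name and paths (adjust to
your setup):

  # /etc/rsyncd.conf on the ZFS server
  [backup]
      path = /tank/backup
      read only = no

  rsync --daemon                    # start the rsync daemon on the server
  rsync -av /data/ server::backup/  # clients push over the rsync protocol (note the double colon)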


Re: [zfs-discuss] Recover after disk labels failure

2008-10-20 Thread Oleg Muravskiy
> Is there a way to recover from this problem? I'm
> pretty sure the data is still OK, it's just the labels
> that got corrupted by the controller or ZFS. :(

And this is confirmed by zdb, after a very long wait while it compared data and 
checksums: no data errors.
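
For reference, the check was along these lines ('tank' again a placeholder; the
traversal reads and checksums every block, hence the long wait):

  zdb -cc tank    # walk the whole pool and verify checksums of metadata and data blocks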


[zfs-discuss] Recover after disk labels failure

2008-10-19 Thread Oleg Muravskiy
I have a ZFS pool made of two vdevs, each using one whole physical disk, under 
OpenSolaris 2008.05.
The disks live on a NetCell SATA/RAID controller that has three ports (I planned to 
use three disks there and configure mirrors in ZFS), but as it turned out it can only 
present one or two logical disks to the system. So I decided to create a mirror of two 
identical drives on the controller, instead of mirroring in ZFS, for one of the 
vdevs in the pool.

I made a 'dd' dump of the whole disk of the second vdev, and created an array on the 
controller with a second disk to mirror it. After booting into Solaris, 'zpool 
status' reported that it could not use the disk because the label was missing or 
corrupted. I restored the dump made previously with dd, and 'zpool status' then 
reported the vdev in state FAILED, data corrupted, with ~100k files with errors. Upon 
reboot, the controller's BIOS reported that one of the disks in the mirror was failing 
and needed to be replaced.
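
The dump and restore were just plain dd of the whole device, roughly like this
(device path and file name are examples):

  dd if=/dev/rdsk/c2t1d0p0 of=/backup/vdev2.img bs=1024k   # dump the whole disk before touching the controller
  dd if=/backup/vdev2.img of=/dev/rdsk/c2t1d0p0 bs=1024k   # restore it afterwards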

So I rebuilt the array. After booting, ZFS still reports the disk as FAILED. An 
attempt to scrub crashes and reboots the system. Restoring the dump makes the array 
broken from the controller's point of view. It seems the controller stores some 
configuration information on the disk, and that conflicts with ZFS using the whole 
disk. This is an example of when using a whole disk for ZFS is not a good idea. I 
reverted to using a one-disk array on the controller, as it was when I created the 
ZFS filesystem, but that does not help either. 'zdb -l' fails to unpack labels 0 and 
1; labels 2 and 3 look OK to me, showing correct ZFS info. 'format' reports this disk 
as being part of an active ZFS pool (as it does for the other disk, which was part of 
the mirror and is now connected via USB). 'zpool replace' also does not want to 
replace it, because it thinks the second disk is part of an active pool.
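
For reference, what I was trying looks roughly like this (pool and device names are
placeholders):

  zdb -l /dev/rdsk/c2t1d0s0         # labels 0 and 1 fail to unpack, 2 and 3 look fine
  zpool replace tank c2t1d0 c3t0d0  # refuses, because the new disk looks like part of an active pool
  # 'zpool replace -f' might force it, but I'm not sure that is safe in this state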

Is there a way to recover from this problem? I'm pretty sure the data is still 
OK, it's just the labels that got corrupted by the controller or ZFS. :(