[zfs-discuss] Backing up ZFS metadata

2012-08-24 Thread Scott Aitken
Hi all,

I know the easiest answer to this question is "don't do it in the first
place, and if you do, you should have a backup", however I'll ask it
regardless.

Is there a way to back up the ZFS metadata on each member device of a pool
to another device (possibly non-ZFS)?

I have recently read a discussion on this list about storing the working
metadata on separate, non-data devices (mirrored, I assume).  Is there a way
today to walk the metadata of an entire pool and save it somewhere?

The main motivation for the question is that I recently ruined a large raidz
pool by overwriting the start and end of two member disks (and possibly some
data).  I assume that if I could have restored the lost metadata, I could
have recovered most of the real data.
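For what it's worth, the four label copies each member carries live at fixed offsets: two in the first 512 KiB of the device and two in the last 512 KiB. So even without a pool-wide metadata walker, a crude safety net is to snapshot just those regions with dd. A minimal sketch, assuming a member image at a placeholder path whose size is a multiple of 256 KiB:

```shell
# Save the four ZFS vdev label copies of one pool member.
# L0/L1 occupy the first 512 KiB, L2/L3 the last 512 KiB; one label = 262144
# bytes (256 KiB).  DEV is a placeholder path -- substitute your member.
DEV=${1:-/tmp/member0.img}
# demo only: fabricate a 4 MiB scratch image when no real device is given
[ -e "$DEV" ] || dd if=/dev/zero of="$DEV" bs=262144 count=16 2>/dev/null
SIZE=$(wc -c < "$DEV")
dd if="$DEV" of="$DEV.labels-front" bs=262144 count=2 2>/dev/null
dd if="$DEV" of="$DEV.labels-back"  bs=262144 count=2 \
    skip=$(( SIZE / 262144 - 2 )) 2>/dev/null
```

Restoring would be the same two dd invocations with if/of swapped (plus conv=notrunc). This preserves only the labels, not the rest of the metadata tree, but it would have covered exactly the start-and-end overwrite described above.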

Thanks
Scott
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Corrupted pool: I/O error and Bad exchange descriptor

2012-07-16 Thread Scott Aitken
Hi all,

this is a follow-up to some help I was soliciting with my corrupted pool.

The short story is that, for various reasons, I can have no confidence in the
quality of the labels on 2 of the 5 drives in my RAIDZ array.

There is even a possibility that one drive has the label of another (a
mirroring accident).

Anyhoo, for some odd reason, the drives finally mounted (they are actually
drive images on another ZFS pool which I have snapshotted).

When I imported the pool, ZFS complained that two of the datasets would not
mount, but the remainder did.

It seems that small files read OK.  (Perhaps they are small enough to fit in
a single block, and hence are effectively mirrored rather than striped,
assuming my understanding of what happens to small files is correct.)
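That reading matches the raidz allocation arithmetic: a block costs its data sectors plus one parity sector per row of (ndisks - 1) data sectors, so a single-sector block really is stored as one data sector plus one parity sector, i.e. effectively mirrored. A quick sketch of the math (hypothetical helper; 512-byte sectors assumed, and the small padding real raidz adds is ignored):

```shell
# Total sectors raidz1 spends on one block: data sectors plus one parity
# sector per row of (ndisks - 1) data sectors.
rz1_sectors() {                      # args: block size in bytes, disks in vdev
    data=$(( ($1 + 511) / 512 ))
    rows=$(( (data + $2 - 2) / ($2 - 1) ))
    echo $(( data + rows ))
}
rz1_sectors 512 5       # 1 data + 1 parity sector  -> prints 2
rz1_sectors 131072 5    # 256 data + 64 parity sectors -> prints 320
```

So a tiny file touches only two spindles, while a 128 KiB block spreads over all five, which fits the symptom that small files read fine while large ones hit the damaged stripes.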

But on larger files I get:

root@openindiana-01:/ZP-8T-RZ1-01/incoming# cp httpd-error.log.zip /mnt2/
cp: reading `httpd-error.log.zip': I/O error

and on some directories:

root@openindiana-01:/ZP-8T-RZ1-01/usr# ls -al
ls: cannot access obj: Bad exchange descriptor
total 54
drwxr-xr-x  5 root root  5 2011-11-03 16:28 .
drwxr-xr-x 11 root root 11 2011-11-04 13:14 ..
??  ? ?? ?? obj
drwxr-xr-x 68 root root 83 2011-10-30 01:00 ports
drwxr-xr-x 22 root root 31 2011-09-25 02:00 src

Here is the zpool status output:

root@openindiana-01:/ZP-8T-RZ1-01# zpool status
 pool: ZP-8T-RZ1-01
state: DEGRADED
status: One or more devices has experienced an error resulting in data
   corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
   entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
 scan: scrub in progress since Sat Nov  5 23:57:46 2011
   112G scanned out of 6.93T at 6.24M/s, 318h17m to go
   305M repaired, 1.57% done
config:

        NAME                      STATE     READ WRITE CKSUM
        ZP-8T-RZ1-01              DEGRADED     0     0  356K
          raidz1-0                DEGRADED     0     0  722K
            12339070507640025002  UNAVAIL      0     0     0  was /dev/lofi/2
            /dev/lofi/5           DEGRADED     0     0     0  too many errors (repairing)
            /dev/lofi/4           DEGRADED     0     0     0  too many errors (repairing)
            /dev/lofi/3           DEGRADED     0     0 74.4K  too many errors (repairing)
            /dev/lofi/1           DEGRADED     0     0     0  too many errors (repairing)

All those errors may be caused by one disk actually owning the wrong label.
I'm not entirely sure.

Also, while it's complaining that /dev/lofi/2 is UNAVAIL, the device certainly
is available; it just probably isn't labelled with '12339070507640025002'.

I'd love to get some of my data back.  Any recovery is a bonus.

If anyone is keen, I have enabled SSH into the OpenIndiana box which I'm
using to try to recover the pool, so if you'd like to take a shot please let
me know.

Thanks in advance,
Scott


Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-16 Thread Scott Aitken
On Sat, Jun 16, 2012 at 09:58:40AM -0500, Gregg Wonderly wrote:
> 
> On Jun 16, 2012, at 9:49 AM, Scott Aitken wrote:
> 
> > On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
> >> Use 'dd' to replicate as much of lofi/2 as you can onto another device, 
> >> and then 
> >> cable that into place?
> >> 
> >> It looks like you just need to put a functioning, working, but not correct 
> >> device, in that slot so that it will import and then you can 'zpool 
> >> replace' the 
> >> new disk into the pool perhaps?
> >> 
> >> Gregg Wonderly
> >> 
> >> On 6/16/2012 2:02 AM, Scott Aitken wrote:
> >>> On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> >>>>> when you say remove the device, I assume you mean simply make it 
> >>>>> unavailable
> >>>>> for import (I can't remove it from the vdev).
> >>>> Yes, that's what I meant.
> >>>> 
> >>>>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >>>>>  pool: ZP-8T-RZ1-01
> >>>>>    id: 9952605666247778346
> >>>>>  state: FAULTED
> >>>>> status: One or more devices are missing from the system.
> >>>>> action: The pool cannot be imported. Attach the missing
> >>>>>         devices and try again.
> >>>>>    see: http://www.sun.com/msg/ZFS-8000-3C
> >>>>> config:
> >>>>>
> >>>>>         ZP-8T-RZ1-01              FAULTED  corrupted data
> >>>>>           raidz1-0                DEGRADED
> >>>>>             12339070507640025002  UNAVAIL  cannot open
> >>>>>             /dev/lofi/5           ONLINE
> >>>>>             /dev/lofi/4           ONLINE
> >>>>>             /dev/lofi/3           ONLINE
> >>>>>             /dev/lofi/1           ONLINE
> >>>>> 
> >>>>> It's interesting that even though 4 of the 5 disks are available, it 
> >>>>> still
> >>>>> can import it as DEGRADED.
> >>>> I agree that it's "interesting". Now someone really knowledgable will
> >>>> need to have a look at this. I can only imagine that somehow the
> >>>> devices contain data from different points in time, and that it's too
> >>>> far apart for the aggressive txg rollback that was added in PSARC
> >>>> 2009/479. Btw, did you try that? Try: zpool import -d /dev/lofi -FVX
> >>>> ZP-8T-RZ1-01.
> >>>> 
> >>> Hi again,
> >>> 
> >>> that got slightly further, but still no dice:
> >>> 
> >>> root@openindiana-01:/mnt#  zpool import -d /dev/lofi -FVX ZP-8T-RZ1-01
> >>> root@openindiana-01:/mnt# zpool list
> >>> NAME           SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
> >>> ZP-8T-RZ1-01      -      -      -      -      -  FAULTED  -
> >>> rpool         15.9G  2.17G  13.7G    13%  1.00x  ONLINE   -
> >>> root@openindiana-01:/mnt# zpool status
> >>>   pool: ZP-8T-RZ1-01
> >>>  state: FAULTED
> >>> status: One or more devices could not be used because the label is missing
> >>> or invalid.  There are insufficient replicas for the pool to 
> >>> continue
> >>> functioning.
> >>> action: Destroy and re-create the pool from
> >>> a backup source.
> >>>see: http://www.sun.com/msg/ZFS-8000-5E
> >>>   scan: none requested
> >>> config:
> >>> 
> >>>         NAME                      STATE    READ WRITE CKSUM
> >>>         ZP-8T-RZ1-01              FAULTED     0     0     1  corrupted data
> >>>           raidz1-0                ONLINE      0     0     6
> >>>             12339070507640025002  UNAVAIL     0     0     0  was /dev/lofi/2
> >>>             /dev/lofi/5           ONLINE      0     0     0
> >>>             /dev/lofi/4           ONLINE      0     0     0
> >>>             /dev/lofi/3           ONLINE      0     0     0
> >>>             /dev/lofi/1           ONLINE      0     0     0
> >>> 
> >>> root@openindiana-01:/mnt# zpool scrub ZP-8T-RZ1-01
> >>> cannot scrub 'ZP-8T-RZ1-01': pool is currently unavailable

Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-16 Thread Scott Aitken
On Sat, Jun 16, 2012 at 09:09:53AM -0500, Gregg Wonderly wrote:
> Use 'dd' to replicate as much of lofi/2 as you can onto another device, and 
> then 
> cable that into place?
> 
> It looks like you just need to put a functioning, working, but not correct 
> device, in that slot so that it will import and then you can 'zpool replace' 
> the 
> new disk into the pool perhaps?
> 
> Gregg Wonderly
> 
> On 6/16/2012 2:02 AM, Scott Aitken wrote:
> > On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> >>> when you say remove the device, I assume you mean simply make it 
> >>> unavailable
> >>> for import (I can't remove it from the vdev).
> >> Yes, that's what I meant.
> >>
> >>> root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >>>  pool: ZP-8T-RZ1-01
> >>>    id: 9952605666247778346
> >>>  state: FAULTED
> >>> status: One or more devices are missing from the system.
> >>> action: The pool cannot be imported. Attach the missing
> >>>         devices and try again.
> >>>    see: http://www.sun.com/msg/ZFS-8000-3C
> >>> config:
> >>>
> >>>         ZP-8T-RZ1-01              FAULTED  corrupted data
> >>>           raidz1-0                DEGRADED
> >>>             12339070507640025002  UNAVAIL  cannot open
> >>>             /dev/lofi/5           ONLINE
> >>>             /dev/lofi/4           ONLINE
> >>>             /dev/lofi/3           ONLINE
> >>>             /dev/lofi/1           ONLINE
> >>>
> >>> It's interesting that even though 4 of the 5 disks are available, it still
> >>> can import it as DEGRADED.
> >> I agree that it's "interesting". Now someone really knowledgable will
> >> need to have a look at this. I can only imagine that somehow the
> >> devices contain data from different points in time, and that it's too
> >> far apart for the aggressive txg rollback that was added in PSARC
> >> 2009/479. Btw, did you try that? Try: zpool import -d /dev/lofi -FVX
> >> ZP-8T-RZ1-01.
> >>
> > Hi again,
> >
> > that got slightly further, but still no dice:
> >
> > root@openindiana-01:/mnt#  zpool import -d /dev/lofi -FVX ZP-8T-RZ1-01
> > root@openindiana-01:/mnt# zpool list
> > NAME           SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
> > ZP-8T-RZ1-01      -      -      -      -      -  FAULTED  -
> > rpool         15.9G  2.17G  13.7G    13%  1.00x  ONLINE   -
> > root@openindiana-01:/mnt# zpool status
> >pool: ZP-8T-RZ1-01
> >   state: FAULTED
> > status: One or more devices could not be used because the label is missing
> >  or invalid.  There are insufficient replicas for the pool to 
> > continue
> >  functioning.
> > action: Destroy and re-create the pool from
> >  a backup source.
> > see: http://www.sun.com/msg/ZFS-8000-5E
> >scan: none requested
> > config:
> >
> >         NAME                      STATE    READ WRITE CKSUM
> >         ZP-8T-RZ1-01              FAULTED     0     0     1  corrupted data
> >           raidz1-0                ONLINE      0     0     6
> >             12339070507640025002  UNAVAIL     0     0     0  was /dev/lofi/2
> >             /dev/lofi/5           ONLINE      0     0     0
> >             /dev/lofi/4           ONLINE      0     0     0
> >             /dev/lofi/3           ONLINE      0     0     0
> >             /dev/lofi/1           ONLINE      0     0     0
> >
> > root@openindiana-01:/mnt# zpool scrub ZP-8T-RZ1-01
> > cannot scrub 'ZP-8T-RZ1-01': pool is currently unavailable
> >
> > Thanks for your tenacity Stefan.
> > Scott
> > ___
> > zfs-discuss mailing list
> > zfs-discuss@opensolaris.org
> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> >
> 
> 

Hi Greg,

lofi/2 is a dd of a real disk.  I am using disk images because I can roll
back, clone etc without using the original drives (which are long gone
anyway).

I have tried making /2 unavailable for import, and zfs just moans that it
can't be opened.  It fails to import even though only one disk of the RAIDZ
array is missing.

Scott




Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-16 Thread Scott Aitken
On Sat, Jun 16, 2012 at 08:54:05AM +0200, Stefan Ring wrote:
> > when you say remove the device, I assume you mean simply make it unavailable
> > for import (I can't remove it from the vdev).
> 
> Yes, that's what I meant.
> 
> > root@openindiana-01:/mnt# zpool import -d /dev/lofi
> >  pool: ZP-8T-RZ1-01
> >    id: 9952605666247778346
> >  state: FAULTED
> > status: One or more devices are missing from the system.
> > action: The pool cannot be imported. Attach the missing
> >         devices and try again.
> >    see: http://www.sun.com/msg/ZFS-8000-3C
> > config:
> >
> >         ZP-8T-RZ1-01              FAULTED  corrupted data
> >           raidz1-0                DEGRADED
> >             12339070507640025002  UNAVAIL  cannot open
> >             /dev/lofi/5           ONLINE
> >             /dev/lofi/4           ONLINE
> >             /dev/lofi/3           ONLINE
> >             /dev/lofi/1           ONLINE
> >
> > It's interesting that even though 4 of the 5 disks are available, it still
> > can import it as DEGRADED.
> 
> I agree that it's "interesting". Now someone really knowledgable will
> need to have a look at this. I can only imagine that somehow the
> devices contain data from different points in time, and that it's too
> far apart for the aggressive txg rollback that was added in PSARC
> 2009/479. Btw, did you try that? Try: zpool import -d /dev/lofi -FVX
> ZP-8T-RZ1-01.
> 

Hi again,

that got slightly further, but still no dice:

root@openindiana-01:/mnt#  zpool import -d /dev/lofi -FVX ZP-8T-RZ1-01
root@openindiana-01:/mnt# zpool list
NAME           SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
ZP-8T-RZ1-01      -      -      -      -      -  FAULTED  -
rpool         15.9G  2.17G  13.7G    13%  1.00x  ONLINE   -
root@openindiana-01:/mnt# zpool status
  pool: ZP-8T-RZ1-01
 state: FAULTED
status: One or more devices could not be used because the label is missing
or invalid.  There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from
a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME                      STATE    READ WRITE CKSUM
        ZP-8T-RZ1-01              FAULTED     0     0     1  corrupted data
          raidz1-0                ONLINE      0     0     6
            12339070507640025002  UNAVAIL     0     0     0  was /dev/lofi/2
            /dev/lofi/5           ONLINE      0     0     0
            /dev/lofi/4           ONLINE      0     0     0
            /dev/lofi/3           ONLINE      0     0     0
            /dev/lofi/1           ONLINE      0     0     0

root@openindiana-01:/mnt# zpool scrub ZP-8T-RZ1-01
cannot scrub 'ZP-8T-RZ1-01': pool is currently unavailable

Thanks for your tenacity Stefan.
Scott


Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-15 Thread Scott Aitken
On Fri, Jun 15, 2012 at 10:54:34AM +0200, Stefan Ring wrote:
> >> Have you also mounted the broken image as /dev/lofi/2?
> >
> > Yep.
> 
> Wouldn't it be better to just remove the corrupted device? This worked
> just fine in my case.
>
 
Hi Stefan,

when you say remove the device, I assume you mean simply make it unavailable
for import (I can't remove it from the vdev).

This is what happens (lofi/2 is the drive which ZFS thinks has corrupted
data):

root@openindiana-01:/mnt# zpool import -d /dev/lofi
  pool: ZP-8T-RZ1-01
    id: 9952605666247778346
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        ZP-8T-RZ1-01              FAULTED  corrupted data
          raidz1-0                ONLINE
            12339070507640025002  UNAVAIL  corrupted data
            /dev/lofi/5           ONLINE
            /dev/lofi/4           ONLINE
            /dev/lofi/3           ONLINE
            /dev/lofi/1           ONLINE
root@openindiana-01:/mnt# lofiadm -d /dev/lofi/2
root@openindiana-01:/mnt# zpool import -d /dev/lofi
  pool: ZP-8T-RZ1-01
    id: 9952605666247778346
 state: FAULTED
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-3C
config:

        ZP-8T-RZ1-01              FAULTED  corrupted data
          raidz1-0                DEGRADED
            12339070507640025002  UNAVAIL  cannot open
            /dev/lofi/5           ONLINE
            /dev/lofi/4           ONLINE
            /dev/lofi/3           ONLINE
            /dev/lofi/1           ONLINE

So in the second import, it complains that it can't open the device, rather
than saying it has corrupted data.

It's interesting that even though 4 of the 5 disks are available, it still
can import it as DEGRADED.

Thanks again.
Scott


Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-15 Thread Scott Aitken
On Fri, Jun 15, 2012 at 07:37:50AM +0200, Stefan Ring wrote:
> > root@solaris-01:/mnt# zpool import -d /dev/lofi
> >  pool: ZP-8T-RZ1-01
> >    id: 9952605666247778346
> >  state: FAULTED
> > status: One or more devices contains corrupted data.
> > action: The pool cannot be imported due to damaged devices or data.
> >    see: http://www.sun.com/msg/ZFS-8000-5E
> > config:
> >
> >         ZP-8T-RZ1-01              FAULTED  corrupted data
> >           raidz1-0                ONLINE
> >             12339070507640025002  UNAVAIL  corrupted data
> >             /dev/lofi/5           ONLINE
> >             /dev/lofi/4           ONLINE
> >             /dev/lofi/3           ONLINE
> >             /dev/lofi/1           ONLINE
> 
> Have you also mounted the broken image as /dev/lofi/2?


Yep.  I first ran:

for foo in WCAZA1217278 WCAZA1262989 WCAZA1447179 WCAZA1583652 WCAZA1589216 ; \
do lofiadm -a $foo ; done 

(the WC* are the file names of each disk image).

root@solaris-01:/# ls -al /dev/lofi
total 21
drwxr-xr-x   7 root root   7 Jun 14 22:06 .
drwxr-xr-x 246 root sys  246 Jun 14 21:49 ..
lrwxrwxrwx   1 root root  29 Jun 14 22:06 1 -> ../../devices/pseudo/lofi@0:1
lrwxrwxrwx   1 root root  29 Jun 14 22:06 2 -> ../../devices/pseudo/lofi@0:2
lrwxrwxrwx   1 root root  29 Jun 14 22:06 3 -> ../../devices/pseudo/lofi@0:3
lrwxrwxrwx   1 root root  29 Jun 14 22:06 4 -> ../../devices/pseudo/lofi@0:4
lrwxrwxrwx   1 root root  29 Jun 14 22:06 5 -> ../../devices/pseudo/lofi@0:5

Clearly there's a disk with an incorrect label.  But how to reconstruct that
label is the problem.

Also, there are four drives of the five-drive RAIDZ available.  Based on what
criteria does ZFS decide that it is FAULTED and not DEGRADED?  Odd.

Thanks,
Scott

ps I'm downloading OpenIndiana now.

> 
> When I try to recreate your situation, it looks like this (as
> expected), where /dev/lofi/2 is just not present:
> 
> $ lofiadm
> Block Device File   Options
> /dev/lofi/1  /dpool/dump/temp/watched/raid1 -
> /dev/lofi/3  /dpool/dump/temp/watched/raid3 -
> /dev/lofi/4  /dpool/dump/temp/watched/raid4 -
> 
> $ sudo zpool import -d /dev/lofi
>    pool: lpool
>      id: 12540294359519404167
>   state: DEGRADED
>  status: One or more devices are missing from the system.
>  action: The pool can be imported despite missing or damaged devices.  The
>          fault tolerance of the pool may be compromised if imported.
>     see: http://illumos.org/msg/ZFS-8000-2Q
>  config:
> 
>         lpool            DEGRADED
>           raidz1-0       DEGRADED
>             /dev/lofi/1  ONLINE
>             /dev/lofi/2  UNAVAIL  cannot open
>             /dev/lofi/3  ONLINE
>             /dev/lofi/4  ONLINE
> 


Re: [zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-14 Thread Scott Aitken
On Thu, Jun 14, 2012 at 09:56:43AM +1000, Daniel Carosone wrote:
> On Tue, Jun 12, 2012 at 03:46:00PM +1000, Scott Aitken wrote:
> > Hi all,
> 
> Hi Scott. :-)
> 
> > I have a 5 drive RAIDZ volume with data that I'd like to recover.
> 
> Yeah, still..
> 
> > I tried using Jeff Bonwick's labelfix binary to create new labels but it
> > carps because the txg is not zero.
> 
> Can you provide details of invocation and error response?

# /root/labelfix /dev/lofi/1
assertion failed for thread 0xfecb2a40, thread-id 1: txg == 0, file label.c,
line 53
Abort (core dumped)

The assertion that fires ("txg == 0") is this one:
VERIFY(txg == 0);

Here is the entire labelfix code:

#include <devid.h>
#include <dirent.h>
#include <errno.h>
#include <libintl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <stddef.h>


#include <sys/vdev_impl.h>

/*
 * Write a label block with a ZBT checksum.
 */
static void
label_write(int fd, uint64_t offset, uint64_t size, void *buf)
{
   zio_block_tail_t *zbt, zbt_orig;
   zio_cksum_t zc;

   zbt = (zio_block_tail_t *)((char *)buf + size) - 1;
   zbt_orig = *zbt;

   ZIO_SET_CHECKSUM(&zbt->zbt_cksum, offset, 0, 0, 0);

   zio_checksum(ZIO_CHECKSUM_LABEL, &zc, buf, size);

   VERIFY(pwrite64(fd, buf, size, offset) == size);

   *zbt = zbt_orig;
}

int
main(int argc, char **argv)
{
   int fd;
   vdev_label_t vl;
   nvlist_t *config;
   uberblock_t *ub = (uberblock_t *)vl.vl_uberblock;
   uint64_t txg;
   char *buf;
   size_t buflen;

   VERIFY(argc == 2);
   VERIFY((fd = open(argv[1], O_RDWR)) != -1);
   VERIFY(pread64(fd, &vl, sizeof (vdev_label_t), 0) ==
   sizeof (vdev_label_t));
   VERIFY(nvlist_unpack(vl.vl_vdev_phys.vp_nvlist,
   sizeof (vl.vl_vdev_phys.vp_nvlist), &config, 0) == 0);
   VERIFY(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_TXG, &txg) == 0);
   VERIFY(txg == 0);
   VERIFY(ub->ub_txg == 0);
   VERIFY(ub->ub_rootbp.blk_birth != 0);

   txg = ub->ub_rootbp.blk_birth;
   ub->ub_txg = txg;

   VERIFY(nvlist_remove_all(config, ZPOOL_CONFIG_POOL_TXG) == 0);
   VERIFY(nvlist_add_uint64(config, ZPOOL_CONFIG_POOL_TXG, txg) == 0);
   buf = vl.vl_vdev_phys.vp_nvlist;
   buflen = sizeof (vl.vl_vdev_phys.vp_nvlist);
   VERIFY(nvlist_pack(config, &buf, &buflen, NV_ENCODE_XDR, 0) == 0);

   label_write(fd, offsetof(vdev_label_t, vl_uberblock),
   1ULL << UBERBLOCK_SHIFT, ub);

   label_write(fd, offsetof(vdev_label_t, vl_vdev_phys),
   VDEV_PHYS_SIZE, &vl.vl_vdev_phys);

   fsync(fd);

   return (0);
}

> 
> For the benefit of others, this was at my suggestion; I've been
> discussing this problem with Scott for.. some time. 
> 
> > I can also make the solaris machine available via SSH if some wonderful
> > person wants to poke around. 
> 
> Will take a poke, as discussed.  May well raise more discussion here
> as a result.
> 
> --
> Dan.




[zfs-discuss] Recovery of RAIDZ with broken label(s)

2012-06-11 Thread Scott Aitken
Hi all,

I have a 5 drive RAIDZ volume with data that I'd like to recover.

The long story runs roughly:

1) The volume was running fine under FreeBSD on motherboard SATA controllers.
2) Two drives were moved to a HP P411 SAS/SATA controller
3) I *think* the HP controllers wrote some volume information to the end of
each disk (hence no more ZFS labels 2 and 3).
4) In its "auto configuration" wisdom, the HP controller built a mirrored
volume using the two drives (and I think started the actual mirroring
process).  Hence at least one of the drives carries copied labels 0 and 1.
5) From there everything went downhill.
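The damage pattern in steps 3 and 4 lines up with where ZFS keeps its label copies: L0/L1 at the front of each member, L2/L3 in the last 512 KiB, so clobbering both ends of a disk wipes all four. The offsets are simple to compute (sketch; assumes the member size is a multiple of 256 KiB):

```shell
# Byte offsets of the four vdev label copies on a member of SIZE bytes.
# One label is 262144 bytes (256 KiB): L0/L1 at the front, L2/L3 at the back.
SIZE=$(( 1024 * 1024 * 1024 ))       # example: a 1 GiB member image
L=262144
echo "L0=0 L1=$L L2=$(( SIZE - 2 * L )) L3=$(( SIZE - L ))"
# -> L0=0 L1=262144 L2=1073217536 L3=1073479680
```

Losing only the back pair (the HP metadata write) would still leave two importable labels; it is the combination with the copied front labels that leaves no trustworthy copy.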

This happened a while back, and so the exact order of things (including my
botched attempts at recovery) is hazy.

I tried using Jeff Bonwick's labelfix binary to create new labels but it
carps because the txg is not zero.

The situation now is I have dd'd the drives onto a NAS.  These images are
shared via NFS to a VM running Oracle Solaris 11 11/11 X86.

When I attempt to import the pool I get:

root@solaris-01:/mnt#  zpool import -d /dev/lofi
  pool: ZP-8T-RZ1-01
    id: 9952605666247778346
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        ZP-8T-RZ1-01              FAULTED  corrupted data
          raidz1-0                ONLINE
            12339070507640025002  UNAVAIL  corrupted data
            /dev/lofi/5           ONLINE
            /dev/lofi/4           ONLINE
            /dev/lofi/3           ONLINE
            /dev/lofi/1           ONLINE

I'm not sure why I can't import although 4 of the 5 drives are "ONLINE".

Can anyone please point me to a next step?

I can also make the solaris machine available via SSH if some wonderful
person wants to poke around.  If I lose the data that's ok, but it'd be nice
to know all avenues were tried before I delete the 9TB of images (I need the
space...)

Many thanks,
Scott
zfs-list at thismonkey dot com