Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-03 Thread Peter Jeremy
On 2010-May-03 23:59:17 +0800, Diogo Franco  wrote:
>I managed to get a livefs cd that had zfs14, but it was unable to import
>the zpool ("internal error: Illegal byte sequence"). The zpool does
>appear if I try to run `zpool import` though, as "tank FAULTED corrupted
>data", and ad6s1d is ONLINE.

That's not promising.

> There is no -F option on bsd's zpool import.

It was introduced around zfs20.  I feared it might be needed.
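
Once you are running tools that do have it, the recovery import (with
a dry run first) would look roughly like this (names taken from this
thread; -n only reports what a rewind would throw away):

# zpool import -nfF -R /mnt/tank tank
# zpool import -fF -R /mnt/tank tank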

>
>> This is almost certainly the problem.  ad6s1 may be the same as c5d0p1
>> but OpenSolaris isn't going to understand the FreeBSD partition label
>> on that slice.  All I can suggest is to (temporarily) change the disk
>> slicing so that there is a fdisk slice that matches ad6s1d.
>How could I do just that? I know that my label has a 1G UFS, 1G swap,
>and the rest is ZFS; but I don't know how to calculate the correct
>offset to give to 'format'. I can just regenerate the UFS later after
>the ZFS is fixed since it was only used for its /boot.

In FreeBSD, "bsdlabel ad6s1" will report the size and offset of the
'd' partition in sectors.  The offset is relative to the start of that
slice - which would normally be absolute block 63 ("fdisk ad6" will
confirm that).
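
For example, with made-up numbers (1G 'a', 1G swap 'b', rest in 'd';
sizes and offsets in 512-byte sectors), the relevant output would look
something like:

# bsdlabel ad6s1
# /dev/ad6s1:
8 partitions:
#          size     offset    fstype
  a:    2097152          0    4.2BSD
  b:    2097152    2097152      swap
  c:  976773105          0    unused
  d:  972578801    4194304    unused

# fdisk ad6
...
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 976773105 (476940 Meg), flag 80 (active)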

Adding the offset of 's1' to the offset of 'd' will give you a sector
offset for your ZFS data.  I haven't tried using OpenSolaris on x86
so I'm not sure if format allows sector offsets (I know format on
Solaris/SPARC insists on cylinder offsets).  Since cylinders are a
fiction anyway, you might be able to kludge a cylinder size to suit
your offset if necessary.  The FreeBSD fdisk(8) man page implies
that slices start at a track boundary and end at a cylinder boundary
but I'm not sure if this is a restriction on LBA disks.
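
With the made-up numbers above, the arithmetic is simply (substitute
your real bsdlabel/fdisk figures):

    63 + 4194304 = 4194367   absolute sector where the ZFS data starts
    972578801                size, in sectors, of the fdisk slice to create there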

Note that if you keep a record of your existing c5d0 format and
restore it later, that will bring back your existing boot and swap
partitions, so you shouldn't need to recreate them.
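
A simple way to keep that record from the FreeBSD side before touching
anything (file names are just placeholders):

# fdisk -p ad6 > ad6.fdisk        slice table, restorable with "fdisk -f"
# bsdlabel ad6s1 > ad6s1.label    BSD label, restorable with "bsdlabel -R"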

-- 
Peter Jeremy


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-03 Thread Diogo Franco
On 05/02/2010 07:33 PM, Peter Jeremy wrote:
> Note that ZFS v14 was imported to FreeBSD 8-stable in mid-January.
> I can't comment whether it would be able to recover your data.
I managed to get a livefs cd that had zfs14, but it was unable to import
the zpool ("internal error: Illegal byte sequence"). The zpool does
appear if I try to run `zpool import` though, as "tank FAULTED corrupted
data", and ad6s1d is ONLINE. There is no -F option on bsd's zpool import.

> This is almost certainly the problem.  ad6s1 may be the same as c5d0p1
> but OpenSolaris isn't going to understand the FreeBSD partition label
> on that slice.  All I can suggest is to (temporarily) change the disk
> slicing so that there is a fdisk slice that matches ad6s1d.
How could I do just that? I know that my label has a 1G UFS, 1G swap,
and the rest is ZFS; but I don't know how to calculate the correct
offset to give to 'format'. I can just regenerate the UFS later after
the ZFS is fixed since it was only used for its /boot.

Also, don't To: me and Cc: the list, I'm subscribed to it :)


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-02 Thread Peter Jeremy
On 2010-May-02 04:06:41 +0800, Diogo Franco  wrote:
>regular data corruption and then the box locked up. I had also
>converted the pool to v14 a few days before, so the freebsd v13 tools
>couldn't do anything to help.

Note that ZFS v14 was imported to FreeBSD 8-stable in mid-January.
I can't comment whether it would be able to recover your data.
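
If you want to check what the booted FreeBSD tools actually support,
something like the following should tell you (both commands exist on
8.x):

# zpool upgrade -v
# sysctl vfs.zfs.version.spa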

On 2010-May-02 05:07:17 +0800, Bill Sommerfeld  wrote:
>  2) the labels are not at the start of what solaris sees as p1, and 
>thus are somewhere else on the disk.  I'd look more closely at how 
>freebsd computes the start of the partition or slice '/dev/ad6s1d'
>that contains the pool.
>
>I think #2 is somewhat more likely.

This is almost certainly the problem.  ad6s1 may be the same as c5d0p1
but OpenSolaris isn't going to understand the FreeBSD partition label
on that slice.  All I can suggest is to (temporarily) change the disk
slicing so that there is a fdisk slice that matches ad6s1d.

-- 
Peter Jeremy


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Diogo Franco
On 05/01/2010 06:07 PM, Bill Sommerfeld wrote:
> there are two reasons why you could get this:
>  1) the labels are gone.
Possible, since I got the metadata errors on `zpool status` before.

>  2) the labels are not at the start of what solaris sees as p1, and thus
> are somewhere else on the disk.  I'd look more closely at how freebsd
> computes the start of the partition or slice '/dev/ad6s1d'
> that contains the pool.
> 
> I think #2 is somewhat more likely.
c5d0p1 is also the only place where zdb finds any labels at all...
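
(I.e. pointing zdb -l at the other device nodes turns up nothing; the
names below are the ones on this box:)

# zdb -l /dev/dsk/c5d0p0
# zdb -l /dev/dsk/c5d0s0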


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Bill Sommerfeld

On 05/01/10 13:06, Diogo Franco wrote:
> After seeing that in some cases labels were corrupted, I tried running
> zdb -l on mine:
> ...
> (labels 0, 1 not there, labels 2, 3 are there).
>
> I'm looking for pointers on how to fix this situation, since the disk
> still has available metadata.

there are two reasons why you could get this:
 1) the labels are gone.

 2) the labels are not at the start of what solaris sees as p1, and
thus are somewhere else on the disk.  I'd look more closely at how
freebsd computes the start of the partition or slice '/dev/ad6s1d'
that contains the pool.

I think #2 is somewhat more likely.
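
That would also explain the zdb output: each vdev has four 256 KB
labels, two at the front and two at the end, and the end pair is
located relative to the end of the device.  A device that starts too
early but ends in the right place will still show labels 2 and 3
while 0 and 1 come up empty.  Roughly, relative to the vdev:

    label 0:  offset 0
    label 1:  offset 256K
    label 2:  512K before the end
    label 3:  256K before the end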

- Bill


[zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Diogo Franco
I had a single spare 500GB HDD and I decided to install a FreeBSD file
server on it for learning purposes, and I moved almost all of my data
to it. Yesterday, naturally after no longer having backups of the
data on the server, I had a controller failure (SiS 180; oh, the
quality) and the HDD was considered unplugged. When I noticed a few
checksum failures in `zpool status` (including two on metadata, shown
as small hex numbers), I tried running `zpool scrub tank`, thinking it
was regular data corruption and then the box locked up. I had also
converted the pool to v14 a few days before, so the freebsd v13 tools
couldn't do anything to help.

Today I downloaded the OpenSolaris 134 snapshot image and booted it to
try and rescue the pool, but:

# zpool status
no pools available

So I couldn't run a clear or an export, or destroy the pool to reimport it with -D.
I tried to run a regular import:

# zpool import
  pool: tank
    id: 6157028625215863355
 state: FAULTED
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        tank        FAULTED  corrupted data
          c5d0p1    UNAVAIL  corrupted data

There was no important data written in the past two days or so, so
using an older uberblock wouldn't be a problem; I tried the new
recovery option:

# mkdir -p /mnt/tank && zpool import -fF -R /mnt/tank tank
cannot import 'tank': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

I tried googling for other people with similar issues, but almost all
of them had RAIDs or other complex configurations and were not really
related to this problem.
After seeing that in some cases labels were corrupted, I tried running
zdb -l on mine:

# zdb -l /dev/dsk/c5d0p1
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 14
    name: 'tank'
    state: 0
    txg: 11420324
    pool_guid: 6157028625215863355
    hostid: 2563111091
    hostname: ''
    top_guid: 1987270273092463401
    guid: 1987270273092463401
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 1987270273092463401
        path: '/dev/ad6s1d'
        whole_disk: 0
        metaslab_array: 23
        metaslab_shift: 32
        ashift: 9
        asize: 497955373056
        is_log: 0
        DTL: 111
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 14
    name: 'tank'
    state: 0
    txg: 11420324
    pool_guid: 6157028625215863355
    hostid: 2563111091
    hostname: ''
    top_guid: 1987270273092463401
    guid: 1987270273092463401
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 1987270273092463401
        path: '/dev/ad6s1d'
        whole_disk: 0
        metaslab_array: 23
        metaslab_shift: 32
        ashift: 9
        asize: 497955373056
        is_log: 0
        DTL: 111

I'm looking for pointers on how to fix this situation, since the disk
still has available metadata.