Hey guys,

I've managed to end up with a corrupted zpool that I can no longer import
on Solaris 11.1.  Luckily I do have a backup of the most important stuff,
but I'd still like to recover the remainder if possible.

The pool was a 4-disk raidz1 with a separate, unmirrored ZIL (log) device.

From what I can tell, one of the devices caused the SAS HBA (LSI1068E) to
reset, which took the pool offline. After a reboot the pool wouldn't
import, and "zpool status" showed all disks except one as unavailable.
I think this was a side effect of the HBA reset causing Solaris to
re-enumerate the devices. At that point I exported the pool and attempted
to re-import it. I probably shouldn't have, because it has now led to a
condition where two disks have the same GUID.


root@solaris:~# zdb -l /dev/dsk/c7t5d0s0 |egrep "(children|guid)"
    pool_guid: 7650914121155923652
    top_guid: 4244192714700669945
    guid: 2113359054019808692
    vdev_children: 2
        guid: 4244192714700669945
        children[0]:
            guid: 16334042155336894037
        children[1]:
            guid: 2113359054019808692
        children[2]:
            guid: 11196011208380299867
        children[3]:
            guid: 15149956586209127431

root@solaris:~# zdb -l /dev/dsk/c7t6d0s0 |egrep "(children|guid)"
    pool_guid: 7650914121155923652
    top_guid: 4244192714700669945
    guid: 11196011208380299867
    vdev_children: 2
        guid: 4244192714700669945
        children[0]:
            guid: 16334042155336894037
        children[1]:
            guid: 2113359054019808692
        children[2]:
            guid: 11196011208380299867
        children[3]:
            guid: 15149956586209127431

root@solaris:~# zdb -l /dev/dsk/c7t7d0s0 |egrep "(children|guid)"
    pool_guid: 7650914121155923652
    top_guid: 4244192714700669945
    guid: 2113359054019808692
    vdev_children: 2
        guid: 4244192714700669945
        children[0]:
            guid: 16334042155336894037
        children[1]:
            guid: 2113359054019808692
        children[2]:
            guid: 11196011208380299867
        children[3]:
            guid: 15149956586209127431

root@solaris:~# zdb -l /dev/dsk/c7t8d0s0 |egrep "(children|guid)"
    pool_guid: 7650914121155923652
    top_guid: 4244192714700669945
    guid: 15149956586209127431
    vdev_children: 2
        guid: 4244192714700669945
        children[0]:
            guid: 16334042155336894037
        children[1]:
            guid: 2113359054019808692
        children[2]:
            guid: 11196011208380299867
        children[3]:
            guid: 15149956586209127431


From the above, c7t5d0 and c7t7d0 both have a guid of 2113359054019808692.
It looks like one of these should be 16334042155336894037 instead.
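
For anyone who wants to repeat the comparison on more disks, here is a
minimal Python sketch of the check I did by eye. It assumes the filtered
"zdb -l" output keeps the indentation shown above (4 spaces for the disk's
own guid, 12 for the raidz children); parse_label and find_guid_problems
are hypothetical helper names, and the label text is rebuilt from the
values above rather than read from the devices.

```python
import re
from collections import defaultdict

# Filtered label text as shown above; only the disk's own guid differs.
LABEL_TEMPLATE = """\
    pool_guid: 7650914121155923652
    top_guid: 4244192714700669945
    guid: {own}
    vdev_children: 2
        guid: 4244192714700669945
        children[0]:
            guid: 16334042155336894037
        children[1]:
            guid: 2113359054019808692
        children[2]:
            guid: 11196011208380299867
        children[3]:
            guid: 15149956586209127431
"""

# Own guid reported by each device label (copied from the output above).
OWN_GUIDS = {
    "c7t5d0s0": 2113359054019808692,
    "c7t6d0s0": 11196011208380299867,
    "c7t7d0s0": 2113359054019808692,
    "c7t8d0s0": 15149956586209127431,
}
LABELS = {dev: LABEL_TEMPLATE.format(own=g) for dev, g in OWN_GUIDS.items()}

# Matches bare "guid:" lines only, not "pool_guid:" or "top_guid:".
GUID_LINE = re.compile(r"^( *)guid: (\d+)$", re.MULTILINE)


def parse_label(text):
    """Return (own_guid, child_guids), relying on the indentation depth."""
    own, children = None, []
    for indent, value in GUID_LINE.findall(text):
        if len(indent) == 4:       # the disk's own guid
            own = int(value)
        elif len(indent) == 12:    # a child of the raidz vdev
            children.append(int(value))
    return own, children


def find_guid_problems(labels):
    """Return (guids claimed by several disks, child guids no disk claims)."""
    by_guid = defaultdict(list)
    expected = set()
    for device, text in labels.items():
        own, children = parse_label(text)
        by_guid[own].append(device)
        expected.update(children)
    duplicates = {g: sorted(d) for g, d in by_guid.items() if len(d) > 1}
    missing = sorted(expected - set(by_guid))
    return duplicates, missing


duplicates, missing = find_guid_problems(LABELS)
print("duplicated guids:", duplicates)
print("child guids with no matching disk:", missing)
```

On a live system the LABELS dictionary would be filled from the real
"zdb -l" output of each device instead of the template.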

Running "zpool import" gives the following output. I'm attempting the
import on another system, which is why there is a reference to c9t4d0
(guid 16334042155336894037).

root@solaris:~# zpool import
  pool: bucket
    id: 7650914121155923652
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to unavailable devices or data.
   see: http://support.oracle.com/msg/ZFS-8000-EY
config:

        bucket       UNAVAIL  insufficient replicas
          raidz1-0   DEGRADED
            c9t4d0   UNAVAIL  corrupted data
            c7t7d0   ONLINE
            c7t6d0   ONLINE
            c7t8d0   ONLINE

device details:

        c9t4d0     UNAVAIL        corrupted data
        status: ZFS detected errors on this device.
                The device has bad label or disk contents.


Being raidz1, I thought I would be able to import the pool with one device
missing, but no matter what I try it won't import due to
"insufficient replicas".
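
To spell out that expectation, here is a small Python sketch. It assumes a
raidz vdev with parity N stays usable with up to N unavailable children,
and that "vdev_children" in the labels counts top-level vdevs (the raidz1
plus the log device); raidz_state is a hypothetical helper, not a ZFS
function, and the states are copied from the "zpool import" output above.

```python
def raidz_state(parity, child_states):
    """Expected state of a raidz vdev given the states of its children."""
    unavailable = sum(1 for state in child_states if state != "ONLINE")
    if unavailable == 0:
        return "ONLINE"
    if unavailable <= parity:
        return "DEGRADED"   # still readable, redundancy reduced
    return "UNAVAIL"        # more children lost than parity can cover


# Child states of raidz1-0 as reported by "zpool import".
RAIDZ1_CHILDREN = {
    "c9t4d0": "UNAVAIL",
    "c7t7d0": "ONLINE",
    "c7t6d0": "ONLINE",
    "c7t8d0": "ONLINE",
}

# Assumption: vdev_children in the labels is the number of top-level vdevs.
LABEL_TOP_LEVEL_VDEVS = 2    # from "vdev_children: 2"
LISTED_TOP_LEVEL_VDEVS = 1   # only raidz1-0 appears in the import config

print("raidz1-0 expected state:", raidz_state(1, RAIDZ1_CHILDREN.values()))
print("top-level vdevs not listed by import:",
      LABEL_TOP_LEVEL_VDEVS - LISTED_TOP_LEVEL_VDEVS)
```

By this reasoning raidz1-0 on its own should only be degraded, which is
what the import output reports for it. The labels also mention one more
top-level vdev than the import lists, which may or may not be related to
the pool-level "insufficient replicas".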

Is anyone aware of a method to modify the on-disk metadata and change the
device guid?

Any help is greatly appreciated!

Cheers,
Josh
_______________________________________________
msosug mailing list
[email protected]
http://mexico.purplecow.org/m/listinfo/msosug
