Re: [zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-24 Thread Cindy Swearingen

Hi Alex,

If the hardware and cables were moved around, then this is probably
the root cause of your problem. See whether you can move the
devices and cabling back to the way they were before the move.

The zpool history output provides the original device name, which
isn't c5t1d0, either:

# zpool create tank c13t0d0

You might grep the zpool history output to find out which disk was
eventually attached, like this:

# zpool history | grep attach

But it's clear from the zdb -l output that the devid for this
particular device changed, which we've seen happen on some hardware. If
the devid persists, ZFS can follow a device by its devid even if its
physical path changes, and it recovers more gracefully.
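
For example, you can pull the devids recorded in the labels and match
the embedded drive serial numbers (WD-WCAWF1939879 and WD-WCAWF1769949
in your zdb output) against the stickers on the physical drives:

# zdb -l /dev/dsk/c5t1d0s0 | grep devid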

If you continue to use this hardware for your storage pool, you should
export the pool before making any kind of hardware change.
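
A safe sequence looks something like this (a sketch, using your pool name):

# zpool export tank
(move or recable the hardware)
# zpool import tank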

Thanks,

Cindy


On 05/21/11 18:05, Alex Dolski wrote:

[quoted text omitted; Alex's original 2011-05-21 post, with the full
command output, appears below in this thread]


Re: [zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-24 Thread Alex Dolski
Sure enough, Cindy, the eSATA cables had been crossed. I exported, powered off, 
reversed the cables, booted, imported, and the pool is currently resilvering 
with both c5t0d0 & c5t1d0 present in the mirror. :) Thank you!!
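
For reference, zpool status shows the resilver's progress and an estimated 
time to completion:

# zpool status tank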

Alex



On May 24, 2011, at 9:58 AM, Cindy Swearingen wrote:

[quoted text omitted; Cindy's reply appears at the top of this thread]

Re: [zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-21 Thread Alex Dolski
Hi Cindy,

Thanks for the advice. This is just a little old Gateway PC provisioned as an 
informal workgroup server. The main storage is two SATA drives in an external 
enclosure, connected to a Sil3132 PCIe eSATA controller. The OS is snv_134b, 
upgraded from snv_111a.

I can't identify a cause in particular. The box has been running for several 
months without much oversight. It's possible that the two eSATA cables got 
reconnected to different ports after a recent move.

The backup has been made and I will try the export & import, per your advice 
(if the zpool command works - it does again at the moment, no reboot!). I will also 
try switching the eSATA cables to opposite ports.

Thanks,
Alex


Command output follows:

# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
   0. c5t1d0 <ATA-WDC WD5000AAKS-0-1D05-465.76GB>
  /pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0
   1. c8d0 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
  /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
   2. c9d0 <DEFAULT cyl 38910 alt 2 hd 255 sec 63>
  /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
   3. c11t0d0 <WD-Ext HDD 1021-2002-931.51GB>
  /pci@0,0/pci107b,5058@1a,7/storage@1/disk@0,0


# zpool history tank
History for 'tank':
2010-06-18.15:14:16 zpool create tank c13t0d0
2011-05-07.02:00:07 zpool scrub tank
2011-05-14.02:00:08 zpool scrub tank
2011-05-21.02:00:12 zpool scrub tank
[a million 'zfs snapshot' and 'zfs destroy' events from zfs-auto-snap omitted]


# zdb -l /dev/dsk/c5t1d0s0

LABEL 0

version: 14
name: 'tank'
state: 0
txg: 3374337
pool_guid: 6242690959503408617
hostid: 8697169
hostname: 'wdssandbox'
top_guid: 17982590661103377266
guid: 1717308203478351258
vdev_children: 1
vdev_tree:
type: 'mirror'
id: 0
guid: 17982590661103377266
whole_disk: 0
metaslab_array: 23
metaslab_shift: 32
ashift: 9
asize: 500094468096
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 1717308203478351258
path: '/dev/dsk/c5t1d0s0'
devid: 'id1,sd@SATA_WDC_WD5000AAKS-0_WD-WCAWF1939879/a'
phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
whole_disk: 1
DTL: 27
children[1]:
type: 'disk'
id: 1
guid: 9267693216478869057
path: '/dev/dsk/c5t1d0s0'
devid: 'id1,sd@SATA_WDC_WD5000AAKS-0_WD-WCAWF1769949/a'
phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
whole_disk: 1
DTL: 893

LABEL 1

version: 14
name: 'tank'
state: 0
txg: 3374337
pool_guid: 6242690959503408617
hostid: 8697169
hostname: 'wdssandbox'
top_guid: 17982590661103377266
guid: 1717308203478351258
vdev_children: 1
vdev_tree:
type: 'mirror'
id: 0
guid: 17982590661103377266
whole_disk: 0
metaslab_array: 23
metaslab_shift: 32
ashift: 9
asize: 500094468096
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 1717308203478351258
path: '/dev/dsk/c5t1d0s0'
devid: 'id1,sd@SATA_WDC_WD5000AAKS-0_WD-WCAWF1939879/a'
phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
whole_disk: 1
DTL: 27
children[1]:
type: 'disk'
id: 1
guid: 9267693216478869057
path: '/dev/dsk/c5t1d0s0'
devid: 'id1,sd@SATA_WDC_WD5000AAKS-0_WD-WCAWF1769949/a'
phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
whole_disk: 1
DTL: 893

LABEL 2

version: 14
name: 'tank'
state: 0
txg: 3374337
pool_guid: 6242690959503408617
hostid: 8697169
hostname: 'wdssandbox'
top_guid: 17982590661103377266
guid: 1717308203478351258
vdev_children: 1
vdev_tree:
type: 'mirror'
id: 0
guid: 17982590661103377266
whole_disk: 0
metaslab_array: 23
metaslab_shift: 32
ashift: 9
asize: 500094468096
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 1717308203478351258
path: '/dev/dsk/c5t1d0s0'
devid: 'id1,sd@SATA_WDC_WD5000AAKS-0_WD-WCAWF1939879/a'
phys_path: '/pci@0,0/pci8086,2845@1c,3/pci1095,3132@0/disk@1,0:a'
whole_disk: 1
DTL: 27
children[1]:
type: 'disk'
id: 1
guid: 9267693216478869057

Re: [zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-20 Thread Cindy Swearingen

Hi Alex,

More scary than interesting to me.

What kind of hardware and which Solaris release?

Do you know what steps led up to this problem? Any recent hardware
changes?

This output should tell you which disks were in this pool originally:

# zpool history tank

If the history identifies tank's actual disks, maybe you can determine
which disk is masquerading as c5t1d0.

If that doesn't work, accessing the individual disk entries in format
should tell you which one is the problem, if it's only one.
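
Another quick cross-check, if iostat is responsive on your system, is the
per-device error counters, which usually point at a misbehaving disk:

# iostat -En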

I would like to see the output of this command:

# zdb -l /dev/dsk/c5t1d0s0

Make sure you have a good backup of your data. If you need to pull a
disk to check cabling, or to rule out controller issues, you should
probably export this pool first.

Others have resolved minor device issues by exporting and importing the
pool, but with the format and zpool commands hanging on your system, I'm not
confident that this operation will work for you.
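
If you do get the pool exported, running zpool import with no arguments is a
harmless way to confirm that the devices are visible and the pool is
importable before actually importing it:

# zpool import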

Thanks,

Cindy

On 05/19/11 12:17, Alex wrote:

[quoted text omitted; the original 2011-05-19 post appears below in this
thread]


[zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-19 Thread Alex
I thought this was interesting - it looks like we have a failing drive in our 
mirror, but the two device nodes in the mirror are the same:

  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 1h9m with 0 errors on Sat May 14 03:09:45 2011
config:

NAMESTATE READ WRITE CKSUM
tankDEGRADED 0 0 0
  mirror-0  DEGRADED 0 0 0
c5t1d0  ONLINE   0 0 0
c5t1d0  FAULTED  0 0 0  corrupted data

c5t1d0 does indeed only appear once in the format list. I wonder how to go 
about correcting this if I can't uniquely identify the failing drive.

format takes forever to spill its guts, and the zpool commands all hang... 
clearly there is a hardware error here, probably causing that, but I'm not sure 
how to identify which disk to pull.
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Same device node appearing twice in same mirror; one faulted, one not...

2011-05-19 Thread Jim Klimov
Just a random thought: if two devices have the same IDs and seem to work in
turns, are you certain you have a mirror and not two paths to the same backend?

A few years back I was given a box to support with sporadically failing
drives, which turned out to be two paths to the same external array;
configuring MPxIO failover properly helped the system detect them as actually
being one device and stop complaining as long as one path worked.
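
A quick way to check, assuming the multipath tools are present on your build,
is to list the multipathed logical units and the MPxIO device-name mappings:

# mpathadm list lu
# stmsboot -L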

On the other hand, you might have done some dd if=disk1 of=disk2 kind of
cloning, which may have puzzled the system...

HTH,
//Jim
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss