Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-11 Thread Cindy Swearingen

Hi Paul,

Example 11-1 in this section describes how to replace a
disk on an x4500 system:

http://docs.sun.com/app/docs/doc/819-5461/gbcet?a=view
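
In outline, the replacement looks roughly like this (a sketch only; follow
Example 11-1 in the doc for the exact steps, and substitute your own pool,
port, and device names):

  # release the failed drive's SATA port so the drive can be pulled
  cfgadm -c unconfigure sata1/2::dsk/c1t2d0
  # physically swap the drive, then bring the port back
  cfgadm -c configure sata1/2
  # resilver onto the new disk
  zpool replace export c1t2d0
  # return the hot spare if it doesn't detach on its own after the resilver
  zpool detach export c5t0d0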

Cindy

On 01/09/10 16:17, Paul B. Henson wrote:

On Sat, 9 Jan 2010, Eric Schrock wrote:


If ZFS removed the drive from the pool, why does the system keep
complaining about it?

It's not failing in the sense that it's returning I/O errors, but it's
flaky, so it's attaching and detaching.  Most likely it decided to attach
again and then you got transport errors.


Ok, how do I make it stop logging messages about the drive until it is
replaced? It's still filling up the logs with the same errors about the
drive being offline.

Looks like hdadm isn't it:

r...@cartman ~ # hdadm offline disk c1t2d0
/usr/bin/hdadm[1762]: /dev/rdsk/c1t2d0d0p0: cannot open
/dev/rdsk/c1t2d0d0p0 is not available

Hmm, I was able to unconfigure it with cfgadm:

r...@cartman ~ # cfgadm -c unconfigure sata1/2::dsk/c1t2d0

It went from:

sata1/2::dsk/c1t2d0            disk         connected    configured   failed

to:

sata1/2                        disk         connected    unconfigured failed

Hopefully that will stop the errors until it's replaced and not break
anything else :).


No, it's fine.  DEGRADED just means the pool is not operating at the
ideal state.  By definition a hot spare is always DEGRADED.  As long as
the spare itself is ONLINE it's fine.


The spare shows as INUSE, but I'm guessing that's fine too.


Hope that helps


That was perfect, thank you very much for the review. Now I can not worry
about it until Monday :).


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-11 Thread Eric Schrock

On 01/11/10 17:42, Paul B. Henson wrote:

On Sat, 9 Jan 2010, Eric Schrock wrote:


No, it's fine.  DEGRADED just means the pool is not operating at the
ideal state.  By definition a hot spare is always DEGRADED.  As long as
the spare itself is ONLINE it's fine.


One more question on this: so there's no way to tell, just from the status,
the difference between a pool that is degraded by a disk failure but still
fully redundant thanks to a hot spare, and a pool that is degraded by a disk
failure and has actually lost redundancy? I guess you can review the pool
details for the specifics, but for large pools it seems it would be valuable
to be able to quickly distinguish these two states from the short status.


No, there is no way to tell if a pool has DTL (dirty time log) entries.

- Eric

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-11 Thread Paul B. Henson
On Mon, 11 Jan 2010, Eric Schrock wrote:

 No, there is no way to tell if a pool has DTL (dirty time log) entries.

Hmm, I hadn't heard that term before, but based on a quick search I take it
that's the list of data in the pool that is not fully redundant? So if a
2-way mirror vdev lost a half, everything written after the loss would be
on the DTL, and if the same device came back, recovery would entail just
running through the DTL and writing out what it missed? Although presumably
if the failed device was replaced with another device entirely all of the
data would need to be written out.

I'm not quite sure that answered my question. My original question was, for
example, given a 2-way mirror, one half fails. There is a hot spare
available, which is pulled in, and while the pool isn't optimal, it does
have the same number of devices that it's supposed to. On the other hand,
the same mirror loses a device, there's no hot spare, and the pool is short
one device. My understanding is that in both scenarios the pool status
would be DEGRADED, but it seems there's an important difference. In the
first case, another device could fail, and the pool would still be ok. In
the second, another device failing would result in complete loss of data.

While you can tell the difference between these two states by looking at the
detailed output and seeing whether a hot spare is in use, I was just saying
that it would be nice for the short status to draw some distinction between
"device failed, hot spare in use" and "device failed, keep fingers crossed"
;).
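
(For instance, the one-line health you get from something like

  zpool list -H -o name,health export

reads DEGRADED either way, which is exactly the ambiguity I mean.)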

Back to your answer, if the existence of DTL entries means the pool doesn't
have full redundancy for some data, and you can't tell if a pool has DTL
entries, are you saying there's no way to tell if the current state of your
pool could survive a device failure? If a resilver successfully completes,
barring another device failure, doesn't that mean the pool is restored to
full redundancy? I feel like I must be misunderstanding something :(.

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-11 Thread Eric Schrock

On Jan 11, 2010, at 6:35 PM, Paul B. Henson wrote:

 On Mon, 11 Jan 2010, Eric Schrock wrote:
 
 No, there is no way to tell if a pool has DTL (dirty time log) entries.
 
 Hmm, I hadn't heard that term before, but based on a quick search I take it
 that's the list of data in the pool that is not fully redundant? So if a
 2-way mirror vdev lost a half, everything written after the loss would be
 on the DTL, and if the same device came back, recovery would entail just
 running through the DTL and writing out what it missed? Although presumably
 if the failed device was replaced with another device entirely all of the
 data would need to be written out.
 
 I'm not quite sure that answered my question. My original question was, for
 example, given a 2-way mirror, one half fails. There is a hot spare
 available, which is pulled in, and while the pool isn't optimal, it does
 have the same number of devices that it's supposed to. On the other hand,
 the same mirror loses a device, there's no hot spare, and the pool is short
 one device. My understanding is that in both scenarios the pool status
 would be DEGRADED, but it seems there's an important difference. In the
 first case, another device could fail, and the pool would still be ok. In
 the second, another device failing would result in complete loss of data.
 
 While you can tell the difference between these two states by looking at the
 detailed output and seeing whether a hot spare is in use, I was just saying
 that it would be nice for the short status to draw some distinction between
 "device failed, hot spare in use" and "device failed, keep fingers crossed"
 ;).
 
 Back to your answer, if the existence of DTL entries means the pool doesn't
 have full redundancy for some data, and you can't tell if a pool has DTL
 entries, are you saying there's no way to tell if the current state of your
 pool could survive a device failure? If a resilver successfully completes,
 barring another device failure, doesn't that mean the pool is restored to
 full redundancy? I feel like I must be misunderstanding something :(.

DTLs are a more specific answer to your question.  A DTL entry means that a
top-level vdev has a known window of time for which it, or one of its
children, holds invalid data.  This may be because a device failed and is
accumulating DTL time, because a new replacing or spare vdev was attached, or
because a device was unplugged and then plugged back in.  Your example (hot
spares) is just one of the ways this can happen, but in every case it implies
that some data is not fully replicated.

There is obviously a way to detect this in the kernel; it's simply not
exported to userland in any useful way.  The reason I focused on DTLs is that
if any mechanism were ever provided to distinguish a pool lacking full
redundancy, it would have to be based on DTLs - nothing else makes sense.
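
In the meantime, the closest userland approximation is to scrape the status
output yourself.  Something along these lines (purely textual, not DTL-aware,
and it can't tell whether the spare has finished resilvering):

  # rough check: is a hot spare covering for the failed device?
  if zpool status export | grep -w INUSE >/dev/null; then
      echo "export: degraded, but a hot spare is in use"
  elif [ "`zpool list -H -o health export`" != "ONLINE" ]; then
      echo "export: degraded and no spare in use - redundancy may be reduced"
  fi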

- Eric

 
 Thanks...
 
 
 -- 
 Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
 Operating Systems and Network Analyst  |  hen...@csupomona.edu
 California State Polytechnic University  |  Pomona CA 91768

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Paul B. Henson

We just had our first x4500 disk failure (which of course had to happen
late Friday night sigh), I've opened a ticket on it but don't expect a
response until Monday so was hoping to verify the hot spare took over
correctly and we still have redundancy pending device replacement.

This is an S10U6 box:

SunOS cartman 5.10 Generic_141445-09 i86pc i386 i86pc

Looks like the first errors started yesterday morning:

Jan  8 07:46:02 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:15 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:32 cartman sata: [ID 801593 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1:
Jan  8 07:46:32 cartman SATA device at port 2 - device failed
Jan  8 07:46:32 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  8 07:46:32 cartman Command failed to complete...Device is gone

ZFS failed the drive about 11:15PM:

Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: One or more devices has experienced an unrecoverable error.  An
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: attempt was made to correct the error.  Applications are unaffected.
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] unknown header see
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] warning: pool export health DEGRADED

However, the errors continue still:

Jan  9 03:54:48 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 03:54:48 cartman Command failed to complete...Device is gone
[...]
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 07:56:12 cartman Command failed to complete...Device is gone
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 07:56:12 cartman drive offline

If ZFS removed the drive from the pool, why does the system keep
complaining about it? Is fault management stuff still poking at it?

Here's the zpool status output:

  pool: export
 state: DEGRADED
[...]
 scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31 2010

        NAME          STATE     READ WRITE CKSUM
        export        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED 18.9K     0     0
              c1t2d0  REMOVED      0     0     0
              c5t0d0  ONLINE       0     0 18.9K

        spares
          c5t0d0      INUSE     currently in use

Is the pool/mirror/spare still supposed to show up as degraded after the
hot spare is deployed?

There are 18.9K checksum errors on the disk that failed, but there are also
18.9K read errors on the hot spare?

The scrub started at 11pm last night, the disk got booted at 11:15pm,
presumably the scrub came across the failures the os had been reporting.
The last scrub status shows the scrub completed successfully. What
happened to the resilver status? How can I tell if the resilver was
successful? Did the resilver start and complete while the scrub was still
running and its status output was lost? Is there any way to see the status
of past scrubs/resilvers, or is only the most recent one available?
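
(For what it's worth, the "scrub:" line quoted above is all that

  zpool status -v export

shows now; there's no separate entry for a resilver.)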

Fault management doesn't report any problems:

r...@cartman ~ # fmdump
TIME                 UUID                                 SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty

Shouldn't this show a failed disk?

fmdump -e shows tons of bad stuff:

Jan 08 07:46:32.9467 ereport.fs.zfs.probe_failure
Jan 08 07:46:36.2015 ereport.fs.zfs.io
[...]
Jan 08 07:51:05.1865 ereport.fs.zfs.io

None of that results in a fault diagnosis?

Mostly I'd like to verify my hot spare is working correctly. Given the
spare status is degraded, the read errors on the spare device, and the
lack of successful resilver status output, it seems like the spare might
not have been added successfully.

Thanks for any input you might provide...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Eric Schrock

On Jan 9, 2010, at 9:45 AM, Paul B. Henson wrote:
 
 If ZFS removed the drive from the pool, why does the system keep
 complaining about it?

It's not failing in the sense that it's returning I/O errors, but it's flaky, 
so it's attaching and detaching.  Most likely it decided to attach again and 
then you got transport errors.

 Is fault management stuff still poking at it?

No.

 Is the pool/mirror/spare still supposed to show up as degraded after the
 hot spare is deployed?

Yes.

 There are 18.9K checksum errors on the disk that failed, but there are also
 18.9K read errors on the hot spare?

This is a bug recently fixed in OpenSolaris.

 The last scrub status shows the scrub completed successfully. What
 happened to the resilver status?

If there was a scrub it will show as the last thing completed.

 How can I tell if the resilver was
 successful?

If the scrub was successful.

 Did the resilver start and complete while the scrub was still
 running and its status output was lost?

No, only one can be active at any time.

 Is there any way to see the status
 of past scrubs/resilvers, or is only the most recent one available?

Only the most recent one.
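
(If you kick scrubs off by hand, something like

  zpool history export

should at least record when each scrub was requested, though not its result.)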

 None of that results in a fault diagnosis?

When the device is in the process of going away, no.  From the OS perspective 
this disk was physically removed from the system.
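
If a fault had actually been diagnosed, you would expect it to show up in

  fmadm faulty

as well as in fmdump; transport/removal events on their own don't get that
far.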

 Mostly I'd like to verify my hot spare is working correctly. Given the
 spare status is degraded, the read errors on the spare device, and the
 lack of successful resilver status output, it seems like the spare might
 not have been added successfully.

No, it's fine.  DEGRADED just means the pool is not operating at the ideal 
state.  By definition a hot spare is always DEGRADED.  As long as the spare 
itself is ONLINE it's fine.

Hope that helps,

- Eric

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Ian Collins

Paul B. Henson wrote:

We just had our first x4500 disk failure (which of course had to happen
late Friday night sigh), I've opened a ticket on it but don't expect a
response until Monday so was hoping to verify the hot spare took over
correctly and we still have redundancy pending device replacement.

This is an S10U6 box:

Here's the zpool status output:

  pool: export
 state: DEGRADED
[...]
 scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31 2010

        NAME          STATE     READ WRITE CKSUM
        export        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED 18.9K     0     0
              c1t2d0  REMOVED      0     0     0
              c5t0d0  ONLINE       0     0 18.9K

        spares
          c5t0d0      INUSE     currently in use

Is the pool/mirror/spare still supposed to show up as degraded after the
hot spare is deployed?

  
Yes, the spare will show as degraded until you replace it. I had a pool 
on a 4500 that lost one drive, then swapped out 3 more due to brain 
farts from that naff Marvell driver. It was a bit of a concern for a 
while seeing two degraded devices in one raidz vdev!



The scrub started at 11pm last night, the disk got booted at 11:15pm,
presumably the scrub came across the failures the os had been reporting.
The last scrub status shows the scrub completed successfully. What
happened to the resilver status? How can I tell if the resilver was
successful? Did the resilver start and complete while the scrub was still
running and its status output was lost? Is there any way to see the status
of past scrubs/resilvers, or is only the most recent one available?

  

You only see the last one, but a resilver is a scrub.
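
If you want fresh confirmation once the dust settles, you can always kick off
another one by hand and check the status when it finishes:

  zpool scrub export
  zpool status export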


Mostly I'd like to verify my hot spare is working correctly. Given the
spare status is degraded, the read errors on the spare device, and the
lack of successful resilver status output, it seems like the spare might
not have been added successfully.

  

It has - scrub completed after 0h6m with 0 errors.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Paul B. Henson
On Sat, 9 Jan 2010, Eric Schrock wrote:

  If ZFS removed the drive from the pool, why does the system keep
  complaining about it?

 It's not failing in the sense that it's returning I/O errors, but it's
 flaky, so it's attaching and detaching.  Most likely it decided to attach
 again and then you got transport errors.

Ok, how do I make it stop logging messages about the drive until it is
replaced? It's still filling up the logs with the same errors about the
drive being offline.

Looks like hdadm isn't it:

r...@cartman ~ # hdadm offline disk c1t2d0
/usr/bin/hdadm[1762]: /dev/rdsk/c1t2d0d0p0: cannot open
/dev/rdsk/c1t2d0d0p0 is not available

Hmm, I was able to unconfigure it with cfgadm:

r...@cartman ~ # cfgadm -c unconfigure sata1/2::dsk/c1t2d0

It went from:

sata1/2::dsk/c1t2d0            disk         connected    configured   failed

to:

sata1/2                        disk         connected    unconfigured failed

Hopefully that will stop the errors until it's replaced and not break
anything else :).
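
(The before/after lines above are just the relevant row from something like

  cfgadm -al sata1/2

run before and after the unconfigure.)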

 No, it's fine.  DEGRADED just means the pool is not operating at the
 ideal state.  By definition a hot spare is always DEGRADED.  As long as
 the spare itself is ONLINE it's fine.

The spare shows as INUSE, but I'm guessing that's fine too.

 Hope that helps

That was perfect, thank you very much for the review. Now I can not worry
about it until Monday :).

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss