Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-06 Thread Jens Elkner
On Fri, Apr 03, 2009 at 10:41:40AM -0700, Joe S wrote:
 Today, I noticed this:
...
 According to http://www.sun.com/msg/ZFS-8000-9P:
 
 The Message ID: ZFS-8000-9P indicates a device has exceeded the
 acceptable limit of errors allowed by the system. See document 203768
 for additional information.
...
I had the same thing on a Thumper with S10u6 one or two months ago. Since
the logs showed no disk errors or warnings for the last 6 months, I just
cleared the pool, then scrubbed it, and returned the temporary hot spare
that had been in use to the hot spare pool. No errors or warnings since
then for that disk, so it was obviously a false/brain-damaged alarm ...

regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Joe S
Today, I noticed this:

[...@coruscant$] zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Sat Apr  4 08:31:49 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     4  36K resilvered
            c2t5d0  ONLINE       0     0     0

errors: No known data errors
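
The per-device counters in a status listing like this one can be checked mechanically. The following is only a minimal sketch, assuming the usual NAME/STATE/READ/WRITE/CKSUM column layout; the function name flag_errors is made up for this example and is not part of ZFS.

```shell
#!/bin/sh
# flag_errors (hypothetical helper): read `zpool status` text on stdin and
# print each device line whose READ, WRITE or CKSUM counter is not "0".
flag_errors() {
    awk '
        # The column header marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # The "errors:" summary line marks the end of it.
        $1 == "errors:" { in_table = 0 }
        # Compare as strings so abbreviated counters are flagged as well.
        in_table && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

On a live system it would be fed from the real command, for example `zpool status tank | flag_errors`, where tank is the pool name from the listing.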


I think this means a disk is failing and that ZFS did a good job of
keeping everything sane.

According to http://www.sun.com/msg/ZFS-8000-9P:

The Message ID: ZFS-8000-9P indicates a device has exceeded the
acceptable limit of errors allowed by the system. See document 203768
for additional information.

Unfortunately, I'm not *authorized* to see that document.


Question: I'm assuming the disk is dying. How can I get more
information from the OS to confirm?

Rant: Sun, you suck for telling me to read a document for additional
information, and then denying me access.


Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Joe S
On Fri, Apr 3, 2009 at 10:41 AM, Joe S js.li...@gmail.com wrote:
 Today, I noticed this:

 ...


Running Nevada 105.

Incidentally, I tried upgrading to Nevada 110, but the OS wouldn't
finish booting. It stopped at the part where it was trying to mount my
ZFS filesystems. I booted back into 105 and it boots, but then as I
ran a zpool status, I noticed that message.


Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Tim
On Fri, Apr 3, 2009 at 12:41 PM, Joe S js.li...@gmail.com wrote:

 Today, I noticed this:

 ...

 Rant: Sun, you suck for telling me to read a document for additional
 information, and then denying me access.



On that front... I'm wondering if we could get a project going to mirror all
of those pages to a site not hosted by Sun.  Just in case that IBM thing
really does come to fruition :)

--Tim


Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Bob Friesenhahn

On Fri, 3 Apr 2009, Joe S wrote:


I think this means a disk is failing and that ZFS did a good job of
keeping everything sane.


The disk is not necessarily failing (i.e. headed for the dumpster).
Notice that only 36K had to be resilvered.  If this is a SATA disk,
then write down a note that this occurred, clear the errors, and then
wait for additional errors to crop up on the same disk.  If this is an
enterprise SCSI/SAS/FC disk then the situation could indicate
something more serious since media failures are much less common.


By all means, do a 'zpool scrub' on the pool to make sure that there is
no other data waiting to fail.
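
The sequence described above (note the error, clear it, scrub, then watch the same disk) might look roughly like this. It is a sketch only: the pool and device names are taken from the status output in this thread, and the DRY_RUN switch and the function names are made up for the example, so by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Pool and device default to the names from the thread's status output.
POOL=${POOL:-tank}
DEV=${DEV:-c2t3d0}
# DRY_RUN is a hypothetical switch for this sketch: 1 prints, 0 executes.
DRY_RUN=${DRY_RUN:-1}

# Either print the command line or run it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

clear_and_scrub() {
    run zpool status -v "$POOL"     # record the current counters first
    run zpool clear "$POOL" "$DEV"  # reset the error counters on that device
    run zpool scrub "$POOL"         # start a scrub that verifies the whole pool
    run zpool status -v "$POOL"     # shows scrub progress; repeat when it is done
}

clear_and_scrub
```

Setting DRY_RUN=0 runs the commands for real. If new errors show up on the same device after the scrub, that points toward replacing it rather than clearing it again.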



Rant: Sun, you suck for telling me to read a document for additional
information, and then denying me access.


Yes, this sucks.  Presumably if you paid for a support contract you 
could see the additional information.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Tim
On Fri, Apr 3, 2009 at 12:51 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Fri, 3 Apr 2009, Joe S wrote:


 I think this means a disk is failing and that ZFS did a good job of
 keeping everything sane.


 The disk is not necessarily failing (i.e. headed for the dumpster).
 Notice that only 36K had to be resilvered.  If this is a SATA disk, then
 write down a note that this occurred, clear the errors, and then wait for
 additional errors to crop up on the same disk.  If this is an enterprise
 SCSI/SAS/FC disk then the situation could indicate something more serious
 since media failures are much less common.

 By all means, do a 'zpool scrub' on the pool to make sure that there is no
 other data waiting to fail.

  Rant: Sun, you suck for telling me to read a document for additional
 information, and then denying me access.


 Yes, this sucks.  Presumably if you paid for a support contract you could
 see the additional information.


 Bob



I have a support contract and cannot see it.  I'm assuming it's internal
only.

--Tim


Re: [zfs-discuss] Error ZFS-8000-9P

2009-04-03 Thread Geoff Shipman
Joe,

I just checked the referenced document. It walks through an example of
replacing a failed/faulted device.

I found in the ZFS Administration guide the URL below on repairing a
device in a zpool.

http://docs.sun.com/app/docs/doc/819-5461/gbbvf?l=en&a=view

The above URL was linked from the Chapter 11 portion of the ZFS
Administration guide on troubleshooting problems.

http://docs.sun.com/app/docs/doc/819-5461/gavwg?l=en&a=view

The link was in the paragraph below.

Physically Reattaching the Device
Exactly how a missing device is reattached depends on the device in
question. If the device is a network-attached drive, connectivity should
be restored. If the device is a USB or other removable media, it should
be reattached to the system. If the device is a local disk, a controller
might have failed such that the device is no longer visible to the
system. In this case, the controller should be replaced at which point
the disks will again be available. Other pathologies can exist and
depend on the type of hardware and its configuration. If a drive fails
and it is no longer visible to the system (an unlikely event), the
device should be treated as a damaged device. Follow the procedures
outlined in Repairing a Damaged Device.

I do agree that when we (Sun) point people to additional steps, any
externally available documents should be referenced before an
internal-only link.

Geoff


-- 
Geoff Shipman - (303) 272-9955
Systems Technology Service Center - Operating System
Solaris and Network Technology Domain
Americas Systems Technology Service Center
