Re: [zfs-discuss] Help identify failed drive

2010-07-21 Thread Marty Scholes
 If the format utility is not displaying the WD drives
 correctly,
 then ZFS won't see them correctly either. You need to
 find out why.
 
 I would export this pool and recheck all of your
 device connections.

I didn't see it in the postings, but are the same serial numbers showing up 
multiple times?  Is accidental multipathing taking place here?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help identify failed drive

2010-07-20 Thread Yuri Homchuk
Thanks Haudy, I really appreciate your help.

This is a Supermicro server.
I really don't remember the controller model; I set it up about 3 years ago. I just 
remember that I needed to reflash the controller firmware to make it work in JBOD 
mode.

I ran the script you suggested,
but it looks like it's still unable to map sd11 and sd12 to an actual c*t*d*...
I'm desperate...


cmdk0=/dev/dsk/c3d0
sd0=/dev/dsk/c1t0d0
sd1=/dev/dsk/c1t1d0
sd2=/dev/dsk/c0t0d0
sd3=/dev/dsk/c2t0d0
sd4=/dev/dsk/c2t1d0
sd5=/dev/dsk/c2t2d0
sd6=/dev/dsk/c2t3d0
sd7=/dev/dsk/c2t4d0
sd8=/dev/dsk/c2t5d0
sd9=/dev/dsk/c2t6d0
sd10=/dev/dsk/c2t7d0
sd11=/dev/dsk/sd11
sd12=/dev/dsk/sd12
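
As a quick check of the listing above, here is a minimal Python sketch that separates the instances that resolved to a cXtYdZ (or cXdZ) name from the ones that did not. The function name `split_mapping` and the regular expression are illustrative assumptions, not part of the script that produced the listing.

```python
import re

# Output of the mapping script, copied from the message above.
MAPPING = """\
cmdk0=/dev/dsk/c3d0
sd0=/dev/dsk/c1t0d0
sd1=/dev/dsk/c1t1d0
sd2=/dev/dsk/c0t0d0
sd3=/dev/dsk/c2t0d0
sd4=/dev/dsk/c2t1d0
sd5=/dev/dsk/c2t2d0
sd6=/dev/dsk/c2t3d0
sd7=/dev/dsk/c2t4d0
sd8=/dev/dsk/c2t5d0
sd9=/dev/dsk/c2t6d0
sd10=/dev/dsk/c2t7d0
sd11=/dev/dsk/sd11
sd12=/dev/dsk/sd12
"""

# Assumption: a resolved name looks like cXtYdZ, or cXdZ for cmdk (IDE) disks.
CTD = re.compile(r"^c\d+(t\d+)?d\d+$")


def split_mapping(text):
    """Return (resolved, unresolved) dicts keyed by driver instance name."""
    resolved, unresolved = {}, {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        instance, path = line.strip().split("=", 1)
        name = path.rsplit("/", 1)[-1]  # last path component
        target = resolved if CTD.match(name) else unresolved
        target[instance] = name
    return resolved, unresolved


resolved, unresolved = split_mapping(MAPPING)
print("unresolved instances:", sorted(unresolved))
print("sd6 maps to:", resolved["sd6"])
```

On the listing above this reports sd11 and sd12 as the only unresolved instances, and sd6 as the instance behind c2t3d0.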





From: Haudy Kazemi [mailto:kaze0...@umn.edu]
Sent: Monday, July 19, 2010 4:12 PM
To: zfs-discuss@opensolaris.org
Cc: Yuri Homchuk; Cindy Swearingen
Subject: Re: [zfs-discuss] Help identify failed drive




3.) on some systems I've found another version of the iostat command to be more 
useful, particularly when iostat -En leaves the serial number field empty or 
otherwise doesn't read the serial number correctly.  Try this:





'iostat -Eni' does indeed output the Device ID for some of the drives, but I still 
can't understand how it helps me identify the model of a specific drive.


See below.  Some



# iostat -Eni

c3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Model: GIGABYTE i-RAM  Revision:  Device Id: 
id1,c...@agigabyte_i-ram=33f100336d9cc244b01d

Size: 2.15GB 2146443264 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 0


GIGABYTE i-RAM (2GB RAM based SSD)
Probably serial number: 33F100336D9CC244B01D



c1t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@ast3500320as=9qm34ybz

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 56 Predictive Failure Analysis: 0


c1t0d0 = controller 1 target 0 device 0.  Match the device names in zpool status 
against this.
ST3500320AS=9QM34YBZ
Model: ST3500320AS   (the ST prefix stands for Seagate Technology)
Serial: 9QM34YBZ



c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@ast3500320as=9qm353d2

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 66 Predictive Failure Analysis: 0


c1t1d0 = controller 1 target 1 device 0
ST3500320AS=9QM353D2



c0t0d0   Soft Errors: 0 Hard Errors: 198 Transport Errors: 0

Vendor: SONY Product: CD-ROM CDU5212   Revision: 5YS1 Device Id:

Size: 0.00GB 0 bytes

Media Error: 0 Device Not Ready: 198 No Device: 0 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0


Sony CDU5212 52X CD optical drive



c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9f49ef

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0


c2t0d0 = controller 2 target 0 device 0

You should seriously consider checking, and probably updating, the firmware on 
all your Seagate ST3500320AS 7200.11 drives.  Version SD15 (as reported above) 
is a known bad firmware.
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951



c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id:

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0


controller 2 again


c2t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9be7ab

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0


controller 2 again


c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9fa1ab

Size: 500.11GB 500107862016 bytes

Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0


controller 2 again.  This one is the suspect (c2t3d0) but we don't know the 
serial number.  The controller is telling us it is 'n5000c5000b9fa1ab' which 
must be a unique ID the controller made up for itself.
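
Another reading worth checking: an ID of the form 'n' plus 16 hex digits looks like an NAA-format World Wide Name reported by the drive, not something the controller invented. The sketch below assumes the leading 'n' marks an NAA identifier and that it is NAA type 5 (4-bit NAA, 24-bit IEEE OUI, 36-bit vendor-specific part). The small OUI table is my own assumption and should be verified against the IEEE registry before relying on it.

```python
# Assumption: these OUI-to-vendor pairs are from memory, not from the thread.
# Verify them against the IEEE OUI registry.
KNOWN_OUIS = {
    "000c50": "Seagate",
    "0014ee": "Western Digital",
}


def parse_naa_devid(devid):
    """Split an 'n' + 16-hex-digit ID into (naa, oui, vendor_specific)."""
    if not devid.startswith("n"):
        raise ValueError("expected an 'n'-prefixed device ID")
    hexpart = devid[1:].lower()
    if len(hexpart) != 16:
        raise ValueError("expected 16 hex digits after the prefix")
    int(hexpart, 16)  # raises ValueError if not valid hex
    naa = int(hexpart[0], 16)
    if naa != 5:
        raise ValueError("this sketch only handles NAA type 5")
    oui = hexpart[1:7]            # 24-bit IEEE company ID
    vendor_specific = hexpart[7:]  # remaining 36 bits
    return naa, oui, vendor_specific


naa, oui, rest = parse_naa_devid("n5000c5000b9fa1ab")
print(naa, oui, rest, KNOWN_OUIS.get(oui, "unknown vendor"))
```

If the OUI table is right, the ID reported for c2t3d0 carries a Seagate company ID, which would suggest the suspect drive is one of the Seagates. It still does not give the serial number printed on the label.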


c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0

Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9fde5f

Size: 500.11GB 500107862016 bytes

Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Illegal Request: 0 Predictive Failure Analysis: 0

Re: [zfs-discuss] Help identify failed drive

2010-07-20 Thread Yuri Homchuk


Well, this really is a production server with 300 users and 12 VMs running on it, 
so I definitely won't play with the firmware :)
I can easily identify which drive is which by physically looking at it.
It's just sad to realize that I cannot trust Solaris anymore.
I never noticed this problem before because we were always using Seagate 
drives, so there was no difference to notice.

In my understanding there are three controllers:

C1 - built-in AHCI controller
C2 - built-in controller that I needed to reflash
C3 - PCI card, old SATA 1.5 controller; not in use, just ignore it.

I guess C2 is the one that gives me hassles.

Is there a way to retrieve the model from Solaris?

Thanks.



From: Haudy Kazemi [mailto:kaze0...@umn.edu]
Sent: Monday, July 19, 2010 5:00 PM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org; Cindy Swearingen
Subject: Re: [zfs-discuss] Help identify failed drive



This is a Supermicro server.
I really don't remember the controller model; I set it up about 3 years ago. I just 
remember that I needed to reflash the controller firmware to make it work in JBOD 
mode.
Remember, changing controller firmware may affect your ability to access 
drives.  Back up first, as your array is still working although degraded.  Then 
update the firmware of the controller(s), and the firmware of your Seagate 
7200.11 drives.


Note that the preferred modes, in order of choice, are:
1.) plain AHCI ports connected to the PCI-E bus (includes most built-in ports 
on recent motherboards; older boards may have them on the PCI bus)
2.) RAID ports configured as single drive arrays
3.) JBOD ports configured as single drive JBODs

1 is best.
2 is preferred over 3 because some controllers have lower performance in JBOD 
mode or hide features.
IDE (PATA) ports with a single master drive on them rank at approximately 1.5. 
Putting a second drive on a PATA port is like using a SATA port 
multiplier: your bandwidth gets reduced and performance can suffer.




I ran the script you suggested,
but it looks like it's still unable to map sd11 and sd12 to an actual c*t*d*...
How many different controllers do you have?  You'll need to look all this up to 
sort out the mess.  Your logs show you have at least 3 different controllers 
(c1, c2, and c3) and maybe more for the sd11 and sd12 devices.


Re: [zfs-discuss] Help identify failed drive

2010-07-20 Thread marty scholes
Michael Shadle wrote:

Actually I guess my real question is why iostat hasn't logged any

 errors in its counters even though the device has been bad in there
 for months?

One of my arrays had a drive in slot 4 fault, with lots of "reset something or 
other" errors.  I cleared the errors and the pool and it did it again, even though the 
drive was showing OK in smartmontools and passed its internal self-test.

I replaced the drive with my cold spare and a week later the replacement drive 
in slot 4 had the same errors.

Clearly it was the chassis and not the drive.  I blew out the connector on slot 
4 and it did it again a week later.

Again I cleared the errors, cycled the power on the array, and haven't had the 
problem in the past 5 weeks.

Sometimes things just happen, I guess.



  


Re: [zfs-discuss] Help identify failed drive

2010-07-20 Thread Linda Messerschmidt
 No, the pool tank consists of 7 physical drives(5 of Seagate and 2 of
 Western Digital) See output below

I think you are looking at the disk label name, and this is confusing you.  I had a 
similar thing happen where the label name from a 64GB SSD got written onto a 
1TB HD.

That output in format can say whatever you want, I think, and if you set up 
disks by copying the layout, it will copy that label as well.

Finding out which drive is really which would, I think, depend on what kind of 
disk controller c2 is.


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Cindy Swearingen

Hi--

I don't know what's up with iostat -En but I think I remember a problem
where iostat does not correctly report drives running in legacy IDE mode.

You might use the format utility to identify these devices.

Thanks,

Cindy
On 07/18/10 14:15, Alxen4 wrote:

Here is the situation:

I've got an error on one of the drives in 'zpool status' output:

 zpool status tank

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
tank        ONLINE       0     0     0
  raidz2    ONLINE       0     0     0
    c1t1d0  ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c2t2d0  ONLINE       0     0     0
    c2t3d0  ONLINE       1     0     0
    c2t4d0  ONLINE       0     0     0
    c2t5d0  ONLINE       0     0     0
    c2t7d0  ONLINE       0     0     0

So I would like to replace 'c2t3d0'.
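
For larger pools, picking the device with non-zero counters out of the config section can be scripted. The following is a rough Python sketch, assuming the five-column NAME/STATE/READ/WRITE/CKSUM layout shown in this post and plain integer counters. The function name `devices_with_errors` is hypothetical.

```python
# Config section of 'zpool status tank', laid out in the usual five columns.
CONFIG = """\
NAME        STATE     READ WRITE CKSUM
tank        ONLINE       0     0     0
  raidz2    ONLINE       0     0     0
    c1t1d0  ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c2t2d0  ONLINE       0     0     0
    c2t3d0  ONLINE       1     0     0
    c2t4d0  ONLINE       0     0     0
    c2t5d0  ONLINE       0     0     0
    c2t7d0  ONLINE       0     0     0
"""


def devices_with_errors(config_text):
    """Return (name, state, read, write, cksum) for rows with any non-zero counter."""
    bad = []
    for line in config_text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything not in the five-column shape.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, state = fields[0], fields[1]
        counters = fields[2:]
        # Assumption: counters are plain integers; other rows are ignored.
        if not all(c.isdigit() for c in counters):
            continue
        read, write, cksum = (int(c) for c in counters)
        if read or write or cksum:
            bad.append((name, state, read, write, cksum))
    return bad


print(devices_with_errors(CONFIG))
```

For the pool above this returns a single row, c2t3d0 with one read error.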

I know for a fact the pool has 7 physical drives: 5 Seagate and 2 WD.

I want to know whether 'c2t3d0' is Seagate or WD.

If I run 'iostat -En' it shows that all c*t*d0 drives are Seagate and 
sd11/sd12 are WD.

This totally confuses me...
Why are there two different types of drive names in the iostat output: c*t*d0 and sd*?
How come all c*t*d0 drives appear as Seagate? I know for sure two of them are WD.
Why do the WD drives appear as sd* and not as c*t*d0?

Please help.


--

# iostat -En


c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 54 Predictive Failure Analysis: 0

c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t5d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t6d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t7d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

sd11 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
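
A listing like this is easier to scan once it is reduced to one row per device. Below is a minimal Python sketch that parses a few records in the format shown above and lists the devices with hard or transport errors. The regular expressions are assumptions fitted to this particular output, not a general iostat parser.

```python
import re

# Three records copied from the 'iostat -En' output above.
SAMPLE = """\
c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

sd11 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
"""

HEAD = re.compile(
    r"^(\S+)\s+Soft Errors: (\d+) Hard Errors: (\d+) Transport Errors: (\d+)"
)
PRODUCT = re.compile(r"Product: (.*?)\s+Revision:")


def parse_iostat_en(text):
    """Return one dict per device with its error counters and product string."""
    records = []
    current = None
    for line in text.splitlines():
        head = HEAD.match(line)
        if head:
            current = {
                "device": head.group(1),
                "soft": int(head.group(2)),
                "hard": int(head.group(3)),
                "transport": int(head.group(4)),
                "product": None,
            }
            records.append(current)
            continue
        product = PRODUCT.search(line)
        if product and current is not None:
            current["product"] = product.group(1)
    return records


records = parse_iostat_en(SAMPLE)
suspects = [r["device"] for r in records if r["hard"] or r["transport"]]
print(suspects)
for r in records:
    print(r["device"], r["product"])
```

The same loop also makes the naming split visible: the WD product strings only appear on the sd* records.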





Thanks a lot.


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Cindy Swearingen

Hi--

A google search of ST3500320AS turns up Seagate Barracuda drives.

All 7 drives in the pool tank are ST3500320AS. The other two c1t0d0
and c3d0 are unknown, but are not part of this pool.

You can also use fmdump -eV to see how long c2t3d0 has had problems.

Thanks,

Cindy

On 07/19/10 09:29, Yuri Homchuk wrote:

Thanks Cindy,

But format shows exactly the same thing:
All of them appear as Seagate, no WD at all...
How can that be?

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
   0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
  /p...@0,0/pci15d9,a...@5/d...@0,0
   1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci15d9,a...@5/d...@1,0
   2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
   3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
   4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
   5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
   6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
   7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
   8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB
  /p...@1,0/pci1022,7...@2/pci-...@1/i...@1/c...@0,0
Specify disk (enter its number): ^C


Thanks again.



Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Cindy Swearingen


I think you are saying that even though format shows 9 devices (0-8) on 
this system, there's really only 7 and the pool tank has only 5 (?).


I'm not sure why some devices would show up as duplicates.

Any recent changes to this system?

You might try exporting this pool and making sure that all the device
connections are correct.

cs

On 07/19/10 09:57, Yuri Homchuk wrote:


I know that ST3500320AS is a Seagate Barracuda.

That is exactly why I am confused.
I physically looked at the drives and I confirm again that 5 drives are Seagate and 
2 drives are Western Digital.
But Solaris tells me that all 7 drives are Seagate Barracuda, which is 
definitely not correct.


This is the reason for my original question.
I need to know whether c2t3d0 is Seagate or Western Digital.

Thanks,


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Haudy Kazemi

A few things:

1.) did you move your drives around or change which controller each one 
was connected to sometime after installing and setting up OpenSolaris?  
If so, a pool export and re-import may be in order.


2.) are you sure the drive is failing?  Does the problem only affect 
this drive or are other drives randomly affected too?  If you've run 
'zpool clear' and the problem comes back, something is wrong but it 
could also be RAM, CPU, motherboard, controller, or power supply 
problems.  Smartmontools can read the drive SMART data and device error 
logs...run it from an Ubuntu 10.04 Live CD (sudo apt-get install 
smartmontools) or from a PartedMagic Live CD if you have trouble getting 
Smartmontools working on OpenSolaris with your hardware.


3.) on some systems I've found another version of the iostat command to 
be more useful, particularly when iostat -En leaves the serial number 
field empty or otherwise doesn't read the serial number correctly.  Try 
this:


iostat -Eni

This should give you a list of drives showing their name in the cXtYdZsN 
format, and their Device ID which may contain the drive serial numbers 
concatenated with the model.  Compare that list with your 'zpool status 
tank' output, which in your case means looking for 'c2t3d0'.  Once you 
find the serial number, you can look at labels printed on your drives 
and verify which one it is.
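
The comparison described above can be sketched in Python. This assumes, as a reading of the output in this thread and not a documented rule, that the part of the Device Id after the '@' starts with a one-letter type code and, for some drives, continues as model=serial. The function name `model_and_serial` is hypothetical, and the sample values are the tails that appear elsewhere in this thread.

```python
def model_and_serial(devid_tail):
    """Split the text after '@' in a Device Id into (model, serial).

    Returns None when the ID does not have the model=serial shape,
    for example when it is a bare hex identifier.
    """
    body = devid_tail[1:]  # assumption: first character is a type code
    if "=" not in body:
        return None
    model, serial = body.split("=", 1)
    return model.upper(), serial.upper()


# Device Id tails as they appear in 'iostat -Eni' output in this thread.
TAILS = {
    "c1t0d0": "ast3500320as=9qm34ybz",
    "c1t1d0": "ast3500320as=9qm353d2",
    "c2t3d0": "n5000c5000b9fa1ab",
}

for device, tail in TAILS.items():
    print(device, model_and_serial(tail))
```

Devices that come back as None, such as c2t3d0 here, have no serial number in the Device Id, so the label on the drive has to be matched some other way.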


One tip: if your server case is hard to work in or it is otherwise 
difficult to remove drives to read the serial numbers (lots of screws, 
cables in the way, tight fits, etc.), create additional serial number 
labels for the drives and stick them on the drive in a place you can 
read them without removing the drive from the drive bay.  This will make 
it easier to find a particular drive next time you need to replace or 
upgrade one.  This problem is most significant on hardware/OS combinations 
that don't provide a way to signal where a particular drive is 
physically installed.  (This includes a lot of whitebox and small server 
hardware and OSes.)




iostat relevant man page entries:

http://docs.sun.com/app/docs/doc/816-5166/iostat-1m?l=enn=1a=view 


-E
Display all device error statistics.

-i
In -E output, display the Device ID instead of the Serial No. The Device 
Id is a unique identifier registered by a driver through 
ddi_devid_register(9F).


-n
Display names in descriptive format. For example, cXtYdZ, rmt/N, 
server:/export/path.


By default, disks are identified by instance names such as ssd23 or 
md301. Combining the -n option with the -x option causes disk names to 
display in the cXtYdZsN format which is more easily associated with 
physical hardware characteristics. The cXtYdZsN format is particularly 
useful in FibreChannel (FC) environments where the FC World Wide Name 
appears in the t field.








Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Cindy Swearingen
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c2t7d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd11 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0


-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
Sent: Monday, July 19, 2010 10:28 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive


I think you are saying that even though format shows 9 devices (0-8) on this 
system, there's really only 7 and the pool tank has only 5 (?).

I'm not sure why some devices would show up as duplicates.

Any recent changes to this system?

You might try exporting this pool and make sure that all the device connections 
are correct.

cs

On 07/19/10 09:57, Yuri Homchuk wrote:

I know that  ST3500320AS is  Seagate Barracuda.

That exactly why I am confused.
I looked physically at drives and I confirm again that 5 drives are Seagate and 
2 drives are Western Digital.
But Solaris tells me that all 7 drives are  Seagate Barracuda which is 
definetly not correct.


This is a reason of my original question.
I need to know if  c2t3d0 Seagate or Western Digital.

Thanks,

-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
Sent: Monday, July 19, 2010 9:48 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive

Hi--

A google search of ST3500320AS turns up Seagate Barracuda drives.

All 7 drives in the pool tank are ST3500320AS. The other two c1t0d0 and c3d0 
are unknown, but are not part of this pool.

You can also use fmdump -eV to see how long c2t3d0 has had problems.

Thanks,

Cindy

On 07/19/10 09:29, Yuri Homchuk wrote:

Thanks Cindy,

But format shows exactly same thing:
All of them appear as Seagate, no WD at all...
How could it be ???

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
   0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
  /p...@0,0/pci15d9,a...@5/d...@0,0
   1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci15d9,a...@5/d...@1,0
   2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
   3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
   4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
   5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
   6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
   7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
   8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB
  /p...@1,0/pci1022,7...@2/pci-...@1/i...@1/c...@0,0
Specify disk (enter its number): ^C


Thanks again.


-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
Sent: Monday, July 19, 2010 9:16 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive

Hi--

I don't know what's up with iostat -En but I think I remember a problem where 
iostat does not correctly report drives running in legacy IDE mode.

You might use the format utility to identify these devices.

Thanks,

Cindy
On 07/18/10 14:15, Alxen4 wrote:

This is the situation:

I've got an error on one of the drives in 'zpool status' output:

 zpool status tank

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Yuri Homchuk

Thanks Cindy,

But format shows exactly the same thing:
All of them appear as Seagate, no WD at all...
How can that be?

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
   0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
  /p...@0,0/pci15d9,a...@5/d...@0,0
   1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci15d9,a...@5/d...@1,0
   2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
   3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
   4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
   5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
   6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
   7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
  /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
   8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB
  /p...@1,0/pci1022,7...@2/pci-...@1/i...@1/c...@0,0
Specify disk (enter its number): ^C


Thanks again.


-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com] 
Sent: Monday, July 19, 2010 9:16 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive

Hi--

I don't know what's up with iostat -En but I think I remember a problem where 
iostat does not correctly report drives running in legacy IDE mode.

You might use the format utility to identify these devices.

Thanks,

Cindy
On 07/18/10 14:15, Alxen4 wrote:
 This is the situation:
 
 I've got an error on one of the drives in 'zpool status' output:
 
  zpool status tank
 
   pool: tank
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
 attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
 using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           raidz2    ONLINE       0     0     0
             c1t1d0  ONLINE       0     0     0
             c2t0d0  ONLINE       0     0     0
             c2t2d0  ONLINE       0     0     0
             c2t3d0  ONLINE       1     0     0
             c2t4d0  ONLINE       0     0     0
             c2t5d0  ONLINE       0     0     0
             c2t7d0  ONLINE       0     0     0
 
 So I would like to replace 'c2t3d0'.
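For anyone who would rather not eyeball the counters, here is a minimal sketch (Python, not something posted in this thread) that scans 'zpool status' text and lists the rows whose READ/WRITE/CKSUM columns are not all zero. The function name and the sample text are illustrative assumptions; the sample reuses a few rows from the output above.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for config rows with a nonzero counter."""
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # rows after this header belong to the config table
            continue
        if not in_config or len(fields) < 5:
            continue
        name, counters = fields[0], tuple(fields[2:5])
        # Counters stay as strings because zpool may abbreviate large values
        # (for example "1.2K"); anything other than "0" is flagged.
        if all(c[0].isdigit() for c in counters) and any(c != "0" for c in counters):
            flagged[name] = counters
    return flagged


# A few rows copied from the 'zpool status tank' output in this thread.
SAMPLE = """\
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c2t3d0  ONLINE       1     0     0
            c2t4d0  ONLINE       0     0     0
"""

print(devices_with_errors(SAMPLE))
```

On the sample this reports only c2t3d0, which matches the device singled out above.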
 
 I know for a fact the pool has 7 physical drives: 5 Seagate and 2 WD.
 
 I want to know whether 'c2t3d0' is a Seagate or a WD.
 
 If I run 'iostat -En' it shows that all c*t*d0 drives are Seagate and 
 sd11/sd12 are WD.
 
 This totally confuses me...
 Why are there two different types of drive names in the iostat output: c*t*d0 and 
 sd*?
 How come all c*t*d0 drives appear as Seagate? I know for sure two of them are WD.
 Why do the WD drives appear as sd* and not as c*t*d0?
 
 Please help.
 
 
 --
 
 # iostat -En
 
 
 c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal 
 Request: 54 Predictive Failure Analysis: 0
 
 c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal 
 Request: 0 Predictive Failure Analysis: 0
 
 c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal 
 Request: 0 Predictive Failure Analysis: 0
 
 c2t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal 
 Request: 0 Predictive Failure Analysis: 0
 
 c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0 Illegal 
 Request: 0 Predictive Failure Analysis: 0
 
 c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal 
 Request: 0 Predictive

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Yuri Homchuk


I know that ST3500320AS is a Seagate Barracuda.

That is exactly why I am confused.
I physically looked at the drives and I confirm again that 5 drives are Seagate and 
2 drives are Western Digital.
But Solaris tells me that all 7 drives are Seagate Barracuda, which is 
definitely not correct.


This is the reason for my original question.
I need to know whether c2t3d0 is a Seagate or a Western Digital.

Thanks,

-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com] 
Sent: Monday, July 19, 2010 9:48 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive

Hi--

A google search of ST3500320AS turns up Seagate Barracuda drives.

All 7 drives in the pool tank are ST3500320AS. The other two c1t0d0 and c3d0 
are unknown, but are not part of this pool.

You can also use fmdump -eV to see how long c2t3d0 has had problems.

Thanks,

Cindy

On 07/19/10 09:29, Yuri Homchuk wrote:
 Thanks Cindy,
 
 But format shows exactly the same thing:
 All of them appear as Seagate, no WD at all...
 How can that be?
 
 # format
 Searching for disks...done
 
 
 AVAILABLE DISK SELECTIONS:
0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
   /p...@0,0/pci15d9,a...@5/d...@0,0
1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci15d9,a...@5/d...@1,0
2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB
   /p...@1,0/pci1022,7...@2/pci-...@1/i...@1/c...@0,0
 Specify disk (enter its number): ^C
 
 
 Thanks again.
 
 
 -Original Message-
 From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
 Sent: Monday, July 19, 2010 9:16 AM
 To: Yuri Homchuk
 Cc: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] Help identify failed drive
 
 Hi--
 
 I don't know what's up with iostat -En but I think I remember a problem where 
 iostat does not correctly report drives running in legacy IDE mode.
 
 You might use the format utility to identify these devices.
 
 Thanks,
 
 Cindy
 On 07/18/10 14:15, Alxen4 wrote:
 This is the situation:

 I've got an error on one of the drives in 'zpool status' output:

  zpool status tank

   pool: tank
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
 attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
 using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: none requested
 config:

         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           raidz2    ONLINE       0     0     0
             c1t1d0  ONLINE       0     0     0
             c2t0d0  ONLINE       0     0     0
             c2t2d0  ONLINE       0     0     0
             c2t3d0  ONLINE       1     0     0
             c2t4d0  ONLINE       0     0     0
             c2t5d0  ONLINE       0     0     0
             c2t7d0  ONLINE       0     0     0

 So I would like to replace 'c2t3d0'.

 I know for a fact the pool has 7 physical drives: 5 Seagate and 2 WD.

 I want to know whether 'c2t3d0' is a Seagate or a WD.

 If I run 'iostat -En' it shows that all c*t*d0 drives are Seagate and 
 sd11/sd12 are WD.

 This totally confuses me...
 Why are there two different types of drive names in the iostat output: c*t*d0 and 
 sd*?
 How come all c*t*d0 drives appear as Seagate? I know for sure two of them are WD.
 Why do the WD drives appear as sd* and not as c*t*d0?

 Please help.


 --

 # iostat -En


 c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
 Illegal
 Request: 54 Predictive Failure Analysis: 0

 c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB 500107862016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
 Illegal
 Request: 0 Predictive Failure Analysis: 0

 c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
 Size: 500.11GB

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Yuri Homchuk
  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd11 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0


-Original Message-
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
Sent: Monday, July 19, 2010 10:28 AM
To: Yuri Homchuk
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Help identify failed drive


I think you are saying that even though format shows 9 devices (0-8) on this 
system, there's really only 7 and the pool tank has only 5 (?).

I'm not sure why some devices would show up as duplicates.

Any recent changes to this system?

You might try exporting this pool and making sure that all the device connections 
are correct.

cs

On 07/19/10 09:57, Yuri Homchuk wrote:

 I know that ST3500320AS is a Seagate Barracuda.

 That is exactly why I am confused.
 I physically looked at the drives and I confirm again that 5 drives are Seagate 
 and 2 drives are Western Digital.
 But Solaris tells me that all 7 drives are Seagate Barracuda, which is 
 definitely not correct.


 This is the reason for my original question.
 I need to know whether c2t3d0 is a Seagate or a Western Digital.

 Thanks,

 -Original Message-
 From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
 Sent: Monday, July 19, 2010 9:48 AM
 To: Yuri Homchuk
 Cc: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] Help identify failed drive

 Hi--

 A google search of ST3500320AS turns up Seagate Barracuda drives.

 All 7 drives in the pool tank are ST3500320AS. The other two c1t0d0 and c3d0 
 are unknown, but are not part of this pool.

 You can also use fmdump -eV to see how long c2t3d0 has had problems.

 Thanks,

 Cindy

 On 07/19/10 09:29, Yuri Homchuk wrote:
 Thanks Cindy,

 But format shows exactly the same thing:
 All of them appear as Seagate, no WD at all...
 How can that be?

 # format
 Searching for disks...done


 AVAILABLE DISK SELECTIONS:
0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
   /p...@0,0/pci15d9,a...@5/d...@0,0
1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci15d9,a...@5/d...@1,0
2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB
   /p...@1,0/pci1022,7...@2/pci-...@1/i...@1/c...@0,0
 Specify disk (enter its number): ^C


 Thanks again.


 -Original Message-
 From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
 Sent: Monday, July 19, 2010 9:16 AM
 To: Yuri Homchuk
 Cc: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] Help identify failed drive

 Hi--

 I don't know what's up with iostat -En but I think I remember a problem 
 where iostat does not correctly report drives running in legacy IDE mode.

 You might use the format utility to identify these devices.

 Thanks,

 Cindy
 On 07/18/10 14:15, Alxen4 wrote:
 This is the situation:

 I've got an error on one of the drives in 'zpool status' output:

  zpool status tank

   pool: tank
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
 attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
 using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: none requested
 config:

         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           raidz2    ONLINE       0     0     0
             c1t1d0  ONLINE       0     0     0
             c2t0d0  ONLINE       0     0     0
             c2t2d0  ONLINE       0     0     0
             c2t3d0  ONLINE       1     0     0
             c2t4d0  ONLINE       0     0     0
             c2t5d0  ONLINE       0     0     0
             c2t7d0

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Yuri Homchuk
 Device Id:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Device Id:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0



-Original Message-
From: Haudy Kazemi [mailto:kaze0...@umn.edu]
Sent: Monday, July 19, 2010 11:07 AM
To: zfs-discuss@opensolaris.org
Cc: Cindy Swearingen; Yuri Homchuk
Subject: Re: [zfs-discuss] Help identify failed drive

A few things:

1.) did you move your drives around or change which controller each one was 
connected to sometime after installing and setting up OpenSolaris?
If so, a pool export and re-import may be in order.

2.) are you sure the drive is failing?  Does the problem only affect this drive 
or are other drives randomly affected too?  If you've run 'zpool clear' and the 
problem comes back, something is wrong, but it could also be a RAM, CPU, 
motherboard, controller, or power supply problem.  Smartmontools can read the 
drive SMART data and device error logs.  Run it from an Ubuntu 10.04 Live CD 
(sudo apt-get install smartmontools) or from a PartedMagic Live CD if you have 
trouble getting Smartmontools working on OpenSolaris with your hardware.

3.) on some systems I've found another version of the iostat command to be more 
useful, particularly when iostat -En leaves the serial number field empty or 
otherwise doesn't read the serial number correctly.  Try
this:

iostat -Eni

This should give you a list of drives showing their name in the cXtYdZsN 
format, and their Device ID which may contain the drive serial numbers 
concatenated with the model.  Compare that list with your 'zpool status tank' 
output, which in your case means looking for 'c2t3d0'.  Once you find the 
serial number, you can look at labels printed on your drives and verify which 
one it is.
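The comparison described above can also be done mechanically. The following is only a sketch, assuming the 'iostat -Eni' record layout shown elsewhere in this thread (a header line with the error counters, then a Vendor/Product or Model line carrying the Device Id on the same line). The function name and dictionary keys are made up for illustration, and the sample keeps the Device Id exactly as the list archive displays it, including the part the archive masks.

```python
import re

# Header line of each record: device name followed by the three error counters.
HEADER = re.compile(
    r"^(\S+)\s+Soft Errors: (\d+) Hard Errors: (\d+) Transport Errors: (\d+)"
)
# Disks report "Product:", the i-RAM in this thread reports "Model:".
PRODUCT = re.compile(r"(?:Product|Model): (.*?)\s+Revision:")
DEVID = re.compile(r"Device Id:\s*(\S*)")


def parse_iostat(text):
    """Map device name -> counters, product string and Device Id (may be empty)."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = {
                "soft": int(header.group(2)),
                "hard": int(header.group(3)),
                "transport": int(header.group(4)),
                "product": None,
                "devid": "",
            }
            devices[header.group(1)] = current
            continue
        if current is None:
            continue
        product = PRODUCT.search(line)
        if product:
            current["product"] = product.group(1).strip()
        devid = DEVID.search(line)
        if devid:
            current["devid"] = devid.group(1)
    return devices


# Two records from this thread, unwrapped to one line per field group.
SAMPLE = """\
c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: id1,s...@n5000c5000b9fa1ab
Size: 500.11GB 500107862016 bytes
sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Device Id:
Size: 500.11GB 500107862016 bytes
"""

for name, info in parse_iostat(SAMPLE).items():
    print(name, info["product"], info["hard"], info["devid"] or "(no Device Id)")
```

Feeding it the live output (for example by piping 'iostat -Eni' into a file first) would give one row per device to check against 'zpool status'. Note that a mail client wrapping the Device Id onto its own line would defeat this simple line-based parse.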

One tip: if your server case is hard to work in or it is otherwise difficult to 
remove drives to read the serial numbers (lots of screws, cables in the way, 
tight fits, etc.), create additional serial number labels for the drives and 
stick them on the drive in a place you can read them without removing the drive 
from the drive bay.  This will make it easier to find a particular drive next 
time you need to replace or upgrade one.  This problem is most significant on 
hardware/OS combinations that don't provide a way to signal where a particular 
drive is physically installed.  (This includes a lot of whitebox and small 
server hardware and OSes.)



iostat relevant man page entries:

http://docs.sun.com/app/docs/doc/816-5166/iostat-1m?l=enn=1a=view

-E
Display all device error statistics.

-i
In -E output, display the Device ID instead of the Serial No. The Device Id is 
a unique identifier registered by a driver through ddi_devid_register(9F).

-n
Display names in descriptive format. For example, cXtYdZ, rmt/N, 
server:/export/path.

By default, disks are identified by instance names such as ssd23 or md301. 
Combining the -n option with the -x option causes disk names to display in the 
cXtYdZsN format which is more easily associated with physical hardware 
characteristics. The cXtYdZsN format is particularly useful in FibreChannel 
(FC) environments where the FC World Wide Name appears in the t field.






Cindy Swearingen wrote:
 Hi--

 A google search of ST3500320AS turns up Seagate Barracuda drives.

 All 7 drives in the pool tank are ST3500320AS. The other two c1t0d0
 and c3d0 are unknown, but are not part of this pool.

 You can also use fmdump -eV to see how long c2t3d0 has had problems.

 Thanks,

 Cindy

 On 07/19/10 09:29, Yuri Homchuk wrote:
 Thanks Cindy,

 But format shows exactly the same thing:
 All of them appear as Seagate, no WD at all...
 How can that be?

 # format
 Searching for disks...done


 AVAILABLE DISK SELECTIONS:
0. c1t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
   /p...@0,0/pci15d9,a...@5/d...@0,0
1. c1t1d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci15d9,a...@5/d...@1,0
2. c2t0d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@0,0
3. c2t2d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@2,0
4. c2t3d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@3,0
5. c2t4d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@4,0
6. c2t5d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@5,0
7. c2t7d0 ATA-ST3500320AS-SD15-465.76GB
   /p...@0,0/pci10de,3...@a/pci15d9,a...@0/s...@7,0
8. c3d0 GIGABYTE-100336D9CC244B01-0001-2.00GB

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Haudy Kazemi



3.) on some systems I've found another version of the iostat command to be more 
useful, particularly when iostat -En leaves the serial number field empty or 
otherwise doesn't read the serial number correctly.  Try
this:
  


' iostat -Eni ' indeed outputs the Device ID on some of the drives, but I still 
can't understand how it helps me identify the model of a specific drive.
  

See below.  Some


# iostat -Eni
c3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: GIGABYTE i-RAM  Revision:  Device Id: 
id1,c...@agigabyte_i-ram=33f100336d9cc244b01d
Size: 2.15GB 2146443264 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
  

GIGABYTE i-RAM (2GB RAM based SSD)
Probably serial number: 33F100336D9CC244B01D


c1t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@ast3500320as=9qm34ybz
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 56 Predictive Failure Analysis: 0
  
c1t0d0 = controller 1 target 0 device 0.  Match zpool status devices 
names with this.

ST3500320AS=9QM34YBZ
Model: ST3500320AS   (ST means Seagate Technology)
Serial: 9QM34YBZ
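The model/serial split read off by hand above can be sketched in a few lines of Python. This assumes the Device Id payload has the form <one type character><model>=<serial> after the '@', which is what the examples in this thread look like but is not confirmed as a general rule; the function name is hypothetical. The input strings are copied as the archive shows them, masked portion included.

```python
def split_devid(devid):
    """Return (model, serial) from a 'model=serial' style Device Id, else None."""
    _, _, payload = devid.partition("@")
    if "=" not in payload:
        return None  # e.g. the 'n5000c5...' style ids carry no model=serial pair
    # Drop the leading type character (assumed to be a single letter such as 'a').
    model, serial = payload[1:].split("=", 1)
    return model.upper(), serial.upper()


print(split_devid("id1,s...@ast3500320as=9qm34ybz"))
print(split_devid("id1,c...@agigabyte_i-ram=33f100336d9cc244b01d"))
print(split_devid("id1,s...@n5000c5000b9fa1ab"))
```

The first call yields the ST3500320AS / 9QM34YBZ pair noted above; the last returns None because that id has no '=' in it.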


c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@ast3500320as=9qm353d2
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 66 Predictive Failure Analysis: 0
  

c1t1d0 = controller 1 target 1 device 0
ST3500320AS=9QM353D2


c0t0d0   Soft Errors: 0 Hard Errors: 198 Transport Errors: 0
Vendor: SONY Product: CD-ROM CDU5212   Revision: 5YS1 Device Id:
Size: 0.00GB 0 bytes
Media Error: 0 Device Not Ready: 198 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

Sony CDU5212 52X CD optical drive


c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9f49ef
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

c2t0d0 = controller 2 target 0 device 0

You should seriously consider checking, and probably updating, the 
firmware on all your Seagate ST3500320AS 7200.11 drives.  Version SD15 
(as reported above) is a known bad firmware.

http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951


c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

controller 2 again

c2t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9be7ab
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

controller 2 again

c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9fa1ab
Size: 500.11GB 500107862016 bytes
Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  
controller 2 again.  This one is the suspect (c2t3d0) but we don't know 
the serial number.  The controller is telling us it is 
'n5000c5000b9fa1ab' which must be a unique ID the controller made up for 
itself.
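One other possible reading, offered as an assumption and not confirmed by anyone in this thread: an id of the form 'n' plus 16 hex digits may be a World Wide Name in the NAA "IEEE Registered" layout (first hex digit 5, then a 24-bit IEEE OUI, then 36 vendor-assigned bits) reported by the drive itself. The sketch below only slices the string that way. The one-entry OUI table is quoted from memory of the IEEE registry and should be checked there before relying on it.

```python
# Assumption: OUI 00-0C-50 is registered to Seagate Technology (verify with IEEE).
KNOWN_OUIS = {"000c50": "Seagate Technology"}


def decode_naa(devid):
    """Split an 'n' + 16 hex digit Device Id into NAA type, OUI and vendor bits."""
    payload = devid.partition("@")[2]
    if len(payload) != 17 or payload[0] != "n":
        return None
    wwn = payload[1:].lower()
    try:
        int(wwn, 16)  # reject anything that is not pure hex
    except ValueError:
        return None
    oui = wwn[1:7]
    return {
        "naa": wwn[0],
        "oui": oui,
        "vendor_specific": wwn[7:],
        "oui_owner": KNOWN_OUIS.get(oui, "unknown"),
    }


# The Device Id reported for c2t3d0, as displayed in the archive.
print(decode_naa("id1,s...@n5000c5000b9fa1ab"))
```

If that reading holds, the id would point at the drive vendor without opening the case, but the serial number printed on the label would still have to come from another tool.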

c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9fde5f
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

controller 2 again

c2t5d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 
id1,s...@n5000c5000b9fae42
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

controller 2 again

c2t6d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
  

controller 2 again.  No Device Id reported.  (strange)(not really present?)


c2t7d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Device Id: 

Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Michael Shadle
On Mon, Jul 19, 2010 at 3:11 PM, Haudy Kazemi kaze0...@umn.edu wrote:

 ' iostat -Eni ' indeed outputs the Device ID on some of the drives, but I still
 can't understand how it helps me identify the model of a specific drive.

Curious:

[r...@nas01 ~]# zpool status -x
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 14h2m with 0 errors on Sun Jul 18 18:32:38 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2    ONLINE       0     0     0
...
          raidz2    DEGRADED     0     0     0
...
            c2t5d0  DEGRADED     0     0     0  too many errors
...


c2t5d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST31500341AS Revision: SD1B Device Id:
id1,s...@sata_st31500341as9vs077gt
Size: 1500.30GB 1500301910016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0


Why has it been reported as bad (for probably 2 months now, I haven't
got around to figuring out which disk in the case it is etc.) but the
iostat isn't showing me any errors?

Note: I do a weekly scrub too. Not sure if that matters or helps reset
the device.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Marty Scholes
  ' iostat -Eni ' indeed outputs the Device ID on some of
  the drives, but I still
  can't understand how it helps me identify the model
  of a specific drive.

Get and install smartmontools.  Period.  I resisted it for a few weeks but it 
has been an amazing tool.  It will tell you more than you ever wanted to know 
about any disk drive in the /dev/rdsk/ tree, down to the serial number.
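As a rough sketch of how that serial number could be pulled out in a script: 'smartctl -i <device>' prints identity lines such as "Device Model:" and "Serial Number:" for ATA drives, and the Python below picks those out. The wrapper function and the sample text are illustrative assumptions (the sample reuses the model, serial and firmware that iostat reported for c1t0d0 in this thread), and on some controllers smartctl may need an extra device-type option that is not shown here.

```python
import subprocess


def parse_identity(info_text):
    """Pick the identity fields out of 'smartctl -i' style output."""
    wanted = ("Device Model", "Serial Number", "Firmware Version")
    found = {}
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found[key.strip()] = value.strip()
    return found


def smartctl_identity(device):
    """Run 'smartctl -i' on a device path (needs smartmontools and privileges)."""
    result = subprocess.run(
        ["smartctl", "-i", device], capture_output=True, text=True, check=False
    )
    return parse_identity(result.stdout)


# Illustrative text in the shape smartctl prints; values taken from this thread.
SAMPLE = """\
Device Model:     ST3500320AS
Serial Number:    9QM34YBZ
Firmware Version: SD15
"""

print(parse_identity(SAMPLE))
```

Calling smartctl_identity() on each raw device path and comparing the serials with the drive labels would give the same mapping without reading the output by eye.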

I have seen zfs remember original names in a pool after they have been renamed 
by the OS such that zpool status can list c22t4d0 as a drive in the pool when 
there exists no such drive on the system.

 Why has it been reported as bad (for probably 2
 months now, I haven't
 got around to figuring out which disk in the case it
 is etc.) but the
 iostat isn't showing me any errors?

Start a scrub or do an obscure find, e.g. find /tank_mountpoint -name core 
and watch the drive activity lights.  The drive in the pool which isn't 
blinking like crazy is a faulted/offlined drive.

Ugly and oh-so-hackerish, but it works.


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Michael Shadle
On Mon, Jul 19, 2010 at 4:16 PM, Marty Scholes martyscho...@yahoo.com wrote:

 Start a scrub or do an obscure find, e.g. find /tank_mountpoint -name core 
 and watch the drive activity lights.  The drive in the pool which isn't 
 blinking like crazy is a faulted/offlined drive.

 Ugly and oh-so-hackerish, but it works.

That was my idea, except I would need to figure out something to make specific
drives write one at a time. Although if the drive has been offlined or
whatever, then it shouldn't receive any requests, so that sounds even
easier. :)
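A sketch of the "one drive at a time" idea, using reads instead of writes so nothing on the disk is modified. This is an assumption-laden illustration and not something tested on the hardware in this thread: the function name is made up, the device path in the comment is only an example, and reads that are served from cache (as can happen with ordinary files) will not light the activity LED.

```python
import time


def exercise_device(path, seconds=10.0, block_size=1024 * 1024):
    """Read sequentially from path for roughly `seconds`; return bytes read."""
    total = 0
    deadline = time.monotonic() + seconds
    with open(path, "rb", buffering=0) as dev:
        while time.monotonic() < deadline:
            chunk = dev.read(block_size)
            if not chunk:
                dev.seek(0)  # reached the end, start over from the beginning
                continue
            total += len(chunk)
    return total


# Example (run as root, one device at a time, and watch which light blinks):
#   exercise_device("/dev/rdsk/c2t3d0s0", seconds=15)
```

The block size is a multiple of 512 bytes because raw disk devices generally expect sector-sized reads.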


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Michael Shadle
On Mon, Jul 19, 2010 at 4:16 PM, Marty Scholes martyscho...@yahoo.com wrote:

 Start a scrub or do an obscure find, e.g. find /tank_mountpoint -name core 
 and watch the drive activity lights.  The drive in the pool which isn't 
 blinking like crazy is a faulted/offlined drive.

Actually I guess my real question is why iostat hasn't logged any
errors in its counters even though the device has been bad in there
for months?


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Richard Elling
On Jul 19, 2010, at 4:21 PM, Michael Shadle wrote:
 On Mon, Jul 19, 2010 at 4:16 PM, Marty Scholes martyscho...@yahoo.com wrote:
 
 Start a scrub or do an obscure find, e.g. find /tank_mountpoint -name core 
 and watch the drive activity lights.  The drive in the pool which isn't 
 blinking like crazy is a faulted/offlined drive.
 
 Actually I guess my real question is why iostat hasn't logged any
 errors in its counters even though the device has been bad in there
 for months?

Aren't you assuming the I/O error comes from the drive?
fmdump -eV 
 -- richard

-- 
Richard Elling
rich...@nexenta.com   +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com





Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Michael Shadle
On Mon, Jul 19, 2010 at 4:26 PM, Richard Elling rich...@nexenta.com wrote:

 Aren't you assuming the I/O error comes from the drive?
 fmdump -eV

Okay, I guess I am. Is this just telling me "hey stupid, a checksum
failed"? In which case, why did this never resolve itself, and why did the
specific device get marked as degraded?

Apr 04 2010 21:52:38.920978339 ereport.fs.zfs.checksum
nvlist version: 0
class = ereport.fs.zfs.checksum
ena = 0x64350d4040300c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0xfd80ebd352cc9271
vdev = 0x29282dc6fa073a2
(end detector)

pool = tank
pool_guid = 0xfd80ebd352cc9271
pool_context = 0
pool_failmode = wait
vdev_guid = 0x29282dc6fa073a2
vdev_type = disk
vdev_path = /dev/dsk/c2t5d0s0
vdev_devid = id1,s...@sata_st31500341as9vs077gt/a
parent_guid = 0xc2d5959dd2c07bf7
parent_type = raidz
zio_err = 0
zio_offset = 0x40abbf2600
zio_size = 0x200
zio_objset = 0x10
zio_object = 0x1c06000
zio_level = 2
zio_blkid = 0x0
__ttl = 0x1
__tod = 0x4bb96c96 0x36e503a3
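For reading such reports in bulk, here is a minimal Python sketch that turns one 'fmdump -eV' record into a dictionary and decodes the __tod pair. It assumes the "name = value" layout shown above and that __tod holds seconds since the Unix epoch followed by nanoseconds, both in hex; the function names are illustrative. Where a name repeats (for example 'pool' inside the detector and again later), the later value wins.

```python
from datetime import datetime, timezone


def parse_ereport(text):
    """Collect 'name = value' lines from one ereport; later duplicates win."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def decode_tod(tod):
    """Decode '__tod' as (UTC datetime from hex seconds, nanoseconds)."""
    sec_hex, nsec_hex = tod.split()
    seconds = int(sec_hex, 16)
    nanos = int(nsec_hex, 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanos


# A few lines copied from the ereport above.
SAMPLE = """\
class = ereport.fs.zfs.checksum
pool = 0xfd80ebd352cc9271
pool = tank
vdev_path = /dev/dsk/c2t5d0s0
zio_offset = 0x40abbf2600
__tod = 0x4bb96c96 0x36e503a3
"""

report = parse_ereport(SAMPLE)
when, nanos = decode_tod(report["__tod"])
print(report["class"], report["vdev_path"], when.isoformat(), nanos)
```

The nanosecond part decodes to 920978339, which lines up with the fractional seconds in the "21:52:38.920978339" header line; the seconds part decodes to 04:52:38 UTC on Apr 05 2010, consistent with that header if the local clock was seven hours behind UTC.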


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Haudy Kazemi

Marty Scholes wrote:

' iostat -Eni ' indeed outputs the Device ID on some of
the drives, but I still
can't understand how it helps me identify the model
of a specific drive.
  


Get and install smartmontools.  Period.  I resisted it for a few weeks but it 
has been an amazing tool.  It will tell you more than you ever wanted to know 
about any disk drive in the /dev/rdsk/ tree, down to the serial number.

I have seen zfs remember original names in a pool after they have been renamed by the OS 
such that zpool status can list c22t4d0 as a drive in the pool when there 
exists no such drive on the system.
  
Run smartmontools on a Linux LiveCD if necessary.  For a while (at least 
when OpenSolaris 2009.06 was released) smartmontools could not get drive 
information on drives on certain controllers.



Why has it been reported as bad (for probably 2
months now, I haven't
got around to figuring out which disk in the case it
is etc.) but the
iostat isn't showing me any errors?



Start a scrub or do an obscure find, e.g. find /tank_mountpoint -name core 
and watch the drive activity lights.  The drive in the pool which isn't blinking like 
crazy is a faulted/offlined drive.

Ugly and oh-so-hackerish, but it works.
  
You might also be able to figure it out from drive vibration or a lack 
thereof.  Many people rolling their own server hardware don't have 
per-drive activity lights, hence the recommendation to figure out how to 
identify drives in software via their serial numbers and then match up 
with the labels.


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Richard Elling
On Jul 19, 2010, at 4:30 PM, Michael Shadle wrote:

 On Mon, Jul 19, 2010 at 4:26 PM, Richard Elling rich...@nexenta.com wrote:
 
 Aren't you assuming the I/O error comes from the drive?
 fmdump -eV
 
 okay - I guess I am. Is this just telling me hey stupid, a checksum
 failed ? In which case why did this never resolve itself and the
 specific device get marked as degraded?

It depends on whether the problem was fixed or not.  What does this say?
zpool status -xv

 -- richard

 
 



Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Michael Shadle
On Mon, Jul 19, 2010 at 4:35 PM, Richard Elling rich...@nexenta.com wrote:

 It depends on whether the problem was fixed or not.  What does this say?
        zpool status -xv

  -- richard

[r...@nas01 ~]# zpool status -xv
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 14h2m with 0 errors on Sun Jul 18 18:32:38 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2    ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t6d0  ONLINE       0     0     0
            c0t7d0  ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0
          raidz2    DEGRADED     0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  DEGRADED     0     0     0  too many errors
            c2t6d0  ONLINE       0     0     0
            c2t7d0  ONLINE       0     0     0

It was never fixed. I thought I needed to replace the drive. Should I
mark it as resolved (or whatever the syntax is) and re-run a scrub?
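
With this many devices it can help to filter the config table mechanically. The following is only a sketch: it assumes the whitespace-separated NAME/STATE/READ/WRITE/CKSUM layout that zpool status prints, and the helper name flagged_devices and the shortened sample are made up for illustration. Counters are compared as strings so that abbreviated values are not silently dropped.

```python
def flagged_devices(status_text):
    """Return config rows that are not ONLINE or have a nonzero error counter."""
    flagged = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # rows after this header are config rows
            continue
        if not in_config or not fields:
            continue
        if fields[0] == "errors:":    # trailing summary line ends the table
            break
        if len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        counters = tuple(fields[2:5])  # READ, WRITE, CKSUM as printed
        if state != "ONLINE" or counters != ("0", "0", "0"):
            flagged.append((name, state) + counters)
    return flagged

# Shortened, illustrative sample in the same layout as the output above.
SAMPLE = """\
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz2    DEGRADED     0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  DEGRADED     0     0     0  too many errors
            c2t6d0  ONLINE       0     0     0

errors: No known data errors
"""
for row in flagged_devices(SAMPLE):
    print(row)
```

The pool and raidz2 rows show up as well because they inherit the DEGRADED state from c2t5d0.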


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Haudy Kazemi

Yuri Homchuk wrote:

Well, this really is a 300-user production server with 12 VMs 
running on it, so I definitely won't play with the firmware :)

I can easily identify which drive is which by physically looking at it.

It's just sad to realize that I cannot trust Solaris anymore.

I never noticed this problem before because we were always using 
Seagate drives, so I didn't notice any difference.

In my understanding there are three controllers:

C1 -- built-in AHCI controller

C2 -- built-in controller that I needed to reflash

C3 -- old SATA 1.5 PCI card controller; not in use, just ignore it.


Which drives are physically connected to which controller?


I guess C2 is the one that gives me hassles.

Is there a way to retrieve the model from Solaris?


dmidecode might do it.  I don't know exactly what syntax it needs though.


Re: [zfs-discuss] Help identify failed drive

2010-07-19 Thread Richard Elling
more below...

On Jul 19, 2010, at 4:42 PM, Michael Shadle wrote:

 On Mon, Jul 19, 2010 at 4:35 PM, Richard Elling rich...@nexenta.com wrote:
 
 It depends on whether the problem was fixed or not.  What does this say?
zpool status -xv
 
  -- richard
 
 [r...@nas01 ~]# zpool status -xv
   pool: tank
  state: DEGRADED
 status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: scrub completed after 14h2m with 0 errors on Sun Jul 18 18:32:38 2010
 config:
 
         NAME        STATE     READ WRITE CKSUM
         tank        DEGRADED     0     0     0
           raidz2    ONLINE       0     0     0
             c0t3d0  ONLINE       0     0     0
             c0t2d0  ONLINE       0     0     0
             c0t4d0  ONLINE       0     0     0
             c0t1d0  ONLINE       0     0     0
             c0t6d0  ONLINE       0     0     0
             c0t7d0  ONLINE       0     0     0
             c0t0d0  ONLINE       0     0     0
             c0t5d0  ONLINE       0     0     0
           raidz2    DEGRADED     0     0     0
             c2t0d0  ONLINE       0     0     0
             c2t1d0  ONLINE       0     0     0
             c2t2d0  ONLINE       0     0     0
             c2t3d0  ONLINE       0     0     0
             c2t4d0  ONLINE       0     0     0
             c2t5d0  DEGRADED     0     0     0  too many errors
             c2t6d0  ONLINE       0     0     0
             c2t7d0  ONLINE       0     0     0
 
 It was never fixed. I thought I needed to replace the drive. Should I
 mark it as resolved (or whatever the syntax is) and re-run a scrub?


I'm really looking for the last lines of the output which should say
something like:
errors: No known data errors

zpool clear  will zero the counters and change the state from DEGRADED.
Try that, then re-run a scrub to verify no more errors were found.  
 -- richard
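
The clear-then-scrub sequence described above could be scripted roughly like this. It is a sketch only: the wrapper functions are made up for illustration, the pool and device names are taken from the output quoted in this thread, and dry_run defaults to True so nothing is executed unless explicitly requested.

```python
import subprocess

def clear_and_scrub_commands(pool, device=None):
    """Build the commands: clear error counters, start a scrub, show status."""
    clear = ["zpool", "clear", pool]
    if device:
        clear.append(device)   # limit the clear to one device in the pool
    return [clear, ["zpool", "scrub", pool], ["zpool", "status", "-xv"]]

def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)

run(clear_and_scrub_commands("tank", "c2t5d0"))
```

Note that the scrub runs in the background, so the status check right after it will show the scrub in progress; the error summary is only meaningful once the scrub has completed.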

-- 
Richard Elling
rich...@nexenta.com   +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com





[zfs-discuss] Help identify failed drive

2010-07-18 Thread Alxen4
Here is the situation:

I've got an error on one of the drives in 'zpool status' output:

 zpool status tank

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       1     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  ONLINE       0     0     0
            c2t7d0  ONLINE       0     0     0

So I would like to replace 'c2t3d0'.

I know for a fact the pool has 7 physical drives: 5 Seagate and 2 WD.

I want to know whether 'c2t3d0' is a Seagate or a WD.

If I run 'iostat -En', it shows that all c*t*d0 drives are Seagate and 
sd11/sd12 are WD.

This totally confuses me...
Why are there two different types of device names in the iostat output: c*t*d0 
and sd*?
How come all c*t*d0 devices appear as Seagate? I know for sure two of them are WD.
Why do the WD drives appear as sd* and not as c*t*d0?
Please help.


--

# iostat -En


c1t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 54 Predictive Failure Analysis: 0

c2t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 7 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t4d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t5d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t6d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c2t7d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

sd11     Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

sd12 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 0K05 Serial No:
Size: 500.11GB 500107862016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
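
To compare the records above at a glance, the listing can be reduced to one line per device. This is a sketch that assumes the record layout shown in this post (a header line with the three error counters, followed by a Vendor/Product line); the function name parse_iostat_en is made up, and the sample is two records copied from the listing.

```python
import re

HEADER = re.compile(
    r"^(\S+)\s+Soft Errors: (\d+) Hard Errors: (\d+) Transport Errors: (\d+)")
PRODUCT = re.compile(r"Product: (.*?)\s+Revision:")

def parse_iostat_en(text):
    """Map device name -> error counters and product string."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = {"soft": int(header.group(2)),
                       "hard": int(header.group(3)),
                       "transport": int(header.group(4)),
                       "product": None}
            devices[header.group(1)] = current
            continue
        product = PRODUCT.search(line)
        if product and current is not None:
            current["product"] = product.group(1).strip()
    return devices

SAMPLE = """\
c2t3d0   Soft Errors: 0 Hard Errors: 9 Transport Errors: 9
Vendor: ATA  Product: ST3500320AS  Revision: SD15 Serial No:
Size: 500.11GB 500107862016 bytes

sd11     Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: WDC WD5001AALS-0 Revision: 1D05 Serial No:
Size: 500.11GB 500107862016 bytes
"""
for name, rec in parse_iostat_en(SAMPLE).items():
    print(name, rec["product"], rec["hard"], rec["transport"])
```

This only summarizes what iostat reports; it does not explain why the WD drives are listed under sd* names instead of c*t*d0.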





Thanks a lot.