Re: [OmniOS-discuss] disk failure causing reboot?

Dominik Hassler Mon, 18 May 2015 14:12:12 -0700

Jeff,

I have those WD40001FYYG drives in my home server, but just as a simple mirror. AFAIK those drives are equivalent to the SATA WD Re 4TB drives, but with a SAS controller instead of a SATA controller on top, and just a little more expensive than their SATA equivalents...

I have no hard facts, but I assume that these SAS drives (they call them "nearline SAS") are not 100% like "real" SAS drives. E.g. they don't run automated background scans; that's what I observed. To what extent they differ from "real" SAS drives, I don't know.


On 05/18/2015 09:01 PM, Jeff Stockett wrote:

Hi Dan,

The pool is made up of 36 disks - 6 x 6 raidz2 vdevs with some SSDs for l2arc 
and slog.  I already replaced the drive and the rebuild is nearly done, but I 
was mostly curious why a disk failure would cause a reboot?  I get that it was 
apparently hanging the pool up, and that according to some posts I read the 
developers seem to think it is better to panic/dump/reboot than leave it hung 
until someone notices, but wouldn't it really be better just to drop the failed 
drive out of the array? Is it because the system in question is using a SAS 
expander or is this only expected behavior sometimes depending on how the drive 
fails?  I guess I might expect this with consumer grade SATA drives, but wasn't 
expecting it with $$$ enterprise SAS drives.

Thanks,  Jeff
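
For reference, the layout described above can be worked through with a little arithmetic. This is only a rough sketch: the function name and fields are made up for illustration, the 4 TB drive size is taken from elsewhere in this thread, and the capacity figure ignores ZFS metadata, padding and reserved space, so real usable space will be lower.

```python
def raidz_layout(total_disks, disks_per_vdev, parity, disk_tb):
    """Rough summary of a pool built from equally sized raidz vdevs.

    Hypothetical helper for illustration only; it ignores all
    filesystem overhead.
    """
    vdevs = total_disks // disks_per_vdev
    data_disks = vdevs * (disks_per_vdev - parity)
    return {
        "vdevs": vdevs,
        "data_disks": data_disks,
        "approx_usable_tb": data_disks * disk_tb,
        # raidz2 keeps two parity blocks per stripe, so each vdev
        # can lose up to `parity` disks without losing data.
        "failures_tolerated_per_vdev": parity,
    }


if __name__ == "__main__":
    # 36 disks as 6 x 6 raidz2 vdevs, 4 TB per disk (assumed)
    print(raidz_layout(36, 6, 2, 4))
```

So with one failed disk, the affected vdev should still have one disk of redundancy left while the replacement resilvers.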

-----Original Message-----
From: Dan McDonald [mailto:[email protected]]
Sent: Monday, May 18, 2015 11:33 AM
To: Jeff Stockett
Cc: omnios-discuss
Subject: Re: [OmniOS-discuss] disk failure causing reboot?

On May 18, 2015, at 2:25 PM, Jeff Stockett <[email protected]> wrote:

A drive failed in one of our Supermicro 5048R-E1CR36L servers running OmniOS 
r151012 last night, and somewhat unexpectedly, the whole system seems to have 
panicked.


The panic was done for protection of your pool:

May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] I/O to pool 'dpool' 
appears to be hung.
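
If you want to spot this message before or after such a panic, a small log scan is enough. The following is a minimal sketch, not an official tool: the function name is invented, and the pattern is assumed from the single log line quoted above, so other releases may word it differently.

```python
import re

# Pattern taken from the kernel message quoted above; the exact
# wording on other releases is an assumption.
HUNG_RE = re.compile(r"I/O to pool '([^']+)' appears to be hung")


def hung_pools(lines):
    """Return names of pools reported as hung, in order of first appearance."""
    seen = []
    for line in lines:
        match = HUNG_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    sample = [
        "May 18 04:44:36 zfs01 genunix: [ID 918906 kern.notice] "
        "I/O to pool 'dpool' appears to be hung.",
    ]
    print(hung_pools(sample))
```

In practice you would feed it the lines of your system log file rather than the sample list.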


<SNIP!>


The disks are all 4TB WD40001FYYG enterprise SAS drives.  Googling seems to 
indicate it is a known problem with the way the various subsystems sometimes 
interact. Is there any way to fix/workaround this issue?


Pull the drive.  I'm assuming you have a raidz or mirrored setup where you can 
do that, right?  Or is it a question of finding *which* drive failed?

Dan


_______________________________________________
OmniOS-discuss mailing list
[email protected]
http://lists.omniti.com/mailman/listinfo/omnios-discuss
