Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-15 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 05:02:39PM +0200, Fabian Keil wrote:
 Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Mon, Sep 15, 2008 at 10:37:18AM +0800, Wilkinson, Alex wrote:
   0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 
   
About the only real improvement I'd like to see in this setup
is the ability to spin down idle drives.  That would be an
ideal setup for the home RAID array.
   
   There is a FreeBSD port which handles this, although such a
   feature should ideally be part of the ata(4) system (as should
   TCQ/NCQ and a slew of other things -- some of those are being
   worked on).
   
   And the port is ?
  
  Is it that hard to use 'make search' or grep?  :-)  sysutils/ataidle
 
 You also might want to have a look at atacontrol(8)'s spindown command.

The appropriate ata(4) changes and extension of atacontrol(8) to support
spindown was MFC'd (to RELENG_7) only 5 weeks ago.  It's fairly
unlikely that most users know this feature was MFC'd (case in point, I
was not).

http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/atacontrol/atacontrol.c
has the details, see Revision 1.43.2.2.
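
For anyone wanting to try it, usage should look roughly like the
following (the device name and timeout are only examples; check
atacontrol(8) on an updated RELENG_7 for the exact syntax):

  atacontrol spindown ad4        # report the current spindown setting
  atacontrol spindown ad4 1800   # spin ad4 down after 30 minutes idle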

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-14 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 01:23:39PM +0800, Wilkinson, Alex wrote:
 0n Sun, Sep 14, 2008 at 09:28:28PM -0700, Jeremy Chadwick wrote: 
 
 On Mon, Sep 15, 2008 at 10:37:18AM +0800, Wilkinson, Alex wrote:
  0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 
  
   About the only real improvement I'd like to see in this setup is
   the ability to spin down idle drives.  That would be an ideal setup
   for the home RAID array.
  
  There is a FreeBSD port which handles this, although such a feature
  should ideally be part of the ata(4) system (as should TCQ/NCQ and a
  slew of other things -- some of those are being worked on).
  
  And the port is ?
 
 Is it that hard to use 'make search' or grep?  :-)  sysutils/ataidle
 
 When you don't know the string to search on ... yes.

Give me a break.  :-)  idle|sleep|suspend|spin
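
For example, with a ports tree under /usr/ports (the keywords are just
guesses you would iterate on):

  cd /usr/ports
  make search key=spin | less
  make search name=idle | less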


-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-14 Thread Wilkinson, Alex
0n Sun, Sep 14, 2008 at 09:28:28PM -0700, Jeremy Chadwick wrote: 

On Mon, Sep 15, 2008 at 10:37:18AM +0800, Wilkinson, Alex wrote:
 0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 
 
  About the only real improvement I'd like to see in this setup is the
  ability to spin down idle drives.  That would be an ideal setup for
  the home RAID array.
 
 There is a FreeBSD port which handles this, although such a feature
 should ideally be part of the ata(4) system (as should TCQ/NCQ and a
 slew of other things -- some of those are being worked on).
 
 And the port is ?

Is it that hard to use 'make search' or grep?  :-)  sysutils/ataidle

When you don't know the string to search on ... yes.

 -aW

IMPORTANT: This email remains the property of the Australian Defence 
Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 
1914.  If you have received this email in error, you are requested to contact 
the sender and delete the email.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-14 Thread Wilkinson, Alex
0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 

 About the only real improvement I'd like to see in this setup is the
 ability to spin down idle drives.  That would be an ideal setup for the
 home RAID array.

There is a FreeBSD port which handles this, although such a feature
should ideally be part of the ata(4) system (as should TCQ/NCQ and a
slew of other things -- some of those are being worked on).

And the port is ?

 -aW

IMPORTANT: This email remains the property of the Australian Defence 
Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 
1914.  If you have received this email in error, you are requested to contact 
the sender and delete the email.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Karl Pielorz


Hi,

Recently, a ZFS pool on my FreeBSD box started showing lots of errors on 
one drive in a mirrored pair.


The pool consists of around 14 drives (as 7 mirrored pairs), hung off of a 
couple of SuperMicro 8 port SATA controllers (1 drive of each pair is on 
each controller).


One of the drives started picking up a lot of errors (by the end of things 
it was returning errors pretty much for any reads/writes issued) - and 
taking ages to complete the I/O's.


However, ZFS kept trying to use the drive - e.g. as I attached another 
drive to the remaining 'good' drive in the mirrored pair, ZFS was still 
trying to read data off the failed drive (and remaining good one) in order 
to complete its re-silver to the newly attached drive.


Having posted on the Open Solaris ZFS list - it appears, under Solaris 
there's an 'FMA Engine' which communicates drive failures and the like to 
ZFS - advising ZFS when a drive should be marked as 'failed'.


Is there anything similar to this on FreeBSD yet? - i.e. Does/can anything 
on the system tell ZFS "this drive's experiencing failures" rather than ZFS 
just seeing lots of timed out I/O 'errors'? (as appears to be the case).


In the end, the failing drive was timing out literally every I/O - I did 
recover the situation by detaching it from the pool (which hung the machine 
- probably caused by ZFS having to update the meta-data on all drives, 
including the failed one). A reboot brought the pool back, minus the 
'failed' drive, so enough of the 'detach' must have completed.
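
For reference, the detach in question is just the standard ZFS command;
with placeholder pool/device names rather than the real ones, it is of
the form:

  zpool detach tank ad6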


The newly attached drive completed the re-silver in half an hour (as 
opposed to an estimated 755 hours and climbing with the other drive still 
in the pool, limping along).


-Kp

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 10:45:24AM +0100, Karl Pielorz wrote:
 Recently, a ZFS pool on my FreeBSD box started showing lots of errors on  
 one drive in a mirrored pair.

 The pool consists of around 14 drives (as 7 mirrored pairs), hung off of 
 a couple of SuperMicro 8 port SATA controllers (1 drive of each pair is 
 on each controller).

 One of the drives started picking up a lot of errors (by the end of 
 things it was returning errors pretty much for any reads/writes issued) - 
 and taking ages to complete the I/O's.

 However, ZFS kept trying to use the drive - e.g. as I attached another  
 drive to the remaining 'good' drive in the mirrored pair, ZFS was still  
 trying to read data off the failed drive (and remaining good one) in 
 order to complete its re-silver to the newly attached drive.

 Having posted on the Open Solaris ZFS list - it appears, under Solaris  
 there's an 'FMA Engine' which communicates drive failures and the like to 
 ZFS - advising ZFS when a drive should be marked as 'failed'.

 Is there anything similar to this on FreeBSD yet? - i.e. Does/can 
 anything on the system tell ZFS "this drive's experiencing failures" 
 rather than ZFS just seeing lots of timed out I/O 'errors'? (as appears 
 to be the case).

As far as I know, there is no such standard mechanism in FreeBSD.  If
the drive falls off the bus entirely (e.g. detached), I would hope ZFS
would notice that.  I can imagine it (might) also depend on if the disk
subsystem you're using is utilising CAM or not (e.g. disks should be daX
not adX); Scott Long might know if something like this is implemented in
CAM.  I'm fairly certain nothing like this is implemented in ata(4).

Ideally, it would be the job of the controller and controller driver to
announce whether underlying I/O operations fail or succeed.  Do you agree?

I hope this FMA Engine on Solaris only *tells* underlying pieces of
I/O errors, rather than acting on them (e.g. automatically yanking the
disk off the bus for you).  I'm in no way shunning Solaris, I'm simply
saying such a mechanism could be as risky/deadly as it could be useful.

 In the end, the failing drive was timing out literally every I/O - I did  
 recover the situation by detaching it from the pool (which hung the 
 machine - probably caused by ZFS having to update the meta-data on all 
 drives, including the failed one). A reboot brought the pool back, minus 
 the 'failed' drive, so enough of the 'detach' must have completed.

 The newly attached drive completed the re-silver in half an hour (as  
 opposed to an estimated 755 hours and climbing with the other drive still 
 in the pool, limping along).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Karl Pielorz



--On 12 September 2008 06:21 -0700 Jeremy Chadwick [EMAIL PROTECTED] 
wrote:



As far as I know, there is no such standard mechanism in FreeBSD.  If
the drive falls off the bus entirely (e.g. detached), I would hope ZFS
would notice that.  I can imagine it (might) also depend on if the disk
subsystem you're using is utilising CAM or not (e.g. disks should be daX
not adX); Scott Long might know if something like this is implemented in
CAM.  I'm fairly certain nothing like this is implemented in ata(4).


For ATA, at the moment - I don't think it'll notice even if a drive 
detaches. I think like my system the other day, it'll just keep issuing I/O 
commands to the drive, even if it's disappeared (it might get much 'quicker 
failures' if the device has 'gone' to the point of FreeBSD just quickly 
returning 'fail' for every request).



Ideally, it would be the job of the controller and controller driver to
announce whether underlying I/O operations fail or succeed.  Do you agree?

I hope this FMA Engine on Solaris only *tells* underlying pieces of
I/O errors, rather than acting on them (e.g. automatically yanking the
disk off the bus for you).  I'm in no way shunning Solaris, I'm simply
saying such a mechanism could be as risky/deadly as it could be useful.


Yeah, I guess so - I think the way it's meant to happen (and this is only 
AFAIK) is that FMA 'detects' a failing drive by applying some configurable 
policy to it. That policy would also include notifying ZFS, so that ZFS 
could then decide to stop issuing I/O commands to that device.


None of this seems to be in place, at least for ATA under FreeBSD - when a 
drive goes bad, you can just end up with 'hours' worth of I/O timeouts, 
until someone intervenes.


I did enquire on the Open Solaris list about setting limits for 'errors' in 
ZFS, which netted me a reply that it's FMA (at least in Solaris) that's 
responsible for this - it just then informs ZFS of the condition. We don't 
appear (again at least for ATA) to have anything similar for FreeBSD yet :(


-Kp

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Oliver Fromme
Karl Pielorz wrote:
  Recently, a ZFS pool on my FreeBSD box started showing lots of errors on 
  one drive in a mirrored pair.
  
  The pool consists of around 14 drives (as 7 mirrored pairs), hung off of a 
  couple of SuperMicro 8 port SATA controllers (1 drive of each pair is on 
  each controller).
  
  One of the drives started picking up a lot of errors (by the end of things 
  it was returning errors pretty much for any reads/writes issued) - and 
  taking ages to complete the I/O's.
  
  However, ZFS kept trying to use the drive - e.g. as I attached another 
  drive to the remaining 'good' drive in the mirrored pair, ZFS was still 
  trying to read data off the failed drive (and remaining good one) in order 
  to complete its re-silver to the newly attached drive.
  
  Having posted on the Open Solaris ZFS list - it appears, under Solaris 
  there's an 'FMA Engine' which communicates drive failures and the like to 
  ZFS - advising ZFS when a drive should be marked as 'failed'.
  
  Is there anything similar to this on FreeBSD yet? - i.e. Does/can anything 
  on the system tell ZFS "this drive's experiencing failures" rather than ZFS 
  just seeing lots of timed out I/O 'errors'? (as appears to be the case).
  
  In the end, the failing drive was timing out literally every I/O - I did 
  recover the situation by detaching it from the pool (which hung the machine 
  - probably caused by ZFS having to update the meta-data on all drives, 
  including the failed one). A reboot brought the pool back, minus the 
  'failed' drive, so enough of the 'detach' must have completed.

Did you try atacontrol detach to remove the disk from
the bus?  I haven't tried that with ZFS, but gmirror
automatically detects when a disk has gone away, and
doesn't try to do anything with it anymore.  It certainly
should not hang the machine.  After all, what's the
purpose of a RAID when you have to reboot upon drive
failure.  ;-)
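
Something along these lines (the channel number is only an example --
check the atacontrol list output to see which channel the bad disk
is on):

  atacontrol list           # show channels and the devices attached
  atacontrol detach ata2    # drop the devices on that channel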

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

C++ is over-complicated nonsense. And Bjorn Shoestrap's book
a danger to public health. I tried reading it once, I was in
recovery for months.
-- Cliff Sarginson
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Freddie Cash
On September 12, 2008 02:45 am Karl Pielorz wrote:
 Recently, a ZFS pool on my FreeBSD box started showing lots of errors
 on one drive in a mirrored pair.

 The pool consists of around 14 drives (as 7 mirrored pairs), hung off
 of a couple of SuperMicro 8 port SATA controllers (1 drive of each pair
 is on each controller).

 One of the drives started picking up a lot of errors (by the end of
 things it was returning errors pretty much for any reads/writes issued)
 - and taking ages to complete the I/O's.

 However, ZFS kept trying to use the drive - e.g. as I attached another
 drive to the remaining 'good' drive in the mirrored pair, ZFS was still
 trying to read data off the failed drive (and remaining good one) in
 order to complete its re-silver to the newly attached drive.

For the one time I've had a drive fail, and the three times I've replaced 
drives for larger ones, the process used was:

  zpool offline pool old device
  remove old device
  insert new device
  zpool replace pool old device new device

For one machine, I had to shut it off after the offline, as it didn't have 
hot-swappable drive bays.  For the other machine, it did everything while 
online and running.

IOW, the old device never had a chance to interfere with anything.  Same 
process we've used with hardware RAID setups in the past.
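
With made-up names (pool tank, failed disk ad4, replacement ad6) that
boils down to:

  zpool offline tank ad4
  # ...physically swap the drive...
  zpool replace tank ad4 ad6

If the replacement shows up under the same device name, a plain
zpool replace tank ad4 is enough.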

 Is there anything similar to this on FreeBSD yet? - i.e. Does/can
 anything on the system tell ZFS "this drive's experiencing failures"
 rather than ZFS just seeing lots of timed out I/O 'errors'? (as appears
 to be the case).

Beyond the periodic script that checks for things like this, and sends 
root an e-mail, I haven't seen anything.
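
Not the stock periodic script, but a minimal home-grown equivalent is a
cron job along these lines (assumes mail to root ends up somewhere
useful):

  #!/bin/sh
  # complain only when zpool status -x reports something unhealthy
  out=`zpool status -x`
  [ "$out" != "all pools are healthy" ] && \
      echo "$out" | mail -s "zpool problem on `hostname`" root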

-- 
Freddie Cash
[EMAIL PROTECTED]
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 03:34:30PM +0100, Karl Pielorz wrote:
 --On 12 September 2008 06:21 -0700 Jeremy Chadwick [EMAIL PROTECTED]  
 wrote:

 As far as I know, there is no such standard mechanism in FreeBSD.  If
 the drive falls off the bus entirely (e.g. detached), I would hope ZFS
 would notice that.  I can imagine it (might) also depend on if the disk
 subsystem you're using is utilising CAM or not (e.g. disks should be daX
 not adX); Scott Long might know if something like this is implemented in
 CAM.  I'm fairly certain nothing like this is implemented in ata(4).

 For ATA, at the moment - I don't think it'll notice even if a drive  
 detaches. I think like my system the other day, it'll just keep issuing 
 I/O commands to the drive, even if it's disappeared (it might get much 
 'quicker failures' if the device has 'gone' to the point of FreeBSD just 
 quickly returning 'fail' for every request).

I know ATA will notice a detached channel, because I myself have done
it: administratively, that is -- atacontrol detach ataX.  But the only
time that can happen automatically is if the actual controller does
so itself, or if FreeBSD is told to do it administratively.

What this does to other parts of the kernel and userland applications is
something I haven't tested.  I *can* tell you that there are major,
major problems with detach/reattach/reinit on ata(4) causing kernel
panics and other such things.  I've documented this quite thoroughly in
my Common FreeBSD issues wiki:

http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

I am also very curious to know the exact brand/model of 8-port SATA
controller from Supermicro you are using, *especially* if it uses ata(4)
rather than CAM and da(4).  Such Supermicro controllers were recently
discussed on freebsd-stable (or was it -hardware?), and no one was able
to come to a concise decision as to whether or not they were decent or
even remotely trusted.  Supermicro provides a few different SATA HBAs.

 Ideally, it would be the job of the controller and controller driver to
 announce whether underlying I/O operations fail or succeed.  Do you agree?

 I hope this FMA Engine on Solaris only *tells* underlying pieces of
 I/O errors, rather than acting on them (e.g. automatically yanking the
 disk off the bus for you).  I'm in no way shunning Solaris, I'm simply
 saying such a mechanism could be as risky/deadly as it could be useful.

 Yeah, I guess so - I think the way it's meant to happen (and this is only 
 AFAIK) is that FMA 'detects' a failing drive by applying some 
 configurable policy to it. That policy would also include notifying ZFS, 
 so that ZFS could then decide to stop issuing I/O commands to that 
 device.

It sounds like that is done very differently than on FreeBSD.  If such a
condition happens on FreeBSD (disk errors scrolling by, etc.), the only
way I know of to get FreeBSD to stop sending commands through the ATA
subsystem is to detach the channel (atacontrol detach ataX).

 None of this seems to be in place, at least for ATA under FreeBSD - when 
 a drive goes bad, you can just end up with 'hours' worth of I/O timeouts, 
 until someone intervenes.

I can see the usefulness in Solaris's FMA thing.  My big concern is
whether or not FMA actually pulls the disk off the channel, or if it
just leaves the disk/channel connected and simply informs kernel pieces
not to use it.  If it pulls the disk off the channel, I have serious
qualms with it.

There are also chips on SATA and SCSI controllers which can cause chaos
as well -- specifically, SES/SES2 chips (I'm looking at you, QLogic).
These are supposed to be smart chips that detect when there are a
large number of transport or hardware errors (implying cabling issues,
etc.) and *automatically* yank the disk off the bus.  Sounds great on
paper, but in the field, I see these chips start pulling disks off the
bus, changing SCSI IDs on devices, or induce what appear to be full SCSI
subsystem timeouts (e.g. the SES/SES2 chip has locked up/crashed in some
way, and now your entire bus is dead in the water).  I have seen all of
the above bugs with onboard Adaptec 320 controllers, the systems running
Solaris 8, 9, and OpenSolaris.  Most times it turns out to be the
SES/SES2 chip getting in the way.

 I did enquire on the Open Solaris list about setting limits for 'errors' 
 in ZFS, which netted me a reply that it's FMA (at least in Solaris) 
 that's responsible for this - it just then informs ZFS of the condition. 
 We don't appear (again at least for ATA) to have anything similar for 
 FreeBSD yet :(

My recommendation to people these days is to avoid ata(4) on FreeBSD at
all costs if they expect to encounter disk or hardware failures.  The
ata(4) layer is in no way shape or form reliable in the case of
transport or disk failures, and even sometimes in the case of hot-
swapping.  Try your hardest to find a physical controller that supports
SATA disks and uses CAM/da(4), which WILL provide that 

Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Zaphod Beeblebrox
On Fri, Sep 12, 2008 at 11:44 AM, Oliver Fromme [EMAIL PROTECTED] wrote:


 Did you try atacontrol detach to remove the disk from
 the bus?  I haven't tried that with ZFS, but gmirror
 automatically detects when a disk has gone away, and
 doesn't try to do anything with it anymore.  It certainly
 should not hang the machine.  After all, what's the
 purpose of a RAID when you have to reboot upon drive
 failure.  ;-)


To be fair, many home users run RAID without the expectation of being able
to hot swap the drives.  While RAID can provide high availability, it
can also provide simple data security.

In my home environment, I have a number of machines running.  I have a few
things on non-redundant disks --- mostly operating systems or local archives
of internet data (like a cvsup server, for instance).  Those disks can be
lost, and while it's a nuisance, it's not catastrophic.

Other things (from family photos to mp3s to other media) I keep on home RAID
arrays.  They're not hot swap... but I've had quite a few disks go bad over
the years.  I actually welcome ZFS for this --- the idea that checksums are
kept makes me feel a lot more secure about my data.  I have observed some
bitrot over time on some data.

To your point... I suppose you have to reboot at some point after the drive
failure, but my experience has been that the reboot has been under my
control some time after the failure (usually when I have the replacement
drive).

For the home user, this can be quite inexpensive, too.  I've found a case
that can take 19 drives internally (and has good cooling) for about $125.
If you used some of the 5-to-3 drive bays, that number would increase to 25.

About the only real improvement I'd like to see in this setup is the ability
to spin down idle drives.  That would be an ideal setup for the home RAID
array.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 09:04:22AM -0700, Jeremy Chadwick wrote:
 What this does to other parts of the kernel and userland applications is
 something I haven't tested.  I *can* tell you that there are major,
 major problems with detach/reattach/reinit on ata(4) causing kernel
 panics and other such things.  I've documented this quite thoroughly in
 my Common FreeBSD issues wiki:
 
 http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

This should have read: ... in my ATA/SATA issues and troubleshooting
methods page:

http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Zaphod Beeblebrox
On Fri, Sep 12, 2008 at 10:34 AM, Karl Pielorz [EMAIL PROTECTED] wrote:


 --On 12 September 2008 06:21 -0700 Jeremy Chadwick [EMAIL PROTECTED]
 wrote:

  As far as I know, there is no such standard mechanism in FreeBSD.  If
 the drive falls off the bus entirely (e.g. detached), I would hope ZFS
 would notice that.  I can imagine it (might) also depend on if the disk
 subsystem you're using is utilising CAM or not (e.g. disks should be daX
 not adX); Scott Long might know if something like this is implemented in
 CAM.  I'm fairly certain nothing like this is implemented in ata(4).


 For ATA, at the moment - I don't think it'll notice even if a drive
 detaches. I think like my system the other day, it'll just keep issuing I/O
 commands to the drive, even if it's disappeared (it might get much 'quicker
 failures' if the device has 'gone' to the point of FreeBSD just quickly
 returning 'fail' for every request).


Since I had the opportunity, I tested this recently for both CAM and ATA.
Now the RAID engine was gmirror in both cases (my production hardware
doesn't do ZFS yet), but I expect the reaction to be somewhat the same.

Both systems were Dell 1U's.  One, an R200, had SATA disks attached to a
plain SATA controller.  I believe it may have supported RAID1, but I didn't
use that functionality.  When a drive was removed from it, it stalled for
some time (30 minutes?) and then resumed working.  By the time I could type
on the machine again, gmirror had decided that the drive was gone and marked
the mirror as degraded.

The other system was a 1950-III with a SCSI SAS controller attached to an
SAS hot-swap backplane.  The drives themselves were 750G SATA drives.
Yanking one of them resulted in about 5 seconds of disruption followed by
gmirror realizing the problem and marking the mirror degraded.

Neither system was heavily loaded during the test.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 12:04:27PM -0400, Zaphod Beeblebrox wrote:
 On Fri, Sep 12, 2008 at 11:44 AM, Oliver Fromme [EMAIL PROTECTED] wrote:
  Did you try atacontrol detach to remove the disk from
  the bus?  I haven't tried that with ZFS, but gmirror
  automatically detects when a disk has gone away, and
  doesn't try to do anything with it anymore.  It certainly
  should not hang the machine.  After all, what's the
  purpose of a RAID when you have to reboot upon drive
  failure.  ;-)
 
 To be fair, many home users run RAID without the expectation of being able
 to hot swap the drives.  While RAID can provide high availability, it
 can also provide simple data security.

RAID only ensures a very, very tiny part of data security, and it
depends greatly on what RAID implementation you use.  No RAID
implementation I know of protects against transparent data corruption
(bit-rot), and many RAID controllers and RAID drivers have bugs that
induce corruption (to date, that's (very old ATA) Highpoint chips,
nVidia/nForce chips, JMicron or Silicon Image chips -- all of these are
used on consumer boards).

A big problem is also that end-users *still* think RAID is a replacement
for doing backups.  :-(

 To your point... I suppose you have to reboot at some point after the drive
 failure, but my experience has been that the reboot has been under my
 control some time after the failure (usually when I have the replacement
 drive).

For home use, sure.  Since most home/consumer systems do not include
hot-swappable drive bays, rebooting is required.  Although more and more
consumer motherboards are offering AHCI -- which is the only reliable
way you'll get that capability with SATA.

In my case with servers in a co-lo, it's not acceptable.  Our systems
contain SATA backplanes that support hot-swapping, and it works how it
should (yank the disk, replace with a new one) on Linux -- there is no
need to do a bunch of hoopla like on FreeBSD.  On FreeBSD, with that
hoopla, you also take the risk of inducing a kernel panic.  That risk does
not sit well with me, but thankfully I've only been in that situation
(replacing a bad disk + using hot-swapping) once -- and it did work.

At my home, I have a pseudo-NAS system running FreeBSD.  The case is
from Supermicro, a mid-tower, and has a SATA backplane that supports
hot-swapping.  I use ZFS on this system, sporting 3 disks and one
(non-ZFS) for boot/OS.  But because I'm using ata(4) -- see above.

Individuals on -stable and other lists using ZFS have posted their
experiences with disk failures.  I believe to date I've seen one which
worked flawlessly, and the others reporting strange issues with
resilvering, or in a couple cases, lost all their zpools permanently.
Of course, it's very rare in this day and age for people to mail a
mailing list reporting *successes* with something -- people usually only
mail if something *fails*.  :-)

That said, pjd@'s dedication to getting ZFS working reliably on FreeBSD
is outstanding.  It's a great filesystem replacement, and even the Linux
folks are a bit jealous over how simple and painless it is.  I can
share their jealousy -- I've looked at the LVM docs... never again.

 About the only real improvement I'd like to see in this setup is the ability
 to spin down idle drives.  That would be an ideal setup for the home RAID
 array.

There is a FreeBSD port which handles this, although such a feature
should ideally be part of the ata(4) system (as should TCQ/NCQ and a
slew of other things -- some of those are being worked on).
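
For what it's worth, getting the port going is roughly (the -S flag and
its meaning are from memory -- verify against ataidle(8) after
installing):

  cd /usr/ports/sysutils/ataidle && make install clean
  ataidle -S 30 /dev/ad4    # request standby after 30 idle minutes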

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Zaphod Beeblebrox
On Fri, Sep 12, 2008 at 12:32 PM, Jeremy Chadwick [EMAIL PROTECTED] wrote:

 On Fri, Sep 12, 2008 at 12:04:27PM -0400, Zaphod Beeblebrox wrote:
  On Fri, Sep 12, 2008 at 11:44 AM, Oliver Fromme [EMAIL PROTECTED]
 wrote:
   Did you try atacontrol detach to remove the disk from
   the bus?  I haven't tried that with ZFS, but gmirror
   automatically detects when a disk has gone away, and
   doesn't try to do anything with it anymore.  It certainly
   should not hang the machine.  After all, what's the
   purpose of a RAID when you have to reboot upon drive
   failure.  ;-)
 
  To be fair, many home users run RAID without the expectation of being
 able
  to hot swap the drives.  While RAID can provide high availability, it
  can also provide simple data security.

 RAID only ensures a very, very tiny part of data security, and it
 depends greatly on what RAID implementation you use.  No RAID
 implementation I know of protects against transparent data corruption
 (bit-rot), and many RAID controllers and RAID drivers have bugs that


Well... this is/was a thread about ZFS.  ZFS does detect that bitrot _and_
correct it if it is possible.


 A big problem is also that end-users *still* think RAID is a replacement
 for doing backups


Well... this comment seems a bit off topic, but maybe (in some cases) RAID
is a substitute for doing backups.  I suppose it depends on your tolerance
and data value.  The sheer size of some datasets these days makes backup
prohibitively time consuming and/or expensive.  Then again (this is a ZFS
thread), ZFS helps with this: the ability to export snapshots to another
spinning pool makes a lot of sense.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Freddie Cash
On September 12, 2008 09:32 am Jeremy Chadwick wrote:
 For home use, sure.  Since most home/consumer systems do not include
 hot-swappable drive bays, rebooting is required.  Although more and
 more consumer motherboards are offering AHCI -- which is the only
 reliable way you'll get that capability with SATA.

 In my case with servers in a co-lo, it's not acceptable.  Our systems
 contain SATA backplanes that support hot-swapping, and it works how it
 should (yank the disk, replace with a new one) on Linux -- there is no
 need to do a bunch of hoopla like on FreeBSD.  On FreeBSD, with that
 hoopla, you also take the risk of inducing a kernel panic.  That risk does
 not sit well with me, but thankfully I've only been in that situation
 (replacing a bad disk + using hot-swapping) once -- and it did work.

Hrm, is this with software RAID or hardware RAID?

With our hardware RAID systems, the process has always been the same, 
regardless of which OS (Windows 2003 Servers, Debian Linux, FreeBSD) is 
on the system:
  - go into RAID management GUI, remove drive
  - pull dead drive from system
  - insert new drive into system
  - go into RAID management GUI, make sure it picked up new drive and 
started the rebuild

We've been lucky so far, and not had to do any drive replacements on our 
non-ZFS software RAID systems (md on Debian, gmirror on FreeBSD).  I'm 
not looking forward to a drive failing, as these systems have 
non-hot-pluggable SATA setups.

On the ZFS systems, we just zpool offline the drive, physically replace 
the drive, and zpool replace the drive.  On one system, this was done 
via hot-pluggable SATA backplane, on another, it required a reboot.

-- 
Freddie Cash
[EMAIL PROTECTED]
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 10:12:09AM -0700, Freddie Cash wrote:
 On September 12, 2008 09:32 am Jeremy Chadwick wrote:
  For home use, sure.  Since most home/consumer systems do not include
  hot-swappable drive bays, rebooting is required.  Although more and
  more consumer motherboards are offering AHCI -- which is the only
  reliable way you'll get that capability with SATA.
 
  In my case with servers in a co-lo, it's not acceptable.  Our systems
  contain SATA backplanes that support hot-swapping, and it works how it
  should (yank the disk, replace with a new one) on Linux -- there is no
  need to do a bunch of hoopla like on FreeBSD.  On FreeBSD, with that
  hoopla, you also take the risk of inducing a kernel panic.  That risk does
  not sit well with me, but thankfully I've only been in that situation
  (replacing a bad disk + using hot-swapping) once -- and it did work.
 
 Hrm, is this with software RAID or hardware RAID?

I do not use either, but have tried software RAID (Intel MatrixRAID) in
the past (and major, MAJOR bugs are why I do not any longer).  Speaking
(mostly) strictly of FreeBSD, let me list off the problems with both:

Software RAID:

1) Buggy as hell.  Using Intel MatrixRAID as an example, even with
   RAID 1, due to ata(4) driver bugs, you are practically guaranteed
   to lose your data
2) Limited userland interface to RAID BIOS; many operations do not
   work with atacontrol, requiring a system reboot + entering BIOS
   to do things like add/remove disks or rebuild an array
3) SMART monitoring lost; if the card or BIOS supports passthrough
   (basically ATA version of pass(4)), FreeBSD will see the disks
   natively (e.g. arX for the RAID, ad4 and ad8 for the disks), and
   you can use smartmontools (quick example after these lists).  Otherwise, you're screwed
4) Support is questionable; numerous mainstream chips unsupported,
   including Adaptec HostRAID

Hardware RAID:

1) You are locked in to that controller.  Your data is at the
   mercy of the company who makes the HBA; if your controller dies
   and is no longer made, your data is dead in the water.  Chances
   are a newer model/revision of controller will not understand
   the disk metadata from the previous controller
2) Performance problems as a result of excessive caching levels;
   onboard hardware cache vs. system memory cache vs. disk layer
   cache in OS vs. other kernel caching mechanisms
3) Controller firmware upgrades are risky -- 3Ware has a very nasty
   history of this, for sake of example.  I've heard of some upgrades
   changing the metadata format, requiring complete array re-creation
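
To expand on item 3 of the software RAID list above: when the individual
disks are visible, the usual smartmontools invocation applies (the device
name is just an example):

  smartctl -a /dev/ad4    # full SMART report for one disk behind the array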
   
I can pull Ade Lovett [EMAIL PROTECTED] into this conversation if you
think any of the above is exaggerated.  :-)

The only hardware RAID controller I'd trust at this point would be
Areca -- but hardware RAID is not what I want.  On the other hand, I
really want Areca to make a standard 4 or 8-port SATA controller --
no RAID, but full driver support under arcmsr(4) (which uses CAM and
da(4)).  This would be perfect.

 With our hardware RAID systems, the process has always been the same, 
 regardless of which OS (Windows 2003 Servers, Debian Linux, FreeBSD) is 
 on the system:
   - go into RAID management GUI, remove drive
   - pull dead drive from system
   - insert new drive into system
   - go into RAID management GUI, make sure it picked up new drive and 
 started the rebuild

The simplicity there is correct -- that's really how simple it should
be.  But a GUI?  What card is this that requires a GUI?  Does it require
a reboot?  No command-line support?

 We've been lucky so far, and not had to do any drive replacements on our 
 non-ZFS software RAID systems (md on Debian, gmirror on FreeBSD).  I'm 
 not looking forward to a drive failing, as these systems have 
 non-hot-pluggable SATA setups.

I'm hearing you loud and clear.  :-)

 On the ZFS systems, we just zpool offline the drive, physically replace 
 the drive, and zpool replace the drive.  On one system, this was done 
 via hot-pluggable SATA backplane, on another, it required a reboot.

If this was done on the hardware RAID controller (presuming it uses
CAM and da(4)), I'm not surprised it worked perfectly.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]

