"L. V. Lammert" writes:
> On Fri, 3 Aug 2007, Joel Knight wrote:
> 
> > --- Quoting HDC on 2007/08/02 at 20:26 -0300:
> >
> > > Read this...
> > > http://www.packetmischief.ca/openbsd/doc/raidadmin/ <http://www.packetmischief.ca/openbsd/>
> > >
> >
> > I used to use raidframe and followed the procedures in that doc for
> > doing so, but now there's no point. If the system requires any type of
> > raid, go hardware. Long live bio(4).
> >
> IF you choose to NOT use a h/w controller, use rsync instead. Permits
> quick recovery in the case of a drive failure (swap drive cables &
> reboot), does not require lengthy parity rebuild.

And you only lose the data written since the last rsync... 
and your system probably goes down instead of staying up until you 
can fix it.. 

RAIDframe, like hardware RAID and rsync, is just another tool.  
Understand the pros and cons of each, but be willing to accept the 
risks associated with whatever you choose... (if you think hardware 
RAID is riskless, then you've never had a 2TB RAID set suddenly 
decide that all components were "offline" and mark them as such!)

For the folks who dislike the "long parity checks"... If you're 
willing to accept a window during which some of your data *might* be 
at risk, change: 
 raidctl -P all
to something like
 (sleep 3600 ; raidctl -P all) &
in /etc/rc .  This will, of course, delay the start of the parity 
computation for an hour or so, giving your system a chance to do the 
fscks and get back to multi-user as quickly as possible.
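
As a rough sketch of the delayed start described above (the names 
PARITY_DELAY and delayed_parity_check are made up for illustration and 
are not part of the stock /etc/rc), the sleep and the raidctl call can 
be grouped in a subshell so that the whole delay runs in the 
background.  Without the grouping, the shell applies the '&' only to 
the last command and the boot would sit in the sleep:

```shell
#!/bin/sh
# Sketch only: PARITY_DELAY and delayed_parity_check are illustrative
# names, not something shipped in /etc/rc.

# Seconds to wait before starting the parity rewrite (default: one hour).
PARITY_DELAY=${PARITY_DELAY:-3600}

delayed_parity_check() {
    # The parentheses run both commands in one subshell; the trailing '&'
    # backgrounds that subshell, so the caller returns immediately.
    ( sleep "$PARITY_DELAY" ; "$@" ) &
}

# In /etc/rc this would stand in for the plain 'raidctl -P all' line:
# delayed_parity_check raidctl -P all
```

The delay is kept in a variable only so it is easy to shorten for a 
test run; the trade-off is the same one discussed below.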

The risk here is as follows (this is for RAID 1.. risks for RAID 5 
are slightly higher): 
  1) even though parity is marked 'dirty', it might actually be in 
sync.  In this case if you have a component failure, your data is 
fine.
  2) until the parity check is done, only the 'master' component is 
used for reading.  But any writes that are done are mirrored to both 
components.  That means that when the fsck is being done, any 
problems found will be fixed on *both* components, and writes will 
keep the two in sync even before parity is checked.
  3) Where the risk of data loss comes in is if the master dies 
before the parity check gets done.  In this case, data on the master 
that was not re-written or that was out-of-sync with the slave will 
be lost.  This could result in the loss of pretty much anything.

The important thing here is for you to evaluate your situation and 
decide whether this level of risk is acceptable... For me, I use the 
equivalent to 'sleep 3600' on my home desktop.. and slightly modified 
versions of it on other home servers and other boxen I look after.. 
But don't blindly listen to me or anyone else -- learn what the risks 
are for your situation, determine what level of risk you can accept, 
and go from there...

Later...

Greg Oster
