Re: disks becoming slow but not explicitly failing anyone?

2006-05-04 Thread Bill Davidsen

Nix wrote:


On 23 Apr 2006, Mark Hahn stipulated:

I've seen a lot of cheap disks say (generally deep in the data sheet
that's only available online after much searching and that nobody ever
reads) that they are only reliable if used for a maximum of twelve hours
a day, or 90 hours a week, or something of that nature. Even server

I haven't, and I read lots of specs.  they _will_ sometimes say that
non-enterprise drives are intended or designed for an 8x5 desktop-like
usage pattern.

That's the phrasing, yes: foolish me assumed that meant `if you leave it
on for much longer than that, things will go wrong'.

   to the normal way of thinking about reliability, this would 
simply mean a factor of 4.2x lower reliability - say from 1M to 250K hours
MTBF.  that's still many times lower rate of failure than power supplies or 
fans.
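(For concreteness, the arithmetic being assumed there: 24x7 is 168
hours/week versus 40 hours/week for 8x5, a duty-cycle ratio of 168/40 =
4.2; dividing a 1M-hour MTBF by 4.2 gives roughly 240K hours, i.e. the
"1M to 250K" figure above, rounded.)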

Ah, right, it's not a drastic change.

It still stuns me that anyone would ever voluntarily buy drives that
can't be left switched on (which is perhaps why the manufacturers hide

I've definitely never seen any spec that stated that the drive had to be 
switched off.  the issue is really just what is the designed duty-cycle?

I see. So it's just `we didn't try to push the MTBF up as far as we would
on other sorts of disks'.

I run a number of servers which are used as compute clusters.  load is
definitely 24x7, since my users always keep the queues full.  but the servers
are not maxed out 24x7, and do work quite nicely with desktop drives
for years at a time.  it's certainly also significant that these are in a 
decent machineroom environment.

Yeah; i.e., cooled. I don't have a cleanroom in my house so the RAID
array I run there is necessarily uncooled, and the alleged aircon in the
room housing work's array is permanently on the verge of total collapse
(I think it lowers the temperature, but not by much).

it's unfortunate that disk vendors aren't more forthcoming with their drive
stats.  for instance, it's obvious that wear in MTBF terms would depend 
nonlinearly on the duty cycle.  it's important for a customer to know where 
that curve bends, and to try to stay in the low-wear zone.  similarly, disk

Agreed! I tend to assume that non-laptop disks hate being turned on and
hate temperature changes, so just keep them running 24x7. This seems to be OK,
with the only disks this has ever killed being Hitachi server-class disks in
a very expensive Sun server which was itself meant for 24x7 operation; the
cheaper disks in my home systems were quite happy. (Go figure...)

specs often just give a max operating temperature (often 60C!), which is 
almost disingenuous, since temperature has a superlinear effect on reliability.
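(As a very rough illustration only: the old rule of thumb for electronics
is that failure rates roughly double for every 10C or so of extra
temperature. If anything like that holds for drive mechanics, a disk held
near a 60C limit could be failing several times as often as one kept at
30-40C, which is why a bare maximum-temperature spec says so little.)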

I'll say. I'm somewhat twitchy about the uncooled 37C disks in one of my
machines: but one of the other disks ran at well above 60C for *years*
without incident: it was an old one with no onboard temperature sensing,
and it was perhaps five years after startup that I opened that machine
for the first time in years and noticed that the disk housing nearly
burned me when I touched it. The guy who installed it said that yes, it
had always run that hot, and was that important? *gah*

I got a cooler for that disk in short order.
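As an aside, most reasonably recent drives report their temperature via
SMART, so something along these lines will show it (the device names here
are only examples):

  smartctl -A /dev/hda | grep -i temp   # Temperature_Celsius, usually attribute 194
  hddtemp /dev/hda

Drives old enough to lack the attribute, like that one, can only be
checked by hand.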

a system designer needs to evaluate the expected duty cycle when choosing
disks, as well as many other factors which are probably more important.
for instance, an earlier thread concerned a vast amount of read traffic 
to disks resulting from atime updates.
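(For reference, the usual cure for that particular load is the noatime
mount option; something like the following, where the device and mount
point are only placeholders:

  mount -o remount,noatime /dev/md0 /data

or the equivalent noatime entry in /etc/fstab.)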

Oddly, I see a steady pulse of write traffic, ~100Kb/s, to one dm device
(translating into read+write on the underlying disks) even when the
system is quiescent, all daemons killed, and all fsen mounted with
noatime. One of these days I must fish out blktrace and see what's
causing it (but that machine is hard to quiesce like that: it's in heavy
use).
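(A rough sketch of how blktrace can answer that, given a kernel with
block-layer tracing support and with the dm device name below being only
a placeholder:

  blktrace -d /dev/dm-0 -o - | blkparse -i -

blkparse prints one line per request, including the PID and name of the
process that issued it, which is usually enough to finger the culprit.)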

simply using more disks also decreases the load per disk, though this is
clearly only a win if it makes the difference in staying out of the disks'
duty-cycle danger zone (since more disks divide system MTBF).
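(Roughly, and assuming independent failures: an array of N disks sees a
disk failure about N times as often as one disk does, so ten 500K-hour
drives mean a dead drive somewhere in the array every ~50K hours, i.e.
about once every six years.)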

Well, yes, but if you have enough more you can make some of them spares
and push up the MTBF again (and the cooling requirements, and the power
consumption: I wish there was a way to spin down spares until they were
needed, but non-laptop controllers don't often seem to provide a way to
spin anything down at all that I know of).

hdparm will let you set the spindown time. I have all mine set that way
for power and heat reasons, since they tend to be in burst use. Dropped
the CR temp by enough to notice, but I need some more local cooling for
that room still.
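For reference, something along these lines does it (the device name and
timeout are only examples, and not every drive honours them):

  hdparm -S 120 /dev/hda   # spin down after 120 * 5 s = 10 minutes idle
  hdparm -y /dev/hda       # put the drive into standby right now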


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Two-disk RAID5?

2006-05-04 Thread Bill Davidsen

Erik Mouw wrote:


On Wed, Apr 26, 2006 at 03:22:38PM -0400, Jon Lewis wrote:

On Wed, 26 Apr 2006, Jansen, Frank wrote:

It is not possible to flip a bit to change a set of disks from RAID 1 to
RAID 5, as the physical layout is different.
 

As Tuomas pointed out though, a 2 disk RAID5 is kind of a special case 
where all you have is data and parity which is actually also just data. 

No, the other way around: RAID1 is a special case of RAID5.

No it isn't. If you have N drives in RAID1 you have N independent copies
of the data and no parity; there's just no corresponding thing in RAID5,
which has one copy of the data plus parity. There is no special case, it
just doesn't work that way. Set N > 2 and report back.
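(The arithmetic behind the two-disk "special case", for what it's worth:
RAID4/5 parity is the XOR of the data blocks in a stripe, P = D1 xor D2
xor ... xor Dn. With a single data block per stripe, P = D1, so a two-disk
RAID5 holds two identical copies of every block, which is why it looks so
much like a RAID1 pair on disk, give or take the rotating parity layout.
A three-way or N-way RAID1 mirror has no such RAID5 counterpart.)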


Sorry, I couldn't find a diplomatic way to say you're completely wrong.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Two-disk RAID5?

2006-05-04 Thread Bill Davidsen

John Rowe wrote:


I'm about to create a RAID1 file system and a strange thought occurs to
me: if I create a two-disk RAID5 array then I can grow it later by the
simple expedient of adding a third disk and hence doubling its size.

Is there any real down-side to this, such as performance? Alternatively,
is it likely that mdadm will be able to convert a RAID1 pair to RAID5 any
time soon? (Just how different are they anyway? Isn't the RAID4/5
checksum just an XOR?)

I think it works; I just set up a little test case with two 20MB files
on loop devices. The mdadm seems to work, the mke2fs seems to work, and
the f/s is there. Please verify, though; this system is a bit (okay, a
bunch) hacked.
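A sketch of that sort of test, in case anyone wants to repeat it (file
names and device numbers are only examples, and the grow step assumes a
kernel and mdadm recent enough to reshape raid5):

  dd if=/dev/zero of=d0.img bs=1M count=20
  dd if=/dev/zero of=d1.img bs=1M count=20
  losetup /dev/loop0 d0.img
  losetup /dev/loop1 d1.img
  mdadm --create /dev/md0 --level=5 --raid-devices=2 /dev/loop0 /dev/loop1
  mke2fs /dev/md0
  # later, to add a third disk and grow:
  dd if=/dev/zero of=d2.img bs=1M count=20
  losetup /dev/loop2 d2.img
  mdadm --add /dev/md0 /dev/loop2
  mdadm --grow /dev/md0 --raid-devices=3
  resize2fs /dev/md0   # once the reshape finishes, with the fs unmounted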


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html