Re: RAID? Was: PATA hard disks, anyone?

2018-03-29 Thread Chuck Guzis via cctalk
On 03/29/2018 03:48 PM, Alexander Schreiber via cctalk wrote:

> Also, AFS is built around volumes (think "virtual disks") and you have
> the concept of a r/w volume with (potentially) a pile of r/o volumes
> snapshotted from it. So one thing I did was that every (r/w) volume
> had a directory .backup in its root where there was mounted a r/o
> volume snapshotted from the r/w volume around midnight every day.

CDC 6000 SCOPE 3.3 and later implemented permanent files with file
"cycles".   That is, earlier versions of a file were kept around.

The approach was a little different.  You initially started a job or
session with no files except INPUT (wherever it came from), OUTPUT
(display or print output), and optionally PUNCH (obvious meaning).  A
dayfile was also maintained, but the individual user could only add to
it, not otherwise manipulate it.

To do real work on an ongoing project involved ATTACH-ing a permanent
file that had been CATALOG-ed.  Passwords (up to 3) and permissions
needed to be specified to ATTACH a file.  This, IIRC, created a local
copy of the file.   If you mistakenly deleted the local copy, you still
had the permanent copy.  If you saved the local copy after modifying it,
it was saved as a new cycle.

A user could PURGE old permanent file cycles.

The beauty of this was that a user had access to the files that were
needed for a session.  A user could, of course, create as many local
files as desired, but these were all disposed of at the end of the
job/session, so there wasn't a lot of garbage floating around in the system.

A side benefit was that permanent files could be archived to tape, so
when an ATTACH was issued for an archived file, the job was suspended
until the relevant tape was located and read.

I suspect that modern users would consider the system to be too
restrictive for today's tastes, but it was fine back then.

--Chuck



Re: RAID? Was: PATA hard disks, anyone?

2018-03-29 Thread Alexander Schreiber via cctalk
On Wed, Mar 28, 2018 at 01:17:08PM -0400, Ethan via cctalk wrote:
> > I know of no RAID setup that can save me from stupid.
> 
> I use rsync. I manually rsync the working disks to the backup disks every
> week or two. Working disks have the shares to other hosts. If something
> happens to that data, deleted by accident or encrypted by malware. Meh.
> 
> Hardware like netapp and maybe filesystems in open source have those awesome
> snapshot systems where there is a directory tree that has past versions of
> data. A directory of 15 minutes ago, one of 6 hours ago, etc is what we had
> setup at a prior gig.

At a prior job, I replaced the standard NFS+Samba filesharing mess (with
the regular "I need you to twiddle permissions" fun) with an AFS server.
Native clients for both Linux and Windows 2000. With access to the ACLs
built right into the native interfaces, so that regular call went away.

Also, AFS is built around volumes (think "virtual disks") and you have
the concept of a r/w volume with (potentially) a pile of r/o volumes
snapshotted from it. So one thing I did was that every (r/w) volume
had a directory .backup in its root where there was mounted a r/o
volume snapshotted from the r/w volume around midnight every day.

That killed about 95% of the "I accidentally deleted $FILE, can you please
dig it out of the backup" calls.

Plus, it made backups darn easy.

Last I heard, after I left that place, they set up a second AFS server.

Oh, AFS as in: the Andrew File System

Kind regards,
   Alex.
-- 
"Opportunity is missed by most people because it is dressed in overalls and
 looks like work."  -- Thomas A. Edison


Re: RAID? Was: PATA hard disks, anyone?

2018-03-29 Thread Alexander Schreiber via cctalk
On Tue, Mar 27, 2018 at 10:26:53PM -0300, Paul Berger via cctalk wrote:
> 
> 
> On 2018-03-27 10:05 PM, Ali via cctalk wrote:
> > 
> > 
> >  Original message 
> > From: Fred Cisin via cctalk 
> > Date: 3/27/18  5:51 PM  (GMT-08:00)
> > To: "General Discussion: On-Topic and Off-Topic Posts" 
> > 
> > Subject: RAID? Was: PATA hard disks, anyone?
> > 
> > How many drives would you need, to be able to set up a RAID, or hot
> > swappable RAUD (Redundant Array of Unreliable Drives), that could give
> > decent reliability with such drives?
> > 10 -
> > Two sets of 5 drive  RAID 6 volumes in a RAID 1 array.
> > You would then need to lose 5 drives before data failure is imminent. The 
> > 6th one will do you in. If you haven't fixed 50 percent failure then you 
> > deserve to lose your data.
> > Disclaimer: this is my totally unscientific unprofessional and biased 
> > estimate. My daily activities of life have nothing to do with the IT 
> > industry. Proceed at your own peril. Etc. Etc.
> > -Ali
> > 
> > 
> To meet Fred's original criteria you would only need 4 to create a minimal
> RAID 6 array.  In theory a RAID 1 array (mirrored) of 4 or more disks could
> also survive a second disk failure as long as one copy of each of the pairs
> in the array survives, but you are starting to play the odds, and I know of
> some cases where people have lost data. You can improve the odds by having a
> hot spare that automatically takes over for a failed disk.  One of the most
> important things is that the array manager has to have some way of notifying
> you that there has been a failure so that you can take action; however, my
> observation as a hardware support person is that even when there is error
> notification it is often missed or ignored until subsequent failures kill
> off the array.   It also appears to be a fairly common notion that if you
> have RAID there is no need to ever back up, but I assure you RAID is not
> foolproof and arrays do fail.

Repeat 10 times after me: "RAID is NOT backup".

If you only have online backup, you don't have backup, you have
easy-to-erase/corrupt copies.

If you don't have offline offsite backup, you don't have backup, you have
copies that will die when your facility/house/datacenter burns down/gets
flooded/broken into and looted.

And yes, in a previous job I did data recovery from a machine that
sat in a flooded store. It was nicely light-brown (from the muck in the water)
until about 2cm below the tape drive, so the last backup tape survived.
It missed about 24h of store sales data - which _did_ exist as paper
copies, but typing those in by hand ... yuck.

So we shipped the machine to the head office, removed the covers,
put it into a room together with some space heaters and fans blowing
directly on it and left it for two weeks to dry out.

Then fired it up and managed to scrape all the database data off it
while hearing and seeing (in the system logs) the disks dying.

Why didn't they have offsite backups? Well, that was about 12 years ago
and at that time, having sufficiently fat datalinks between every store
(lots of them) and the head office was deemed just way too [obscenity]
expensive. We did have datalinks to all of them, so at least we got
realtime monitoring.

There are good reasons why part of my private backup strategy is
tapes sitting in a bank vault.

I'm also currently dumping an it-would-massively-suck-to-lose-this dataset
to M-DISC BD media. There I'm reasonably confident about the long term
survival of the media, what worries me is the long term availability of
the _drives_. Ah well, if you care about the data, you'll eternally have
to forward-copy anyway.

>   One of the big problems facing the use of large disks to build arrays is
> that the number of accesses just to build the array may put a serious dent
> in the specced number of accesses before error or in some cases even exceed it.

That is actually becoming a problem, yes. More so for rebuilds - with
RAID5, you might encounter a second disk failure during rebuild, at
which point you are ... in a bad place. Forget about RAID5, go straight
to RAID6.
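
A back-of-the-envelope sketch of that failure mode (Python; it assumes
independent, exponentially distributed failures, and the drive count, MTBF
and rebuild window are all invented numbers):

    import math

    mtbf_hours = 100_000     # assumed per-drive MTBF (spec-sheet optimism)
    n_drives = 8             # drives in the hypothetical RAID5 set
    rebuild_hours = 24       # assumed time to swap the drive and rebuild

    lam = 1.0 / mtbf_hours   # exponential (memoryless) per-drive failure rate

    # Chance that at least one of the n-1 surviving drives dies while the
    # rebuild is still running -- the scenario that kills a RAID5:
    p = 1.0 - math.exp(-(n_drives - 1) * lam * rebuild_hours)
    print(f"P(second failure during rebuild) ~ {p:.3%}")
    # Small per incident, but it compounds over every rebuild you ever do,
    # and it ignores the unrecoverable-read-error problem discussed later
    # in the thread, which on big drives is the dominant risk.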

Kind regards,
Alex.
-- 
"Opportunity is missed by most people because it is dressed in overalls and
 looks like work."  -- Thomas A. Edison


Re: RAID? Was: PATA hard disks, anyone?

2018-03-29 Thread Peter Corlett via cctalk
On Wed, Mar 28, 2018 at 05:40:29PM -0700, Richard Pope via cctalk wrote:
> I have been kind of following this thread. I have a question about MTBF. I
> have four HGST UltraStar Enterprise 2TB drives set up in a Hardware RAID 10
> configuration. If the MTBF is 100,000 Hrs for each drive does this mean
> that the total MTBF is 25,000 Hrs?

That's the mean time before any one disk fails, but not the MTBF for the array
as a whole because failure of an individual disk doesn't cause the array to
fail. There needs to be at least one more disk failure for that to happen.

MTBF is also an overly simple measure which fails to account for the bathtub
curve and correlated failures. Attempts to compute the MTBF of an array from
the MTBF of the individual components will come up with a plausible number
which is technically correct yet bears no relation to the real world.
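
For what it's worth, here is what that plausible-but-meaningless arithmetic
looks like for the RAID 10 in question - a Python sketch assuming independent
exponential failures and an invented 24-hour rebuild window:

    mtbf = 100_000   # hours, the per-drive spec value from the question
    mttr = 24        # hours to replace and resync a failed mirror half (invented)
    n = 4            # drives, arranged as two mirrored pairs (RAID 10)

    # Mean time to the FIRST drive failure anywhere in the box:
    mttf_first = mtbf / n            # 25,000 h -- the number being asked about

    # Data is only lost if the failed drive's mirror partner also dies before
    # the rebuild finishes.  Textbook approximation:
    #   data-loss rate ~ (n / mtbf) * (mttr / mtbf)
    mttdl = mtbf ** 2 / (n * mttr)

    print(f"mean time to first drive failure : {mttf_first:,.0f} h")
    print(f"mean time to data loss (RAID 10) : {mttdl:,.0f} h")
    # ~104 million hours: technically correct under the assumptions, and
    # meaningless once correlated failures, the bathtub curve and read
    # errors during rebuild enter the picture.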

In practice, the only numbers on a typical hard disk datasheet which aren't
fantasy marketing puff are the physical dimensions and the number of sectors,
and even then only because those are industry-wide standards that disks must
conform to.



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Paul Koning via cctalk
It's not quite that bad.  The answer is that the MTBF of four drives is 
probably not simply the MTBF of one drive divided by four.  If you have a good 
description of the probability of failure as a function of drive age (i.e., a 
picture of its particular "bathtub curve") you can then work out the 
corresponding curve for multiple drives.  I like to leave the details of how to 
do this to appropriate mathematicians.

If all you have is a data sheet that says "MTBF is 1M hours" then you don't 
have enough information.  You can assume some distribution and figure 
accordingly, but if the actual distribution is sufficiently different from the 
guess then the answers you calculated may be significantly off.
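
A small sketch of how much the assumed distribution matters (Python; both
curves share the same 1M-hour MTBF, and the Weibull shape parameter is pulled
out of thin air):

    import math

    MTBF = 1_000_000   # hours -- the only number the data sheet gives you
    t = 24 * 365       # one year of power-on hours

    # Assumption 1: exponential, i.e. a constant hazard rate.
    p_exp = 1 - math.exp(-t / MTBF)

    # Assumption 2: Weibull with shape k=3 (wear-out dominated), rescaled so
    # the mean is still exactly MTBF:  mean = scale * Gamma(1 + 1/k)
    k = 3.0
    scale = MTBF / math.gamma(1 + 1 / k)
    p_weib = 1 - math.exp(-((t / scale) ** k))

    print(f"P(dead within a year), exponential : {p_exp:.4%}")
    print(f"P(dead within a year), Weibull k=3 : {p_weib:.6%}")
    # Same MTBF, answers several orders of magnitude apart -- which is the point.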

BTW, specified MTBF for modern drives is a whole lot higher than 100k hours.  
Real MTBF may differ from specified, and derating the manufacturer's number 
according to your preferred level of pessimism is probably a good idea. 

paul


> On Mar 28, 2018, at 9:57 PM, Richard Pope via cctalk  
> wrote:
> 
> Fred,
>I appreciate the explanation. So without 1,000, 10,000, or even 100,000 
> drives there is no way to know how long my drives in the RAID will last. All 
> I know for sure is that I can lose any one drive and the RAID can be rebuilt.
> GOD Bless and Thanks,
> rich!
> 
> On 3/28/2018 4:43 PM, Fred Cisin via cctalk wrote:
>> On Wed, 28 Mar 2018, Richard Pope via cctalk wrote:
>>>   I have been kind of following this thread. I have a question about MTBF. 
>>> I have four HGST UltraStar Enterprise 2TB drives setup in a Hardware RAID 
>>> 10 configuration. If the MTBF is 100,000 Hrs for each drive does this 
>>> mean that the total MTBF is 25,000 Hrs?
>> 
>> 
>> Probably NOT.
>> It depends extremely heavily on the shape of the curve of failure times.
>> MEAN Time Before Failure, of course, means that for a large enough sample, 
>> half the drives fail before 100,000 hours, and half after.  Thus, at 100,000 
>> hours, half are dead.
>> 
>> But, how evenly distributed are the failures? ...



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Richard Pope via cctalk

Fred,
I appreciate the explanation. So without 1,000, 10,000, or even 
100,000 drives there is no way to know how long my drives in the RAID 
will last. All I know for sure is that I can lose any one drive and the 
RAID can be rebuilt.

GOD Bless and Thanks,
rich!

On 3/28/2018 4:43 PM, Fred Cisin via cctalk wrote:

On Wed, 28 Mar 2018, Richard Pope via cctalk wrote:
   I have been kind of following this thread. I have a question about 
MTBF. I have four HGST UltraStar Enterprise 2TB drives setup in a 
Hardware RAID 10 configuration. If the MTBF is 100,000 Hrs for 
each drive does this mean that the total MTBF is 25,000 Hrs?



Probably NOT.
It depends extremely heavily on the shape of the curve of failure times.
MEAN Time Before Failure, of course, means that for a large enough 
sample, half the drives fail before 100,000 hours, and half after.  
Thus, at 100,000 hours, half are dead.


But, how evenly distributed are the failures?
Besides the MTBF, it would help to know the variance or standard 
deviation.
It is unlikely that the failures follow a "normal distribution" (or 
"Laplace-Gauss") bell curve.  And, other distributions are certainly 
not ABnormal :-)


If the curve is symmetrical, then the mean, median, and mode will all 
be the same.  If it is not symmetrical, then they won't be. Hence the 
use of MEDIAN - at that point half are dead, half are still alive.
In toxicology, there is a concept of an LD-50 dosage - the dosage that 
will kill half, since for example, antibiotic resistant bacteria might 
require an incredibly large dosage to get that last one, but LD-50 
provides a convenient way to get a single number.

100,000 hours is the LD-50 of those drives.


If it turns out that the drives last 100,000 hours, plus or minus 10%, 
then you have a curve with a very steep slope.  It is still half dead 
at 100,000, but maybe hardly any dead until 90,000, hardly any left 
alive at 110,000.


OTOH, if the failures were evenly distributed throughout a life of 0 
to 200,000 hours, with the same number going every day, then that also 
would have a MTBF of 100,000.   In THAT case, then yes, the MTBF of 
first failure may well be 25,000.



They rarely work that way.  Often our devices will have what is 
sometimes called a "bathtub curve".  There are a few failures 
IMMEDIATELY ("infant mortality") falling off rapidly, and then few 
failures for quite a while, and then, as random parts start to wear 
out, the failures rise. In fact, with the same MTBF of 100,000, it 
could be that once the early demise ones are discarded, that the MTBF 
of the REMAINDER might be 200,000.


IFF you are willing to deal with the DOA and infant mortality cases, 
then by discarding or ignoring those outlying numbers, you might get a 
more realistic evaluation of what to expect.



--
Grumpy Ol' Fred ci...@xenosoft.com





Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Fred Cisin via cctalk

On Wed, 28 Mar 2018, Richard Pope via cctalk wrote:
   I have been kind of following this thread. I have a question about MTBF. 
I have four HGST UltraStar Enterprise 2TB drives setup in a Hardware RAID 10 
configuration. If the MTBF is 100,000 Hrs for each drive does this mean 
that the total MTBF is 25,000 Hrs?



Probably NOT.
It depends extremely heavily on the shape of the curve of failure times.
MEAN Time Before Failure, of course, means that for a large enough sample, 
half the drives fail before 100,000 hours, and half after.  Thus, at 
100,000 hours, half are dead.


But, how evenly distributed are the failures?
Besides the MTBF, it would help to know the variance or standard 
deviation.
It is unlikely that the failures follow a "normal distribution" 
(or "Laplace-Gauss") bell curve.  And, other distributions are 
certainly not ABnormal :-)


If the curve is symmetrical, then the mean, median, and mode will all be 
the same.  If it is not symmetrical, then they won't be.  Hence the use of 
MEDIAN - at that point half are dead, half are still alive.
In toxicology, there is a concept of an LD-50 dosage - the dosage that 
will kill half, since for example, antibiotic resistant bacteria might 
require an incredibly large dosage to get that last one, but LD-50 
provides a convenient way to get a single number.

100,000 hours is the LD-50 of those drives.


If it turns out that the drives last 100,000 hours, plus or minus 10%, 
then you have a curve with a very steep slope.  It is still half dead at 
100,000, but maybe hardly any dead until 90,000, hardly any left alive at 
110,000.


OTOH, if the failures were evenly distributed throughout a life of 0 to 
200,000 hours, with the same number going every day, then that also would 
have a MTBF of 100,000.   In THAT case, then yes, the MTBF of first 
failure may well be 25,000.



They rarely work that way.  Often our devices will have what is sometimes 
called a "bathtub curve".  There are a few failures IMMEDIATELY ("infant 
mortality") falling off rapidly, and then few failures for quite a while, 
and then, as random parts start to wear out, the failures rise. 
In fact, with the same MTBF of 100,000, it could be that once the early 
demise ones are discarded, that the MTBF of the REMAINDER might be 
200,000.


IFF you are willing to deal with the DOA and infant mortality cases, then 
by discarding or ignoring those outlying numbers, you might get a more 
realistic evaluation of what to expect.
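
A quick Monte Carlo sketch of the above (Python; the three life distributions
are invented, but all share the same 100,000-hour mean), showing how "time
until the first of four drives dies" depends on the shape of the curve rather
than on the MTBF alone:

    import random

    MTBF = 100_000
    N_DRIVES = 4
    TRIALS = 100_000

    def mean_first_failure(lifetime):
        """Average hours until the first of N_DRIVES drives fails."""
        total = 0.0
        for _ in range(TRIALS):
            total += min(lifetime() for _ in range(N_DRIVES))
        return total / TRIALS

    # Three made-up life distributions, all with a 100,000-hour mean:
    uniform    = lambda: random.uniform(0, 2 * MTBF)     # evenly spread, 0..200k
    narrow     = lambda: random.gauss(MTBF, 0.1 * MTBF)  # the "100k +/- 10%" cliff
    memoryless = lambda: random.expovariate(1 / MTBF)    # constant hazard rate

    for name, dist in [("uniform 0-200k", uniform),
                       ("narrow +/-10% ", narrow),
                       ("exponential   ", memoryless)]:
        print(f"{name}: first of 4 dies after ~{mean_first_failure(dist):,.0f} h")
    # Roughly 40,000 h, 90,000 h and 25,000 h respectively -- same MTBF,
    # three very different answers to Richard's question.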



--
Grumpy Ol' Fred ci...@xenosoft.com


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Richard Pope via cctalk

Hello all,
I have been kind of following this thread. I have a question about 
MTBF. I have four HGST UltraStar Enterprise 2TB drives setup in a 
Hardware RAID 10 configuration. If the MTBF is 100,000 Hrs for each 
drive does this mean that the total MTBF is 25,000 Hrs?

GOD Bless and Thanks,
rich!

On 3/28/2018 6:33 AM, Paul Koning via cctalk wrote:



On Mar 27, 2018, at 8:51 PM, Fred Cisin via cctalk  
wrote:

Well outside my realm of expertise (as if I had a realm!), . . .

How many drives would you need, to be able to set up a RAID, or hot swappable 
RAUD (Redundant Array of Unreliable Drives), that could give decent reliability 
with such drives?

How many to be able to not have data loss if a second one dies before the first 
casualty is replaced?
How many to be able to avoid data loss if a third one dies before the first two 
are replaced?

These are straightforward questions of probability math, but it takes some time 
to get the details right.  For one thing, you need believable numbers for the 
underlying error probabilities.  And you have to analyze the cases carefully.

The basic assumption is that failures are "fail stop", i.e., a drive refuses to deliver 
data.  (In particular, it doesn't lie -- deliver wrong data.  You can build systems that deal with 
lying drives but RAID is not such a system.)  The failure may be the whole drive ("it's a 
door-stop") or individual blocks (hard read errors).

In either case, RAID-1 and RAID-5 handle single faults.  RAID-6 isn't a single 
well-defined thing but as normally defined it is a system that handles double 
faults.  So a RAID-1 system with a double fault may fail to give you your data. 
 (It may also be ok -- it depends on where the faults are.)  RAID-5 ditto.

The tricky part is what happens when a drive breaks.  Consider RAID-5 with a 
single dead drive, and the others are 100% ok.  Your data is still good.  When 
the broken drive is replaced, RAID rebuilds the bits that belong on that drive. 
 Once that rebuild finishes, you're once again fault tolerant.  But a second 
failure prior to rebuild completion means loss of data.

So one way to look at it: given the MTBF, calculate the probability of two 
drives failing within N hours (where N is the time required to replace the 
failed drive and then rebuild the data onto the new drive).  But that is not 
the whole story.

The other part of the story is that drives have a non-zero probability of a 
hard read error.  So during rebuild, you may encounter a sector on one of the 
remaining drives that can't be read.  If so, that sector is lost.

The probability of hard read error varies with drive technology.  And of 
course, the larger the drive, the greater the probability (all else being 
equal) of having SOME sector be unreadable.  For drives small enough to have 
PATA interfaces, the probability of hard read error is probably low enough that 
you can *usually* read the whole drive without error.  That translates to: 
RAID-1 and RAID-5 are generally adequate for PATA disks.

On the very large drives currently available, it's a different story, and the 
published drive specs make this quite clear.  This is why RAID-6 is much more 
popular now than it was earlier.  It isn't the probability of two nearly 
simultaneous drive failures, but rather the probability of a hard sector read 
error while a drive has failed, that argues for the use of RAID-6 in modern 
storage systems.

paul







Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Grant Taylor via cctalk

On 03/28/2018 12:32 PM, Fred Cisin via cctalk wrote:
With very unreliable drives, that isn't acceptable.  If each "drive" 
within the RAID were itself a RAID, . . .  Getting to be a complicated 
controller, or cascading controllers, . . .


Many of the SCSI / SAS RAID controllers that I've worked with over the 
last 10+ years have this cascading controller functionality.  Most of 
the RAID controllers that I've worked with would let you build a mirror 
or stripe across some sort of underlying RAID.  Typical examples are 
striping (RAID 0) across mirrors (RAID 1) or multiple RAID 5 arrays.


'course not.  Besides MTBF for calculating the probability of a second 
drive failing within N hours, must also consider other factors, such as 
external influences causing more than one drive to go, and the 
essentially non-linear aspect of a failure rate curve.
You also need to take into account the additional I/O load imposed on 
the remaining drives during a rebuild.


I used to routinely run into software (Solstice Disk Suite?) RAID 1 
mirrors on Solaris boxen for the OS (/) where different parts of each 
drive would fail.  So we'd end up with a situation where we had a decent 
RAID, but we couldn't replace either disk.  This usually involved taking 
an entire backup of the machine, replacing both disks, and restoring the 
data.




--
Grant. . . .
unix || die


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Peter Corlett via cctalk
On Wed, Mar 28, 2018 at 09:33:38AM -0400, Paul Koning via cctalk wrote:
[...]
> The basic assumption is that failures are "fail stop", i.e., a drive refuses
> to deliver data. (In particular, it doesn't lie -- deliver wrong data. You
> can build systems that deal with lying drives but RAID is not such a system.)
> The failure may be the whole drive ("it's a door-stop") or individual blocks
> (hard read errors).

The assumption that disks don't lie is demonstrably false, and anybody who
still designs or sells a system in 2018 which makes that assumption is a
charlatan. I have hardware which proves it.

Sun's ZFS filesystem applies an extra "trust but verify" layer of protection
using strong checksums. I have a server with a pair of mirrored 3TB enterprise
disks which are "zfs scrub"bed (surface-scanned and checksums verified) weekly.
Every few months, the scrub will hit a bad checksum which shows that the disk
read back different data to that which was written, even though the disk
claimed the read was OK. At best (and most likely) the problem was a single bit
flip, i.e. roughly a 1 in 1.8e13 error rate. So much for the manufacturer's
claim of less than 1 in 1e15 for that model of disk.

A workstation with a pair of 512GB consumer-grade SSDs has a half-dozen bad
stripes in every scrub performed after the machine has been powered down for a
week or so. The SSDs have just a few hundred hours on the clock and perhaps
three full drive writes. I love the performance of SSDs, but they are
appallingly unreliable for even medium-term storage.

Fortunately, ZFS can tell from the checksums which half of the mirror is lying,
and thus rewrite the stripe based on the known-good copy. It even handles the
case where both disks have some errors. Traditional RAID just cannot self-heal
like that.
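
The self-healing idea is easy to sketch (Python; this is only the concept, not
ZFS - block size, checksum choice and where the "stored" checksum lives are all
glossed over):

    import hashlib

    def sha256(block: bytes) -> str:
        return hashlib.sha256(block).hexdigest()

    def scrub_block(copy_a: bytes, copy_b: bytes, stored_checksum: str):
        """Return (good_block, repair_actions) for one mirrored block.

        The checksum was recorded when the block was written, so it tells us
        which side of the mirror is lying even when both reads 'succeed'."""
        a_ok = sha256(copy_a) == stored_checksum
        b_ok = sha256(copy_b) == stored_checksum
        if a_ok and b_ok:
            return copy_a, []
        if a_ok:
            return copy_a, ["rewrite copy B from A"]
        if b_ok:
            return copy_b, ["rewrite copy A from B"]
        raise IOError("both copies fail the checksum -- restore from backup")

    # Example: disk B silently flipped a bit but still reported a good read.
    good = b"important payroll data"
    bad  = b"important pagroll data"
    block, actions = scrub_block(good, bad, sha256(good))
    print(block, actions)   # b'important payroll data' ['rewrite copy B from A']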



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Paul Koning via cctalk


> On Mar 28, 2018, at 2:32 PM, Fred Cisin via cctalk  
> wrote:
> 
>>> How many drives would you need, to be able to set up a RAID, or hot 
>>> swappable RAUD (Redundant Array of Unreliable Drives), that could give 
>>> decent reliability with such drives?
>>> How many to be able to not have data loss if a second one dies before the 
>>> first casualty is replaced?
>>> How many to be able to avoid data loss if a third one dies before the first 
>>> two are replaced?
> 
> On Wed, 28 Mar 2018, Paul Koning wrote:
> ...
>> The basic assumption is that failures are "fail stop", i.e., a drive refuses 
>> to deliver data.  (In particular, it doesn't lie -- deliver wrong data.  You 
>> can build systems that deal with lying drives but RAID is not such a 
>> system.)  The failure may be the whole drive ("it's a door-stop") or 
>> individual blocks (hard read errors).
> 
> So, in addition to the "RAID" configuration, you would also need additional 
> redundancy to compare multiple reads for error detection.
> At the simplest level, if the reads don't match, then there is an error.
> If a retry produces different data, then that drive has an error.
> If two drives agree against a third, then there is a high probability that 
> the variant drive is in error.

If you don't trust drives to deliver correct data often enough, you need your 
own error detection.  Comparing redundant copies is possible.  More efficient 
is various EDC or ECC codes.  Some file systems used hashes like SHA-1 to 
detect data corruption with extremely high probability.

> ...
>> So one way to look at it: given the MTBF, calculate the probability of two 
>> drives failing within N hours (where N is the time required to replace the 
>> failed drive and then rebuild the data onto the new drive). But that is not 
>> the whole story.
> 
> 'course not.  Besides MTBF for calculating the probability of a second drive 
> failing within N hours, must also consider other factors, such as external 
> influences causing more than one drive to go, and the essentially non-linear 
> aspect of a failure rate curve.

Yes, RAID has an underlying assumption that drive failures are independent 
random events.  If that isn't valid then you have a big problem.  This 
occasionally happens; there have been drive enclosures with inadequate 
mechanical design, resulting in excessive vibration which caused rapid and 
correlated drive failure.  The answer to that is "test it properly and don't 
ship stuff like that".

>> The other part of the story is that drives have a non-zero probability of a 
>> hard read error.  So during rebuild, you may encounter a sector on one of 
>> the remaining drives that can't be read.  If so, that sector is lost.
> 
> If we consider that to be a "drive failure", then we are back to designing 
> around multiple failures.

Correct, and that is why RAID-6 is prevalent now that drives are large enough 
that there is a nontrivial chance of getting a sector read error if you read the 
whole drive (as RAID-1 rebuild does) and especially if you read multiple whole 
drives (as in RAID-5).

> 
>> The probability of hard read error varies with drive technology.  And of 
>> course, the larger the drive, the greater the probability (all else being 
>> equal) of having SOME sector be unreadable.  For drives small enough to have 
>> PATA interfaces, the probability of hard read error is probably low enough 
>> that you can *usually* read the whole drive without error.  That translates 
>> to: RAID-1 and RAID-5 are generally adequate for PATA disks.
> 
> "generally".
> The original thought behind this silly suggestion was whether it would be 
> possible to make use of MANY very unreliable drives.

Definitely.  You'd have to analyze the model just as I described.  If things 
are bad enough, you may find that RAID-6 is inadequate and you instead need a 
N-fault redundant code with N>2.  Such things are mathematically 
straightforward but compute intensive.  I've seen this done in the "Self-star" 
distributed storage research system at Carnegie-Mellon about a decade ago.  
Partly the reason was to deal with cheap unreliable devices, and partly was as 
an intellectual exercise "because we can".
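
For the curious, "mathematically straightforward" can be made concrete with a
toy double-erasure code (Python 3.8+; real RAID-6 implementations use
Galois-field arithmetic, this sketch cheats by working modulo the prime 257,
which is just big enough to hold a byte):

    P = 257   # smallest prime > 255, so every byte value survives the arithmetic

    def encode(data):
        """Compute two parity values over a list of byte values."""
        p = sum(data) % P
        q = sum((i + 1) * d for i, d in enumerate(data)) % P
        return p, q

    def recover_two(data, lost_a, lost_b, p, q):
        """Rebuild the two erased positions lost_a < lost_b (0-based)."""
        s1 = (p - sum(d for i, d in enumerate(data)
                      if i not in (lost_a, lost_b))) % P
        s2 = (q - sum((i + 1) * d for i, d in enumerate(data)
                      if i not in (lost_a, lost_b))) % P
        # Solve  da + db = s1  and  a*da + b*db = s2  (mod P),
        # where a and b are the 1-based positions of the lost blocks.
        a, b = lost_a + 1, lost_b + 1
        da = (s2 - b * s1) * pow((a - b) % P, -1, P) % P
        db = (s1 - da) % P
        return da, db

    blocks = [10, 200, 33, 7, 151]           # five "drives" worth of data
    p, q = encode(blocks)
    damaged = blocks[:]
    damaged[1] = damaged[3] = None           # lose two drives at once
    print(recover_two(damaged, 1, 3, p, q))  # -> (200, 7)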

paul



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Fred Cisin via cctalk

How many drives would you need, to be able to set up a RAID, or hot swappable 
RAUD (Redundant Array of Unreliable Drives), that could give decent reliability 
with such drives?
How many to be able to not have data loss if a second one dies before the first 
casualty is replaced?
How many to be able to avoid data loss if a third one dies before the first two 
are replaced?


On Wed, 28 Mar 2018, Paul Koning wrote:
These are straightforward questions of probability math, but it takes 
some time to get the details right.  For one thing, you need believable 
numbers for the underlying error probabilities.  And you have to analyze 
the cases carefully.


THANK YOU for the detailed explanation!

The basic assumption is that failures are "fail stop", i.e., a drive 
refuses to deliver data.  (In particular, it doesn't lie -- deliver 
wrong data.  You can build systems that deal with lying drives but RAID 
is not such a system.)  The failure may be the whole drive ("it's a 
door-stop") or individual blocks (hard read errors).


So, in addition to the "RAID" configuration, you would also need 
additional redundancy to compare multiple reads for error detection.

At the simplest level, if the reads don't match, then there is an error.
If a retry produces different data, then that drive has an error.
If two drives agree against a third, then there is a high probability that 
the variant drive is in error.


In either case, RAID-1 and RAID-5 handle single faults.  RAID-6 isn't a 
single well-defined thing but as normally defined it is a system that 
handles double faults.  So a RAID-1 system with a double fault may fail 
to give you your data.  (It may also be ok -- it depends on where the 
faults are.)  RAID-5 ditto.


The tricky part is what happens when a drive breaks.  Consider RAID-5 
with a single dead drive, and the others are 100% ok.  Your data is 
still good.  When the broken drive is replaced, RAID rebuilds the bits 
that belong on that drive.  Once that rebuild finishes, you're once 
again fault tolerant.  But a second failure prior to rebuild completion 
means loss of data.


With very unreliable drives, that isn't acceptable.
If each "drive" within the RAID were itself a RAID, . . .
Getting to be a complicated controller, or cascading controllers, . . .

So one way to look at it: given the MTBF, calculate the probability of 
two drives failing within N hours (where N is the time required to 
replace the failed drive and then rebuild the data onto the new drive). 
But that is not the whole story.


'course not.  Besides MTBF for calculating the probability of a second 
drive failing within N hours, must also consider other factors, such as 
external influences causing more than one drive to go, and the essentially 
non-linear aspect of a failure rate curve.


The other part of the story is that drives have a non-zero probability 
of a hard read error.  So during rebuild, you may encounter a sector on 
one of the remaining drives that can't be read.  If so, that sector is 
lost.


If we consider that to be a "drive failure", then we are back to designing 
around multiple failures.


The probability of hard read error varies with drive technology.  And of 
course, the larger the drive, the greater the probability (all else 
being equal) of having SOME sector be unreadable.  For drives small 
enough to have PATA interfaces, the probability of hard read error is 
probably low enough that you can *usually* read the whole drive without 
error.  That translates to: RAID-1 and RAID-5 are generally adequate for 
PATA disks.


"generally".
The original thought behind this silly suggestion was whether it would be 
possible to make use of MANY very unreliable drives.


On the very large drives currently available, it's a different story, 
and the published drive specs make this quite clear.  This is why RAID-6 
is much more popular now than it was earlier.  It isn't the probability 
of two nearly simultaneous drive failures, but rather the probability of 
a hard sector read error while a drive has failed, that argues for the 
use of RAID-6 in modern storage systems.


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Chuck Guzis via cctalk
On 03/28/2018 10:17 AM, Ethan via cctalk wrote:
>> I know of no RAID setup that can save me from stupid.
> 
> I use rsync. I manually rsync the working disks to the backup disks
> every week or two. Working disks have the shares to other hosts. If
> something happens to that data, deleted by accident or encrypted by
> malware. Meh.

I don't even use rsync.   I have a duplicate system just off to my left
installed with the same software.   Every time I reach a "good enough"
point, I simply do a "tar czf ..." on my work area and ftp the resulting
code to the other machine.  i.e., it's basically a "rolling" backup.
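
That sort of rolling backup is easy to script; a minimal sketch in Python
(hostname, login and paths are placeholders, and plain FTP is only defensible
on a trusted private network):

    import tarfile, time
    from ftplib import FTP

    WORK_DIR = "/home/me/work"      # placeholder work area
    archive = f"work-{time.strftime('%Y%m%d-%H%M%S')}.tar.gz"

    # The "tar czf" step: pack the work area into a compressed archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(WORK_DIR, arcname="work")

    # The ftp step: push the archive to the second machine.
    ftp = FTP("backup-box.local")             # placeholder hostname
    ftp.login("backupuser", "secret")         # placeholder credentials
    with open(archive, "rb") as f:
        ftp.storbinary(f"STOR {archive}", f)
    ftp.quit()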

The second system is always one set of system updates behind the current
one, which provides some insurance.

Should disaster strike, I need only use my other system with the latest
backup and I'm back up and running in a matter of minutes.

But that still doesn't save me from stupid.  A second cup of coffee
could do that...


--Chuck



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Toby Thain via cctalk
On 2018-03-28 1:17 PM, Ethan via cctalk wrote:
>> I know of no RAID setup that can save me from stupid.
> 
> I use rsync. I manually rsync the working disks to the backup disks
> every week or two. Working disks have the shares to other hosts. If
> something happens to that data, deleted by accident or encrypted by
> malware. Meh.
> 
> Hardware like netapp and maybe filesystems in open source have those
> awesome snapshot systems where there is a directory tree that has past
> versions of the data. A directory of 15 minutes ago, one of 6 hours ago, etc
> is what we had setup at a prior gig.
> 

Yeah was wondering when the discussion might bring up ZFS...

> 
> -- 
> : Ethan O'Toole
> 
> 
> 



Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Grant Taylor via cctalk

On 03/28/2018 11:51 AM, David Brownlee via cctalk wrote:
A step up from rsync can be dirvish - it uses rsync, but before each 
backup it creates a hardlink tree of the previous backup, then rsyncs 
over it. The net effect is you only pay the block cost of one copy of 
unchanged files, plus an inode per copy. Can be very handy


I've used something like that with great success.  -  This is commonly 
known as "Single Instance Store".


The only down side is that there is a single copy of the file, and if 
something happens to it, the entire backup set is impacted.


This is easy to get around by periodically creating a new backup set 
that future days are hard linked against.  (Play with the numbers to 
match your preference.)




--
Grant. . . .
unix || die


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Grant Taylor via cctalk

On 03/28/2018 11:17 AM, Paul Berger via cctalk wrote:
You mean something like someone who writes a script that does blind cd 
to the directory and then proceeds to delete the contents?


This is one of the primary reasons that I prefer to see the full path 
specified on the rm command.




--
Grant. . . .
unix || die


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread David Brownlee via cctalk
On 28 March 2018 at 18:17, Ethan via cctalk  wrote:

> I know of no RAID setup that can save me from stupid.
>>
>
> I use rsync. I manually rsync the working disks to the backup disks every
> week or two. Working disks have the shares to other hosts. If something
> happens to that data, deleted by accident or encrypted by malware. Meh.
>
> Hardware like netapp and maybe filesystems in open source have those
> awesome snapshot systems where there is a directory tree that has past
> versions of the data. A directory of 15 minutes ago, one of 6 hours ago, etc is
> what we had setup at a prior gig.
>

Possible helpful note on the off chance anyone wasn't already aware
(apologies for those who already know)

A step up from rsync can be dirvish - it uses rsync, but before each backup
it creates a hardlink tree of the previous backup, then rsyncs over it. The
net effect is you only pay the block cost of one copy of unchanged files,
plus an inode per copy. Can be very handy :)
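
For anyone curious what the hardlink trick looks like stripped to its
essentials, a rough sketch in Python (real tools like dirvish also handle
deletions, permissions, sparse files and so on; the paths are invented):

    import filecmp, os, shutil

    def snapshot(source, previous, target):
        """Make `target` mirror `source`, hardlinking unchanged files against
        the `previous` snapshot so they cost no extra data blocks."""
        for root, _dirs, files in os.walk(source):
            rel = os.path.relpath(root, source)
            os.makedirs(os.path.join(target, rel), exist_ok=True)
            for name in files:
                src = os.path.join(root, name)
                old = os.path.join(previous, rel, name)
                dst = os.path.join(target, rel, name)
                if os.path.exists(old) and filecmp.cmp(src, old, shallow=False):
                    os.link(old, dst)        # unchanged: one inode, no new blocks
                else:
                    shutil.copy2(src, dst)   # changed or new: pay for a real copy

    # e.g. snapshot("/data", "/backups/2018-03-27", "/backups/2018-03-28")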

David


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Paul Berger via cctalk



On 2018-03-28 2:09 PM, Ali via cctalk wrote:
 



 Original message 
From: Chuck Guzis via cctalk <cctalk@classiccmp.org>
Date: 3/28/18  10:02 AM  (GMT-08:00)
To: Paul Koning via cctalk <cctalk@classiccmp.org>
Subject: Re: RAID? Was: PATA hard disks, anyone?


I know of no RAID setup that can save me from stupid.

Chuck,

As we say in my day job "there is no curing stupid" ;)
-Ali
You mean something like someone who writes a script that does blind cd 
to the directory and then proceeds to delete the contents? Customer's 
system started crashing at noon every day after they did an OS update 
that eliminated the target directory.  The script was kicked off by cron, 
and the cd would fail; since they were not checking whether it succeeded 
or whether they ended up in the right place, the script proceeded to delete 
everything from its current directory, which unfortunately was where all 
the libraries lived... oops...


Paul.


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Ethan via cctalk

I know of no RAID setup that can save me from stupid.


I use rsync. I manually rsync the working disks to the backup disks every 
week or two. Working disks have the shares to other hosts. If something 
happens to that data, deleted by accident or encrypted by malware. Meh.


Hardware like netapp and maybe filesystems in open source have those 
awesome snapshot systems where there is a directory tree that has past 
versions of the data. A directory of 15 minutes ago, one of 6 hours ago, etc 
is what we had setup at a prior gig.



--
: Ethan O'Toole




Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Ali via cctalk




 Original message 
From: Chuck Guzis via cctalk <cctalk@classiccmp.org> 
Date: 3/28/18  10:02 AM  (GMT-08:00) 
To: Paul Koning via cctalk <cctalk@classiccmp.org> 
Subject: Re: RAID? Was: PATA hard disks, anyone? 

>I know of no RAID setup that can save me from stupid.

Chuck,

As we say in my day job "there is no curing stupid" ;)
-Ali

Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Chuck Guzis via cctalk
On 03/28/2018 06:33 AM, Paul Koning via cctalk wrote:

> These are straightforward questions of probability math, but it takes
> some time to get the details right.  For one thing, you need
> believable numbers for the underlying error probabilities.  And you
> have to analyze the cases carefully.

A great discussion, Paul!

My biggest danger to data loss is--me.  I've modified my working
procedures as much as possible and keep backups, but every once in a
while, I'll do something stupid, such as "rm -rf *", in the mistaken
assumption that the system I'm on contains a .bashrc file with "alias
rm='rm -i'" in it somewhere.

I know of no RAID setup that can save me from stupid.

--Chuck


Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Paul Koning via cctalk


> On Mar 27, 2018, at 8:51 PM, Fred Cisin via cctalk  
> wrote:
> 
> Well outside my realm of expertise (as if I had a realm!), . . .
> 
> How many drives would you need, to be able to set up a RAID, or hot swappable 
> RAUD (Redundant Array of Unreliable Drives), that could give decent 
> reliability with such drives?
> 
> How many to be able to not have data loss if a second one dies before the 
> first casualty is replaced?
> How many to be able to avoid data loss if a third one dies before the first 
> two are replaced?

These are straightforward questions of probability math, but it takes some time 
to get the details right.  For one thing, you need believable numbers for the 
underlying error probabilities.  And you have to analyze the cases carefully.

The basic assumption is that failures are "fail stop", i.e., a drive refuses to 
deliver data.  (In particular, it doesn't lie -- deliver wrong data.  You can 
build systems that deal with lying drives but RAID is not such a system.)  The 
failure may be the whole drive ("it's a door-stop") or individual blocks (hard 
read errors).

In either case, RAID-1 and RAID-5 handle single faults.  RAID-6 isn't a single 
well-defined thing but as normally defined it is a system that handles double 
faults.  So a RAID-1 system with a double fault may fail to give you your data. 
 (It may also be ok -- it depends on where the faults are.)  RAID-5 ditto.

The tricky part is what happens when a drive breaks.  Consider RAID-5 with a 
single dead drive, and the others are 100% ok.  Your data is still good.  When 
the broken drive is replaced, RAID rebuilds the bits that belong on that drive. 
 Once that rebuild finishes, you're once again fault tolerant.  But a second 
failure prior to rebuild completion means loss of data.

So one way to look at it: given the MTBF, calculate the probability of two 
drives failing within N hours (where N is the time required to replace the 
failed drive and then rebuild the data onto the new drive).  But that is not 
the whole story.

The other part of the story is that drives have a non-zero probability of a 
hard read error.  So during rebuild, you may encounter a sector on one of the 
remaining drives that can't be read.  If so, that sector is lost.

The probability of hard read error varies with drive technology.  And of 
course, the larger the drive, the greater the probability (all else being 
equal) of having SOME sector be unreadable.  For drives small enough to have 
PATA interfaces, the probability of hard read error is probably low enough that 
you can *usually* read the whole drive without error.  That translates to: 
RAID-1 and RAID-5 are generally adequate for PATA disks.

On the very large drives currently available, it's a different story, and the 
published drive specs make this quite clear.  This is why RAID-6 is much more 
popular now than it was earlier.  It isn't the probability of two nearly 
simultaneous drive failures, but rather the probability of a hard sector read 
error while a drive has failed, that argues for the use of RAID-6 in modern 
storage systems.
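
A small sketch of that last point in Python - the chance of hitting at least
one unrecoverable read error (URE) while reading a whole surviving drive
during a rebuild, using the commonly quoted rate of one error per 1e14 bits
(the drive sizes are just examples):

    import math

    def p_ure_during_full_read(drive_bytes, ber=1e-14):
        """P(at least one unrecoverable read error) when reading the whole
        drive, treating each bit as an independent trial at rate `ber`."""
        bits = drive_bytes * 8
        return -math.expm1(bits * math.log1p(-ber))

    for label, size in [("120 GB PATA drive ", 120e9),
                        ("10 TB modern drive", 10e12)]:
        print(f"{label}: {p_ure_during_full_read(size):.1%}")
    # Roughly 1% for the small PATA drive versus ~55% for the 10 TB drive,
    # which is exactly why tolerating a read error *while* a drive is already
    # dead (RAID-6) matters so much more on today's big disks.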

paul




Re: RAID? Was: PATA hard disks, anyone?

2018-03-28 Thread Liam Proven via cctalk
On 28 March 2018 at 02:51, Fred Cisin via cctalk  wrote:
> Well outside my realm of expertise (as if I had a realm!), . . .
>
> How many drives would you need, to be able to set up a RAID, or hot
> swappable RAUD (Redundant Array of Unreliable Drives), that could give
> decent reliability with such drives?

For what little it's worth, and little relevance, I had a hardware
RAID of 6 × 80GB UltraIDE drives that ran faultlessly for several
years. Not full-time, to be fair, but surviving multiple power cycles.

In an old HP Proliant ML110 G1 -- space-heater Pentium 4 version --
with only about 2GB of RAM (because it used some kind of weird
expensive ECC RAM) and a hacked copy of Windows Server 2008.

It was a box made from free leftovers and proved to be one of my most
reliable workhorse PCs ever. Figures, really.

-- 
Liam Proven • Profile: https://about.me/liamproven
Email: lpro...@cix.co.uk • Google Mail/Hangouts/Plus: lpro...@gmail.com
Twitter/Facebook/Flickr: lproven • Skype/LinkedIn: liamproven
UK: +44 7939-087884 • ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053


Re: RAID? Was: PATA hard disks, anyone?

2018-03-27 Thread Paul Berger via cctalk



On 2018-03-27 10:05 PM, Ali via cctalk wrote:
 



 Original message 
From: Fred Cisin via cctalk 
Date: 3/27/18  5:51 PM  (GMT-08:00)
To: "General Discussion: On-Topic and Off-Topic Posts" 
Subject: RAID? Was: PATA hard disks, anyone?

How many drives would you need, to be able to set up a RAID, or hot
swappable RAUD (Redundant Array of Unreliable Drives), that could give
decent reliability with such drives?
10 -
Two sets of 5 drive  RAID 6 volumes in a RAID 1 array.
You would then need to lose 5 drives before data failure is imminent. The 6th 
one will do you in. If you haven't fixed 50 percent failure then you deserve to 
lose your data.
Disclaimer: this is my totally unscientific unprofessional and biased estimate. 
My daily activities of life have nothing to do with the IT industry. Proceed at 
your own peril. Etc. Etc.
-Ali


To meet Fred's original criteria you would only need 4 to create a 
minimal RAID 6 array.  In theory a RAID 1 array (mirrored) of 4 or more 
disks could also survive a second disk failure as long as one copy of 
each of the pairs in the array survives, but you are starting to play 
the odds, and I know of some cases where people have lost data. You can 
improve the odds by having a hot spare that automatically takes over 
for a failed disk.  One of the most important things is that the array 
manager has to have some way of notifying you that there has been a 
failure so that you can take action; however, my observation as a 
hardware support person is that even when there is error notification 
it is often missed or ignored until subsequent failures kill off the 
array.   It also appears to be a fairly common notion that if you have 
RAID there is no need to ever back up, but I assure you RAID is not 
foolproof and arrays do fail.   One of the big problems facing the use 
of large disks to build arrays is that the number of accesses just to 
build the array may put a serious dent in the specced number of 
accesses before error or in some cases even exceed it.


Paul.


RE: RAID? Was: PATA hard disks, anyone?

2018-03-27 Thread Ali via cctalk




 Original message 
From: Fred Cisin via cctalk  
Date: 3/27/18  5:51 PM  (GMT-08:00) 
To: "General Discussion: On-Topic and Off-Topic Posts"  
Subject: RAID? Was: PATA hard disks, anyone? 

How many drives would you need, to be able to set up a RAID, or hot 
swappable RAUD (Redundant Array of Unreliable Drives), that could give 
decent reliability with such drives?
10 -
Two sets of 5 drive  RAID 6 volumes in a RAID 1 array.
You would then need to lose 5 drives before data failure is imminent. The 6th 
one will do you in. If you haven't fixed 50 percent failure then you deserve to 
lose your data. 
Disclaimer: this is my totally unscientific unprofessional and biased estimate. 
My daily activities of life have nothing to do with the IT industry. Proceed at 
your own peril. Etc. Etc.
-Ali