On 26 Jan 2024 16:39 +0100, from h...@adminart.net (hw):
>> RAID is for uptime.
> 
> It's also for saving you from the hassle involved with losing data
> when a disk fails.

Which translates to more quickly fully recovering from the loss of a
storage device.

When used for redundancy and staying within the redundancy threshold
of your setup, _done right_, RAID reduces unplanned downtime to zero
in case of loss of a storage device. Depending on your setup, the
replacement can either be made online or planned for, and once
complete, (hopefully) no data whatsoever has been lost and everything
has happened at minimal time cost. I'm assuming that _the physical
act_ of replacing the storage device is similar regardless of how it
is configured in software: the same number of screws, clamps or
similar need removing and reinstalling; the same amount of time is
required to physically move the old and the new storage device; etc.


>> If a week-long outage (to get replacement hardware and restore the
>> most recent backup) and a day's worth of data loss is largely
>> inconsequential, as quite frankly it likely is for most home users
>> save for the cost of replacement hardware, that's a very different
>> scenario from if that same outage costs $$€€¥¥ and could destroy
>> your livelihood; and consequently the choices made _should_ likely
>> be different.
> 
> That's assuming your time isn't worth anything and ignores whatever
> the loss of data may cost you.  If that isn't relevant to you, you
> don't need backups, either.

Sure, you can tune the RPO (recovery point objective) for your backup
solution as needed. If you need an RPO of five minutes after
catastrophic storage failure (meaning that you lose no more than five
minutes of data regardless of when failure happens), then you're going
to make different choices than if a 24-48 hour RPO is fine. But I'm
willing to say that _most_ home users can recover from a loss of a
day's worth of changes to their data without that being a major blow
to whatever they are doing; and if the user does something important,
nothing prevents running an extra backup to capture those changes.
(I've done that myself on occasion.)

Similarly, you are going to make different choices based on your RTO
(recovery time objective). RAID gives you essentially zero RTO as long
as you retain sufficient redundancy, but _not_ past that. Once your
storage drops below whatever level of redundancy you have, you're
looking at rebuilding from backups, at which point backup frequency
and backup restoration time are the minimums which will dictate your
RPO and RTO respectively. So even if you have RAID, you need to know
what your target RPO and RTO are, because again those
values are going to be a major factor in the design of your backup
regimen.
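
To put rough numbers on that, here is a minimal Python sketch. The
function name and the example figures are mine and purely illustrative,
and the model is deliberately simplistic: within the redundancy
threshold a device loss costs nothing, and past it you lose everything
since the last backup and are down until hardware is replaced and the
restore has finished.

```python
def recovery_floor(backup_interval_h, replace_hw_h, restore_h, redundancy_intact):
    """Return (worst-case RPO, minimum RTO) in hours under a simplified model."""
    if redundancy_intact:
        # Still within the redundancy threshold: no data lost, no downtime.
        return (0, 0)
    # Past the threshold: everything since the last backup may be gone,
    # and you are down until hardware is replaced and the restore finishes.
    return (backup_interval_h, replace_hw_h + restore_h)


# Hypothetical figures: daily backups, six days to get replacement
# hardware, one day to restore.
print(recovery_floor(24, 144, 24, redundancy_intact=True))   # (0, 0)
print(recovery_floor(24, 144, 24, redundancy_intact=False))  # (24, 168)
```

The second case is the week-long outage with a day's worth of data loss
from earlier in the thread; whether that is acceptable is exactly the
RPO/RTO decision.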


>> _Mirrored backups_ makes very little sense to me. [...]
> 
> Having multiple generations of backups already increases the needed
> storage space by a bit more than half.  That makes it already arguable
> if it's better to make (multiple generations of) backups on a single
> RAID or on N single disks.  Any of the disks can fail at any time.  If
> you go with N == 2, a RAID (with multiple generations of backups on
> it) can be better because when a disk fails, the RAID will very likely
> survive and the non-RAID may not.

I'm not sure how you figure that. To survive the loss of N > 0 storage
devices within a set, a storage solution needs to have a raw capacity
greater than the usable capacity. An illustrative case would be a
three-way mirror: it can survive the loss of two out of three storage
devices, but only has the usable storage capacity of a single one
(typically the smallest) of the devices. It's not really any different
from how a commodity HDD or SSD probably won't survive the failure of
one of its platters or flash chips respectively, even ignoring cascade
effects of either the failure or the cause of failure.
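
As a small sketch of the mirror case in Python (the function name and
the 4 TB device sizes are made up for illustration, and this only
models a plain n-way mirror, not parity-based layouts):

```python
def mirror_capacity(device_sizes_tb):
    """Raw capacity, usable capacity and tolerated device losses for an n-way mirror."""
    raw = sum(device_sizes_tb)
    usable = min(device_sizes_tb)          # limited by the smallest member
    tolerated = len(device_sizes_tb) - 1   # one surviving copy is enough
    return raw, usable, tolerated


print(mirror_capacity([4, 4, 4]))  # (12, 4, 2)
print(mirror_capacity([4]))        # (4, 4, 0)
```

Only in the single-device case, where no loss can be tolerated, does
raw capacity equal usable capacity.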

How much extra storage space you need to keep multiple generations of
backups is going to depend a lot on the rate of churn of your dataset.
Again as an illustrative example only, suppose you have 1 TB of data
and modify 1 MB per day, and make backups daily; with two devices each
capable of storing 2 TB you can have two full copies (one per device)
plus 1M days' worth of history on each. This is irrespective of how
those storage devices themselves are furnished.
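
The arithmetic behind that figure, as a short Python sketch. It assumes
decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that each
daily increment stores only the changed bytes, with no metadata
overhead, compression or deduplication; the function name is mine.

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def history_days(device_bytes, dataset_bytes, churn_bytes_per_day):
    """Days of daily increments that fit next to one full copy on one device."""
    free = device_bytes - dataset_bytes
    return free // churn_bytes_per_day


print(history_days(2 * TB, 1 * TB, 1 * MB))  # 1000000
```

With a higher rate of churn, say 1 GB per day, the same device would
hold 1000 days of history instead.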

> Trying to make things appear easier by pointing out that failed disks
> can be replaced is not helpful.

It's a _backup_. _By definition_, a backup is only critical once the
primary copy becomes inaccessible for some reason. Hence:

>> The only time when something
>> like mirrored backups will help you is when you have only one backup
>> set, the backup itself works fine, but a backup drive fails, _and_ the
>> source fails before you've been able to make a new backup.

For a primary copy, _of course_ the calculus is different.

-- 
Michael Kjörling                     🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
