Would it be reasonably accurate to say "btrfs' RAID5 implementation is
likely working well enough and safe enough if you are backing up
regularly and are willing and able to restore from backup if a device
failure goes horribly wrong", then?
This is a reasonably serious question. My typical scenario runs along
the lines of two identical machines with regular filesystem replication
between them; in the event of something going horribly wrong with the
production machine, I just spin up services on the replicated machine -
making it "production" - and then deal with the broken one at relative
leisure.
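
(For concreteness, that sort of replication can be as simple as
periodic btrfs send/receive of read-only snapshots, something like the
following sketch - the "standby" hostname and /data paths are only
illustrative, not my exact setup:

  # on the production machine: take a read-only snapshot
  btrfs subvolume snapshot -r /data /data/.snapshots/2014-01-21

  # initial full send to the other machine...
  btrfs send /data/.snapshots/2014-01-21 | \
      ssh standby btrfs receive /data/.snapshots

  # ...then incremental sends against the previous snapshot
  btrfs send -p /data/.snapshots/2014-01-20 \
      /data/.snapshots/2014-01-21 | \
      ssh standby btrfs receive /data/.snapshots

Failing over is then mostly a matter of starting services against the
newest snapshot that made it to the standby.)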
If the worst thing wrong with RAID5/6 in current btrfs is "might not
deal as well as you'd like with a really nasty example of single-drive
failure", that would likely be livable for me.
On 01/21/2014 12:08 PM, Duncan wrote:
> What you're missing is that device death and replacement rarely happen
> as neatly as in your test (clean unmounts and all, no middle-of-process
> power loss, etc.). You tested the best case, not real life or the worst
> case.
>
> Try that again: set up the raid5, start a big write to it, disconnect
> one device in the middle of that write (I'm not sure whether just
> dropping the loop works or whether the kernel gracefully shuts down the
> loop device), then unplug the system without unmounting... and /then/
> see what sense btrfs can make of the resulting mess. In theory, with an
> atomic-write btree filesystem such as btrfs, even that should work
> fine, minus perhaps the last few seconds of file-write activity, but
> the filesystem should remain consistent through degraded remount and
> device add, device remove, and rebalance, even if another power pull
> happens in the middle of /that/.
>
> But given btrfs' raid5 incompleteness, I don't expect that will work.
>
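
For reference, the sort of test described above might look roughly like
the following on loop devices (sizes, paths and device names are
arbitrary, and the power pull itself obviously can't be scripted):

  # back three loop devices with sparse files and make a raid5 fs
  truncate -s 4G /var/tmp/d0.img /var/tmp/d1.img /var/tmp/d2.img
  losetup /dev/loop0 /var/tmp/d0.img
  losetup /dev/loop1 /var/tmp/d1.img
  losetup /dev/loop2 /var/tmp/d2.img
  mkfs.btrfs -d raid5 -m raid5 /dev/loop0 /dev/loop1 /dev/loop2
  mount /dev/loop0 /mnt/test

  # start a big write, then yank one device while it's in flight
  dd if=/dev/zero of=/mnt/test/big bs=1M count=8192 &
  losetup -d /dev/loop2    # may or may not behave like a real disconnect

  # pull the plug here without unmounting; after reboot, re-attach the
  # two surviving loop devices and see what btrfs makes of the mess
  losetup /dev/loop0 /var/tmp/d0.img
  losetup /dev/loop1 /var/tmp/d1.img
  mount -o degraded /dev/loop0 /mnt/test

  # then the rebuild: add a fresh device, drop the missing one, rebalance
  truncate -s 4G /var/tmp/d3.img
  losetup /dev/loop3 /var/tmp/d3.img
  btrfs device add /dev/loop3 /mnt/test
  btrfs device delete missing /mnt/test
  btrfs balance start /mnt/test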