On 2016-11-30 00:38, Roman Mamedov wrote:
On Wed, 30 Nov 2016 00:16:48 +0100
Wilson Meier <wilson.me...@gmail.com> wrote:

That said, btrfs shouldn't be used for anything other than raid1, as every other
raid level has serious problems or at least doesn't behave as the expected
raid level (in terms of failure recovery).

RAID1 shouldn't be used either:

*) Read performance is not optimized: all metadata is always read from the
first device unless it has failed, and data reads are supposedly balanced between
devices based on the PID of the reading process. Better implementations dispatch
each read to whichever device is currently idle.
Based on what I've seen, the metadata reads get balanced too.

As far as the read balancing in general, while it doesn't work very well for a single process, if you have a large number of processes started sequentially (for example, a thread-pool based server), it actually works out to being near optimal with a lot less logic than DM and MD use. Aggregated over an entire system it's usually near optimal as well.
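To make that behaviour a bit more concrete, here is a minimal Python sketch of the two strategies being compared (the names, numbers, and structure are made up for illustration; this is not the kernel code):

NUM_MIRRORS = 2

def pick_mirror_by_pid(pid):
    # btrfs raid1 style: the mirror depends only on the reader's PID, so a
    # single process always reads from the same device, but many processes
    # spread out roughly evenly.
    return pid % NUM_MIRRORS

def pick_idle_mirror(inflight):
    # MD/DM style: dispatch each request to the device with the fewest
    # requests currently in flight.
    return min(range(NUM_MIRRORS), key=lambda d: inflight[d])

if __name__ == "__main__":
    pids = [1000 + i for i in range(8)]   # e.g. a pool of worker processes
    counts = [0] * NUM_MIRRORS
    for pid in pids:
        for _ in range(100):              # 100 reads per worker
            counts[pick_mirror_by_pid(pid)] += 1
    print("reads per mirror with per-PID balancing:", counts)  # roughly even

With a single PID every read in this sketch lands on the same mirror, which is the "doesn't work very well for a single process" case; with a pool of workers the split ends up close to even, which is the near-optimal aggregate behaviour described above.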

*) Write performance is not optimized: during long, full-bandwidth sequential
writes it is common to see the devices writing not in parallel, but with long
periods of just one device writing, then the other. (Admittedly, it has been some
time since I tested that.)
I've never seen this be an issue in practice, especially if you're using transparent compression (which caps extent size, and therefore the I/O size to a given device, at 128k). I'm also sane enough that I'm not doing bulk streaming writes to traditional HDDs or fully saturating the bandwidth on my SSDs (you should be over-provisioning whenever possible). For a desktop user, unless you're doing real-time video recording at higher than HD resolution with high-quality surround sound, this probably isn't going to hit you. Even then, you should be recording to a temporary location with much faster write speeds (tmpfs, or ext4 without a journal, for example), because you'll likely get hit with fragmentation otherwise.

This also has a pretty low overall impact compared to a number of other things that BTRFS does (BTRFS on a single disk with the single profile for everything, versus two of the same disks with the raid1 profile for everything, shows less than a 20% performance difference in all the testing I've done).
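To illustrate the extent-size point from the compression comment above, here is a rough Python sketch (the 128 KiB figure is the per-extent cap on uncompressed data when btrfs compression is enabled; the function itself is purely illustrative):

COMPRESSED_EXTENT_LIMIT = 128 * 1024   # 128 KiB of data per compressed extent

def split_into_extents(write_size):
    # With compression enabled, a large sequential write is chopped into
    # extents of at most 128 KiB, so no single device ever sees one huge I/O.
    extents = []
    remaining = write_size
    while remaining > 0:
        chunk = min(remaining, COMPRESSED_EXTENT_LIMIT)
        extents.append(chunk)
        remaining -= chunk
    return extents

if __name__ == "__main__":
    one_gib = 1024 ** 3
    print("1 GiB write ->", len(split_into_extents(one_gib)), "extents")  # 8192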

*) A degraded RAID1 won't mount by default.

If this is the root filesystem, the machine won't boot.

To mount it, you need to add the "degraded" mount option.
However, you have exactly a single chance at that: you MUST restore the RAID to
a non-degraded state while it's mounted during that session, since it won't ever
mount again in r/w+degraded mode, and in r/o mode you can't perform any
operations on the filesystem, including adding/removing devices.
There is a fix pending for the single-chance-to-mount-degraded issue, and even as things stand, it only applies to a 2-device raid1 array (with more devices, new chunks are still raid1 when you're missing one device, so the checks that would refuse the mount never trigger).
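For what it's worth, the refusal being described can be sketched roughly like this in Python; it's a simplified, whole-filesystem view of the check, not the actual btrfs code, though the per-profile tolerances are the usual ones:

# How many missing devices each chunk profile can tolerate.
PROFILE_TOLERANCE = {
    "single": 0,
    "dup": 0,
    "raid0": 0,
    "raid1": 1,
    "raid10": 1,
}

def can_mount_rw_degraded(chunk_profiles, missing_devices):
    # Allow a degraded read-write mount only if every chunk profile present
    # on the filesystem can tolerate that many missing devices.
    return all(PROFILE_TOLERANCE[p] >= missing_devices for p in chunk_profiles)

if __name__ == "__main__":
    # A 2-device raid1 with one device missing can mount degraded once...
    print(can_mount_rw_degraded({"raid1"}, missing_devices=1))            # True
    # ...but after that first degraded rw mount has written 'single' chunks,
    # the filesystem as a whole no longer tolerates a missing device.
    print(can_mount_rw_degraded({"raid1", "single"}, missing_devices=1))  # False

As I understand it, the pending fix effectively moves this to a per-chunk check, so 'single' chunks that live entirely on the surviving device no longer block the mount.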

As far as not mounting degraded by default, that's a conscious design choice that isn't going to change. There's a switch (adding 'degraded' to the mount options) to enable this behavior per mount, so we're still on par with LVM and MD in that respect; we just picked a different default. In this case, I actually feel it's the better default for most cases, because most regular users aren't doing exhaustive monitoring and thus aren't likely to notice the filesystem being mounted degraded until it's far too late. If the filesystem is degraded, then _something_ has happened that the user needs to know about, and until some sane monitoring solution is implemented, the easiest way to ensure this is to refuse to mount.

*) It does not properly handle a device disappearing during operation. (There
is a patchset to add that).

*) It does not properly handle said device returning (under a
different /dev/sdX name, for bonus points).
Neither of these is an easy problem to fix completely, especially considering that the device is currently guaranteed to reappear under a different name, because BTRFS will still have an open reference on the original device name.

On top of that, if you've got hardware that's doing this without manual intervention, you've got much bigger issues than how BTRFS reacts to it. No correctly working hardware should be doing this.

Most of these also apply to all other RAID levels.
