On 2016-11-30 00:38, Roman Mamedov wrote:
> On Wed, 30 Nov 2016 00:16:48 +0100
> Wilson Meier <wilson.me...@gmail.com> wrote:
That said, btrfs shouldn't be used for other then raid1 as every other
raid level has serious problems or at least doesn't work as the expected
raid level (in terms of failure recovery).
> RAID1 shouldn't be used either:
> *) Read performance is not optimized: all metadata is always read from the
> first device unless it has failed, and data reads are supposedly balanced
> between devices based on the PID of the reading process. Better
> implementations dispatch each read to whichever device is currently idle.
Based on what I've seen, the metadata reads get balanced too.

As far as read balancing in general: while it doesn't work very well for
single processes, if you have a large number of processes started
sequentially (for example, a thread-pool based server), it actually works
out to being near optimal, with a lot less logic than DM and MD have.
Aggregated over an entire system it's usually near optimal as well.
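To make the contrast concrete, here is a minimal Python sketch of the two
policies being compared, assuming a two-copy array. The helper names and the
idle-tracking model are my illustrative assumptions, not btrfs or MD code;
the only btrfs-specific behavior modeled is choosing the mirror from the
reader's PID.

```python
# Illustrative sketch (not kernel code): PID-based mirror selection
# versus idle-based dispatch for a 2-copy RAID1.
import random

NUM_MIRRORS = 2

def pick_mirror_by_pid(pid: int) -> int:
    """btrfs-style policy: a given process always reads from
    pid % num_mirrors, so a single process never spreads its reads."""
    return pid % NUM_MIRRORS

def pick_idle_mirror(busy: list[bool]) -> int:
    """MD/DM-style policy (simplified): prefer a device that is idle now."""
    for dev, is_busy in enumerate(busy):
        if not is_busy:
            return dev
    return random.randrange(NUM_MIRRORS)  # all busy: fall back to random

# One process hammers a single device under the PID policy...
print([pick_mirror_by_pid(1234) for _ in range(4)])            # [0, 0, 0, 0]
# ...but many sequentially started processes spread out near-optimally.
print([pick_mirror_by_pid(pid) for pid in range(1000, 1006)])  # alternates
```

Run with many concurrent readers and the PID policy approaches an even
split, which is the "aggregated over an entire system" observation above.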
> *) Write performance is not optimized: during long full-bandwidth sequential
> writes it is common to see devices writing not in parallel, but with long
> periods of just one device writing, then the other. (Admittedly, it has been
> some time since I tested that.)
I've never seen this be an issue in practice, especially if you're using
transparent compression (which caps extent size, and therefore the I/O size
to a given device, at 128k). I'm also sane enough that I'm not doing bulk
streaming writes to traditional HDDs or fully saturating the bandwidth on my
SSDs (you should be over-provisioning whenever possible). For a desktop
user, unless you're doing real-time video recording at higher than HD
resolution with high-quality surround sound, this probably isn't going to
hit you (and even then you should be recording to a temporary location with
much faster write speeds, such as tmpfs or ext4 without a journal, because
you'll otherwise likely get hit with fragmentation).
This also has a pretty low overall impact compared to a number of other
things that BTRFS does: BTRFS on a single disk with the single profile for
everything, versus two of the same disks with the raid1 profile for
everything, shows less than a 20% performance difference in all the testing
I've done.
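For scale, here is a minimal sketch of the 128k cap mentioned above. The
splitting helper is a toy model of my own for illustration; the only btrfs
fact taken from the text is the 128 KiB compressed-extent cap.

```python
# Illustrative sketch: how a 128 KiB compressed-extent cap turns one large
# logical write into many small per-device I/Os (not btrfs allocator code).
COMPRESSED_EXTENT_CAP = 128 * 1024  # btrfs caps compressed extents at 128 KiB

def split_into_extents(write_size: int, cap: int = COMPRESSED_EXTENT_CAP):
    """Yield the sizes of the extents a single logical write breaks into."""
    while write_size > 0:
        chunk = min(write_size, cap)
        yield chunk
        write_size -= chunk

# A 10 MiB streaming write becomes 80 extents of 128 KiB each, so neither
# device in a raid1 pair ever sees one long monolithic write.
extents = list(split_into_extents(10 * 1024 * 1024))
print(len(extents), max(extents) // 1024, "KiB")  # -> 80 128 KiB
```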
> *) A degraded RAID1 won't mount by default.
> If this is the root filesystem, the machine won't boot.
> To mount it, you need to add the "degraded" mount option.
> However, you get exactly one chance at that: you MUST restore the RAID to a
> non-degraded state while it's mounted during that session, since it won't
> ever mount again in r/w+degraded mode, and in r/o mode you can't perform any
> operations on the filesystem, including adding or removing devices.
There is a fix pending for the single-chance-to-mount-degraded issue, and
even then, it only applies to a 2-device raid1 array (with more devices, new
chunks are still raid1 when you're missing one device, so the checks never
trigger and the mount isn't refused in the first place).

As far as not mounting degraded by default, that's a conscious design choice
that isn't going to change. There's a switch (adding 'degraded' to the mount
options) to enable this behavior per-mount, so we're still on par in that
respect with LVM and MD; we just picked a different default. In this case, I
actually feel it's the better default for most cases, because most regular
users aren't doing exhaustive monitoring, and thus are not likely to notice
the filesystem being mounted degraded until it's far too late. If the
filesystem is degraded, then _something_ has happened that the user needs to
know about, and until some sane monitoring solution is implemented, the
easiest way to ensure this is to refuse to mount. The sketch below models
why the 2-device case is special.
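A minimal sketch of that reasoning, using a toy model of chunk allocation
(this is not the kernel's allocator, and both helper functions are
hypothetical): with a 2-device raid1 array missing one device, a new chunk
can't get two copies and falls back to the single profile, which the next
degraded rw mount then trips over; with three or more devices, chunks stay
raid1 and the check never fires.

```python
# Toy model (not kernel code) of why degraded raid1 behaves differently for
# 2-device and 3+-device arrays. Profile names match btrfs; the allocation
# and mount-check logic are simplified assumptions for illustration.

def new_chunk_profile(total_devices: int, missing: int) -> str:
    """Profile a new chunk gets while the array is mounted degraded."""
    available = total_devices - missing
    # raid1 needs two devices present to place the two copies of a chunk.
    return "raid1" if available >= 2 else "single"

def mount_rw_degraded_allowed(chunk_profiles: set[str]) -> bool:
    """Simplified mount-time check: single-profile chunks left behind on a
    degraded raid1 filesystem block a later degraded rw mount."""
    return "single" not in chunk_profiles

# 2-device array, one missing: new chunks are 'single', so the *next*
# degraded mount attempt is refused -- the "one chance" problem.
profiles = {"raid1", new_chunk_profile(total_devices=2, missing=1)}
print(mount_rw_degraded_allowed(profiles))  # -> False

# 3-device array, one missing: chunks stay raid1, the check never fires.
profiles = {"raid1", new_chunk_profile(total_devices=3, missing=1)}
print(mount_rw_degraded_allowed(profiles))  # -> True
```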
> *) It does not properly handle a device disappearing during operation.
> (There is a patchset to add that.)
> *) It does not properly handle said device returning (under a different
> /dev/sdX name, for bonus points).
These are not easy problems to fix completely, especially considering that
the device is currently guaranteed to reappear under a different name,
because BTRFS will still hold an open reference on the original device node.

On top of that, if you've got hardware that's doing this without manual
intervention, you've got much bigger issues than how BTRFS reacts to it. No
correctly working hardware should be doing this.
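As a userspace analogy of that open-reference point (my own illustration,
not kernel device handling): holding a file descriptor keeps the old object
reachable even after its name has been reused, which mirrors why a returning
disk gets a new /dev/sdX node while btrfs still holds the old one open.

```python
# Userspace analogy (not kernel code): an open reference pins the old
# object even after its name is reused.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "disk")
with open(path, "w") as f:
    f.write("old device")

fd = os.open(path, os.O_RDONLY)   # btrfs-like open reference
os.unlink(path)                   # the "device" disappears
with open(path, "w") as f:        # it "returns" under the same name...
    f.write("new device")

# ...but the old reference still reads the old object, not the new one.
print(os.read(fd, 32).decode())   # -> old device
os.close(fd)
```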
> Most of these also apply to all other RAID levels.