On 2017-01-10 16:49, Chris Murphy wrote:
On Tue, Jan 10, 2017 at 2:07 PM, Vinko Magecic
<vinko.mage...@construction.com> wrote:
Hello,

I set up a raid 1 with two btrfs devices and came across some situations in my 
testing that I can't get a straight answer on.

1) When replacing a volume, do I still need to `umount /path` and then `mount 
-o degraded ...` the good volume before doing the `btrfs replace start ...` ?

No. If the device being replaced is unreliable, use -r to limit the
reads from the device being replaced.
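As a sketch (device names and mount point here are hypothetical), a replace of a failing but still-present member of a mounted raid1 might look like:

```shell
# Replace /dev/sdb1 with /dev/sdc1 while /mnt stays mounted.
# -r limits reads from the device being replaced, so data is
# pulled from the healthy mirror wherever possible.
btrfs replace start -r /dev/sdb1 /dev/sdc1 /mnt

# The operation runs in the background; check on it with:
btrfs replace status /mnt
```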



I didn't see anything that said I had to, and when I tested it without mounting 
the volume it was able to replace the device without any issue. Is that 
considered bad practice that could risk damage, or has `replace` made it 
possible to replace devices without unmounting the filesystem?

It's always been possible even before 'replace'.
btrfs dev add <dev3>
btrfs dev rem <dev1>

But there are some bugs in dev replace that Qu is working on; I think
they mainly negatively impact raid56 though.

The one limitation of 'replace' is that the new block device must be
equal to or larger than the block device being replaced; dev add
followed by dev rem doesn't have this requirement.
The other thing to remember is that you can resize the FS on the device being replaced so that it will fit on the new device. I regularly do this when re-partitioning or moving filesystems between devices, as a safety precaution to be sure the data will fit in the new location. I would only suggest doing this if the old device is still reliable, since the resize may move data around on it, and it obviously doesn't work if the device being replaced is missing.
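A sketch of that shrink-then-migrate approach, with hypothetical devids, sizes, and paths:

```shell
# Shrink the filesystem's footprint on devid 1 to 100GiB (must be at
# least as large as the space actually in use) so the data will fit
# on a smaller replacement device.
btrfs filesystem resize 1:100g /mnt

# Then migrate off the old device with add + remove.
btrfs device add /dev/sdc1 /mnt
btrfs device remove /dev/sdb1 /mnt
```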


2) Everything I see about replacing a drive says to use `/old/device 
/new/device` but what if the old device can't be read or no longer exists?

The command works whether the device is present or not; if it's
present and working, then any errors on one device can be corrected
from the other, whereas if the device is missing, errors on the
remaining device can't be corrected. Offhand I'm not sure whether the
replace continues with the error just being logged... I think that's
what should happen.
IIRC, that's what happens up to some (arbitrary) threshold, at which point the replace fails.
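When the old device is gone entirely, you can pass its devid instead of a device path. A sketch, with hypothetical devices and devid:

```shell
# A missing raid1 member requires a degraded mount.
mount -o degraded /dev/sda1 /mnt

# Find the devid of the device reported as "missing".
btrfs filesystem show /mnt

# Start the replace using that devid (2 here is hypothetical)
# in place of the unreadable source device path.
btrfs replace start 2 /dev/sdc1 /mnt
```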


Would that be a `btrfs device add /new/device; btrfs balance start /new/device` 
?

dev add then dev rem; a separate balance isn't necessary.
A better way to put it is that the balance is implicit in the device removal: the data that was on the removed device has to go somewhere, and the easiest way to move it is to run a balance that isn't allowed to allocate anything on the device being removed.


3) When I have the RAID1 with two devices and I want to grow it out, which is 
the better practice? Create a larger volume, replace the old device with the 
new device and then do it a second time for the other device? Or attach the 
new volumes to the label/uuid one at a time, and with each one use `btrfs 
filesystem resize devid:max /mountpoint`?

If you're replacing a 2x raid1 with two bigger replacements, you'd use
'btrfs replace' twice. Maybe it'd work concurrently; I've never tried
it, but it would be useful for someone to test, because if it's
allowed, it should either work or fail gracefully.
In theory, it _might_ be possible to get dev replace to work concurrently. As of right now, I know that the current implementation does not work with more than one instance running per FS (because it uses devid 0 for the new device during the replace, and devids have to be unique), but I don't know for certain what happens if you try to start a second one. It _should_ refuse to start, but I'm not certain that's what it actually does, and I don't have the time to check right now.

That said, there are good reasons to serialize replaces most of the time. The most notable are that replace does not read only from the device being replaced (although most of the reads do go to that device), and that serializing the operations has less impact on the rest of the system (replace is designed to be used on live systems).
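A sketch of the serialized version, with hypothetical device names: each member of the raid1 is replaced in turn with a larger disk, and `-B` keeps the command in the foreground so the second replace only starts once the first has finished.

```shell
# Replace the first raid1 member and wait for completion.
btrfs replace start -B /dev/sda1 /dev/sdc1 /mnt

# Only then replace the second member.
btrfs replace start -B /dev/sdb1 /dev/sdd1 /mnt
```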

There's no need to do filesystem resizes when doing 'dev add' followed
by 'dev rem', because the resize is implied: the FS is grown by the
add and shrunk by the remove. Replace consolidates these steps; it's
been a while since I've looked at the code, so I can't tell you
exactly which steps it skips, what state the devices are in during the
replace, or which one active writes go to.
Last time I checked, this was not the case for replace, and a resize to the maximum size was still necessary afterwards. That was almost 3 months ago though (I've been lucky and not needed to replace anything since), so I may be wrong about the current state of things.
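If a manual grow is still needed after the replaces, it would look something like this (devids here are hypothetical; check the real ones with 'btrfs filesystem show'):

```shell
# Grow the filesystem to use the full capacity of each new device.
btrfs filesystem resize 1:max /mnt
btrfs filesystem resize 2:max /mnt
```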
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html