On Wed, Oct 12, 2016 at 12:25:51PM +0500, Roman Mamedov wrote:
> Zygo Blaxell <ce3g8...@umail.furryterror.org> wrote:
>
> > A btrfs -dsingle -mdup array on a mdadm raid device might have a
> > snowball's chance in hell of surviving a disk failure on a live array
> > with only data losses. This would work if mdadm and btrfs successfully
> > arrange to have each dup copy of metadata updated separately, and one
> > of the copies survives the raid5 write hole. I've never tested this
> > configuration, and I'd test the heck out of it before considering
> > using it.
>
> Not sure what you mean here, a non-fatal disk failure (i.e. within being
> compensated by redundancy) is invisible to the upper layers on mdadm arrays.
> They do not need to "arrange" anything, on such failure from the point of view
> of Btrfs nothing whatsoever has happened to the /dev/mdX block device, it's
> still perfectly and correctly readable and writable.

btrfs hurls a bunch of writes for one metadata copy to mdadm, and mdadm
forwards those writes to the disks. btrfs then sends a barrier to mdadm;
mdadm must properly forward that barrier to all the disks and wait until
they're all done. Repeat the above for the other metadata copy.

If that's all implemented correctly in mdadm, all is well; otherwise,
mdadm and btrfs fail to arrange to have each dup copy of metadata
updated separately.

The present state of the disks is irrelevant. The array could go
degraded due to a disk failure at any time, so for practical failure
analysis purposes, only the behavior in degraded mode is relevant.

>
> --
> With respect,
> Roman
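
To make the ordering argument above concrete, here is a toy Python model. It is only a sketch under assumptions of mine: the names (BLOCKS_PER_COPY, count_failures, and so on) are made up for illustration, a "copy" is just a handful of blocks tagged with a generation number, and a crash simply stops the write stream at a random point. It models write ordering only. It does not model the raid5 write hole damaging other blocks in the same stripe, and it says nothing about what mdadm really does.

```python
import random

BLOCKS_PER_COPY = 4  # hypothetical size of one metadata copy, in blocks


def make_writes(generation):
    """Build the writes for both dup copies as (copy, block, generation)."""
    copy_a = [("A", i, generation) for i in range(BLOCKS_PER_COPY)]
    copy_b = [("B", i, generation) for i in range(BLOCKS_PER_COPY)]
    return copy_a, copy_b


def issue_order(copy_a, copy_b, honor_barrier, rng):
    """Return the order in which writes reach stable storage.

    Writes within one batch may be reordered freely. If the barrier is
    honored, every write of copy A lands before any write of copy B.
    """
    if honor_barrier:
        a, b = copy_a[:], copy_b[:]
        rng.shuffle(a)
        rng.shuffle(b)
        return a + b
    mixed = copy_a + copy_b
    rng.shuffle(mixed)
    return mixed


def crash_after(order, n_done):
    """Apply only the first n_done writes to a disk holding generation 0."""
    disk = {(c, i): 0 for c in "AB" for i in range(BLOCKS_PER_COPY)}
    for copy, idx, gen in order[:n_done]:
        disk[(copy, idx)] = gen
    return disk


def copy_is_consistent(disk, copy):
    """A copy is usable if all of its blocks come from one generation."""
    return len({disk[(copy, i)] for i in range(BLOCKS_PER_COPY)}) == 1


def count_failures(honor_barrier, trials=2000, seed=0):
    """Count simulated crashes that leave both copies torn."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        a, b = make_writes(generation=1)
        order = issue_order(a, b, honor_barrier, rng)
        disk = crash_after(order, rng.randint(0, len(order)))
        if not (copy_is_consistent(disk, "A") or copy_is_consistent(disk, "B")):
            failures += 1
    return failures


if __name__ == "__main__":
    print("torn with barrier honored:", count_failures(True))
    print("torn with barrier ignored:", count_failures(False))
```

In this model, with the barrier honored a crash can tear at most one copy, so one consistent copy always remains. With the barrier ignored, writes for the two copies interleave and a crash can tear both.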