On Fri, 3 Mar 2017 07:19:06 -0500, "Austin S. Hemmelgarn" <ahferro...@gmail.com> wrote:
> On 2017-03-03 00:56, Kai Krakow wrote:
> > On Thu, 2 Mar 2017 11:37:53 +0100, Adam Borowski
> > <kilob...@angband.pl> wrote:
> >
> >> On Wed, Mar 01, 2017 at 05:30:37PM -0700, Chris Murphy wrote:
> [...]
> >>
> >> Well, there's Qu's patch at:
> >> https://www.spinics.net/lists/linux-btrfs/msg47283.html
> >> but it doesn't apply cleanly, nor is it easy to rebase to current
> >> kernels.
> [...]
> >>
> >> Well, yeah. The current check is naive and wrong. It does have a
> >> purpose, it just fails in this very common case.
> >
> > I guess the reasoning behind this is: creating any more chunks on
> > this drive will make raid1 chunks with only one copy, and adding
> > another drive later will not replay the copies without user
> > interaction. Is that true?
> >
> > If yes, this may leave you with a mixed case: a raid1 drive with
> > some chunks mirrored and some not. When the other drive goes
> > missing later, you lose data or even the whole filesystem,
> > although you were left with the (wrong) impression of having a
> > mirrored drive setup...
> >
> > Is this how it works?
> >
> > If yes, a real patch would also need to replay the missing copies
> > after adding a new drive.
>
> The problem is that that would use some serious disk bandwidth
> without user intervention. The way to fix this from userspace is to
> scrub the FS. It would essentially be the same from kernel space,
> which means that if you had a multi-TB FS and this happened, you'd
> be running below capacity in terms of bandwidth for quite some time.
> If this were to be implemented, it would have to be keyed off the
> per-chunk degraded check (so that _only_ the chunks that need it get
> touched), and there would need to be a switch to disable it.

Well, I'd expect a replaced drive to involve reduced bandwidth for a
while. Every traditional RAID does this. The key there is that you
can limit bandwidth and/or define priorities (background rebuild
rate). Btrfs, OTOH, could be a lot smarter and rebuild only the
chunks that are affected.

The kernel can already do IO priorities, and some sort of bandwidth
limiting should also be possible. I think IO throttling is already
implemented somewhere in the kernel (at least as of 4.10) and also in
btrfs. So the basics are there.

In a RAID setup, performance should never take priority over
redundancy by default. If performance is an important factor, I
suggest working with SSD writeback caches. This is already possible
with kernel techniques like dm-cache or bcache, and proper hardware
controllers also support it in hardware. A mirrored SSD writeback
cache of 1 TB or so is cheap if your setup already contains an array
of multiple terabytes. Such a setup has huge performance benefits in
the setups we deploy (though not btrfs-related).

Also, adding/replacing a drive is usually not a totally unplanned
event. Except for hot spares, a missing drive will be replaced when
you arrive on-site. If performance is a concern, the rebuild can be
started manually at that same time. So why should it not be done
automatically?

-- 
Regards,
Kai

Replies to list-only preferred.
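
P.S.: To make the manual recovery concrete, here is a rough sketch of
what it looks like from userspace today. The mount point and device
names are made up for illustration; the "soft" modifier only touches
chunks that are not already raid1:

  # add a replacement disk and drop the missing one
  btrfs device add /dev/sdc /mnt
  btrfs device delete missing /mnt
  # rewrite degraded copies on existing raid1 chunks
  # (this is the bandwidth-heavy part discussed above)
  btrfs scrub start /mnt
  # convert chunks that were created as single while degraded
  btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt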
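
The priority knobs exist in userspace today, too. As a sketch (same
hypothetical mount point): scrub accepts an IO priority class, and
ionice can demote an already-running rebuild process:

  # run the scrub in the idle IO class so foreground IO wins
  btrfs scrub start -c 3 /mnt
  # or demote a running process by PID (12345 is a placeholder)
  ionice -c 3 -p 12345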
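
And for the writeback-cache idea, a minimal bcache sketch, assuming
/dev/sdb as the backing disk and /dev/nvme0n1p1 as the SSD (both
placeholders; this destroys any existing data on them):

  # format the backing device and the caching SSD
  make-bcache -B /dev/sdb
  make-bcache -C /dev/nvme0n1p1
  # attach the cache set (UUID from bcache-super-show) and switch
  # the cache from the default writethrough to writeback mode
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach
  echo writeback > /sys/block/bcache0/bcache/cache_mode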