The read-only scrub finished without errors or hangs (with kernel
4.12.3). So I guess the earlier hangs were caused by one of:
1: other bug in 4.13-RC1
2: crazy-random SATA/disk-controller issue
3: interference between various btrfs tools [*]
4: something in the background did DIO writes under 4.13-rc1 (but all
the affected content happened to be overwritten/deleted between the
scrub attempts)

[*] I expected scrub to finish in ~5 rather than ~40 hours (and didn't
expect interference issues), so I didn't disable the scheduled
maintenance script which deletes old files, recursively defrags the
whole fs and runs a balance with usage=33 filters. I guess any of
those (especially the balance) could potentially cause scrub to hang.
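For reference, the maintenance job was roughly the following (the mount
point, retention window, and the `run` wrapper are placeholders here;
`run` only echoes the commands so this sketch is a safe dry run, the
real script executes them):

```shell
#!/bin/sh
# Sketch of the scheduled maintenance job described above.
# MNT and the 30-day retention window are made-up placeholders.
MNT=/mnt/data
run() { echo "would run: $*"; }   # dry-run wrapper; real script runs "$@"

# 1. delete old files
run find "$MNT/old" -type f -mtime +30 -delete

# 2. recursively defragment the whole filesystem
run btrfs filesystem defragment -r "$MNT"

# 3. rebalance data/metadata chunks that are at most 33% full
run btrfs balance start -dusage=33 -musage=33 "$MNT"
```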

On Thu, Jul 27, 2017 at 10:44 PM, Duncan <[email protected]> wrote:
> Janos Toth F. posted on Thu, 27 Jul 2017 16:14:47 +0200 as excerpted:
>
>> * This is off-topic but raid5 scrub is painful. The disks run at
>> constant ~100% utilization while performing at ~1/5 of their sequential
>> read speeds. And despite explicitly asking idle IO priority when
>> launching scrub, the filesystem becomes unbearably slow (while scrub
>> takes a day or so to finish ... or get to the point where it hung the
>> last time around, close to the end).
>
> That's because basically all the userspace scrub command does is make
> the appropriate kernel calls to have the kernel do the real scrub.  So
> priority-idling the userspace scrub process doesn't do what it does on
> normal userspace jobs that do most of the work themselves.
>
> The problem is that idle-prioritizing the kernel threads actually doing
> the work could risk a deadlock due to lock inversion, since they're
> kernel threads and aren't designed with the idea of people messing with
> their priority in mind.
>
> Meanwhile, that's yet another reason btrfs raid56 mode isn't recommended
> at this time.  Try btrfs raid1 or raid10 mode instead, or possibly btrfs
> raid1, single or raid0 mode on top of a pair of mdraid5s or similar.  Tho
> parity-raid mode in general (that is, not btrfs-specific) is known for
> being slow in various cases, with raid10 normally being the best
> performing closest alternative.  (Tho in the btrfs-specific case, btrfs
> raid1 on top of a pair of mdraid/dmraid/whatever raid0s, is the normally
> recommended higher performance reasonably low danger alternative.)
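
For completeness, the layered setup described above would look roughly
like this (device names and the mount point are made up, and the `run`
wrapper only echoes the commands, so pasting this destroys nothing):

```shell
#!/bin/sh
# Sketch of btrfs raid1 on top of a pair of mdraid raid0s, per the
# suggestion above. /dev/sd[b-e], /dev/md[01] and /mnt/data are
# placeholders for whatever the real devices would be.
run() { echo "would run: $*"; }   # dry-run wrapper; drop it to execute

# two 2-disk raid0 arrays...
run mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
run mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdd /dev/sde

# ...with btrfs raid1 (data and metadata) mirrored across them
run mkfs.btrfs -d raid1 -m raid1 /dev/md0 /dev/md1
run mount /dev/md0 /mnt/data
```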

If this applies to all RAID flavors, then I consider the built-in help
and the manual pages of scrub misleading (and if it's RAID56-only, the
manual should still mention that RAID56 is an exception).

Also, a resumed scrub seems to skip a lot of data. It picks up where
it left off but then prematurely reports a job well done. I remember
noticing a similar behavior with balance cancel/resume on RAID5 a few
years ago (it went on for a few more chunks but left the rest alone
and reported completion; I am not sure if that's fixed now or whether
these have a common root cause).
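
One crude way to catch that is to compare the bytes scrubbed (as
reported by `btrfs scrub status`) against the allocated bytes (from
`btrfs filesystem usage -b`). The helper below is hypothetical and
takes the two raw byte counts as arguments so it can be exercised
without a btrfs mount; the 95% threshold is an arbitrary guess:

```shell
#!/bin/sh
# check_scrub SCRUBBED_BYTES ALLOCATED_BYTES
# Prints "suspicious" when a "finished" scrub covered well under the
# allocated size, which would suggest the resume skipped data.
check_scrub() {
    scrubbed=$1 allocated=$2
    # flag anything under ~95% coverage as suspicious
    if [ $((scrubbed * 100)) -lt $((allocated * 95)) ]; then
        echo suspicious
    else
        echo ok
    fi
}

check_scrub 2000000000 10000000000   # prints "suspicious"
```

Feeding it live numbers parsed out of the scrub status and filesystem
usage output is left out here, since the exact text of those reports
varies between btrfs-progs versions.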