Re: Repeatable, raid1+O_DIRECT, hang/warn

Vjaceslavs Klimovs Mon, 15 Jun 2026 18:26:45 -0700

Hi Keith,

Thanks. I tested both patches on current mainline
(v7.1-rc7-271-g424280953322) with my QEMU + LVM "--type mirror"
reproducer (virtio-blk, cache=none, aio=native).


With only the "block: check bio split for unaligned bvec" patch, the
hang still reproduces. The WARN fires from a kmirrord worker:

  WARNING: block/bio.c:1044 at bio_add_page+0x108/0x200
  Workqueue: kmirrord do_mirror
  Call Trace:
   bio_add_page+0x108/0x200
   do_region+0x21d/0x270
   dispatch_io+0xf1/0x150
   dm_io+0x136/0x240
   do_reads+0x13e/0x210
   do_mirror+0x117/0x2b0

and the VM then wedges.

With the dm-io.c clone patch applied on top, the WARN and the hang are
both gone. dm-mirror just fails the read instead:

  device-mapper: raid1: Mirror read failed from 252:0. Trying alternative device.
  device-mapper: raid1: All sides of mirror have failed.
  device-mapper: raid1: Read failure on mirror device 252:1.  Failing I/O.

The guest still gets an I/O error, as you expected, but the host stays
up: no splat, no stuck task. For comparison, on the same kernel the
"--type raid1" case boots the guest and reads fine, and the 128 MB
mirror seed write goes through the clone path without trouble, so
normal I/O looks unaffected.

Thanks,
Vjaceslavs

On Mon, Jun 15, 2026 at 5:06 PM Keith Busch <[email protected]> wrote:
>
> On Mon, Jun 15, 2026 at 04:16:12PM -0700, Vjaceslavs Klimovs wrote:
> > Your trace looks like what the two earlier reports hit: a read reaching
> > a leaf device with sectors > 0 but phys_seg 0 (an empty bio). One aside
> > that may help read the trace: blk_io_trace.error is a __u16, so the
> > bracketed values on your C lines are errnos as u16 (65514 = -EINVAL,
> > 65531 = -EIO).
> >
> > The WARN itself is new, the bad bio isn't. bio_add_page() only started
> > rejecting len == 0 in 643893647cac ("block: reject zero length in
> > bio_add_page()", v7.1-rc1); on 7.0.8 the same empty bio tripped
> > scsi_alloc_sgtables()'s !nr_segs instead, which matches what you saw.
> > That fits your "not a recent regression": the condition is older, v7.1
> > just made it loud.
> >
> > For Tomas's and my reports (QEMU O_DIRECT to the LV block device) the
> > origin looks like 5ff3f74e145a ("block: simplify direct io validity
> > check", v6.18): blkdev_dio_invalid() now checks only aggregate
> > ki_pos | count alignment and dropped the per-segment
> > bdev_iter_is_aligned() walk, so a degenerate or misaligned O_DIRECT no
> > longer gets -EINVAL at the fops boundary. But your reproducer reads a
> > file, which goes through the filesystem O_DIRECT path and never calls
> > blkdev_dio_invalid(), and still makes the empty bio. So it isn't only
> > that one entry point.
> >
> > dm-mirror then hangs because Keith's f7b24c7b41f2 only covers md
> > raid1/raid10; legacy dm-mirror (dm-raid1.c) has no equivalent and
> > rebuilds the empty read onto the other leg. Note the leg's status isn't
> > even consistent (your SATA path returns BLK_STS_IOERR, not
> > BLK_STS_INVAL), so copying that status check into dm-mirror probably
> > wouldn't catch every case.
> >
> > For what it's worth, that points me toward rejecting the empty or
> > misaligned bio once, at submission, with -EINVAL, rather than teaching
> > each consumer to tolerate it. But you'll know the tradeoffs far better
> > than I do.
> >
> > I have a small QEMU + LVM raid1/mirror setup that reproduces the
> > block-device variant and bisects to 5ff3f74e. Happy to run your file
> > reproducer with some instrumentation at the dm-mirror read entry
> > (bi_size vs bio_sectors vs bvec lengths) to see whether the bio is
> > already empty on arrival or built that way on the retry, and to test
> > any patch.
>
> Thanks for following up here. I didn't initially see your follow-up
> until Thorsten linked it. I apologize for missing that, this feature is
> important so I don't want to see anything regress for it.
>
> There is a known bug fix I think future tests should include:
>
>   https://lore.kernel.org/linux-block/[email protected]/
>
> This likely isn't the fix you're looking for, but including it rules out
> conditions that are not important here.
>
> After that, can we try this suggestion and see if the hang goes away?
>
>   https://lore.kernel.org/linux-block/ajBb8tK-0aJBpIgF@kbusch-mbp/
>
> I expect the original test case to still return an error (and I think it
> was designed to), but it shouldn't produce the warn or bug splats with a
> stuck uninterruptible task.
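
Side note on the u16 errno values mentioned in the quoted text above
(65514 = -EINVAL, 65531 = -EIO): the arithmetic is just a 16-bit
two's-complement reinterpretation. A minimal sketch for checking it, assuming
Linux errno numbering; the helper names decode_u16_error() and describe() are
made up for illustration and are not part of blktrace or any kernel tool:

```python
import errno


def decode_u16_error(value: int) -> int:
    """Reinterpret an unsigned 16-bit value as a signed 16-bit integer.

    Hypothetical helper: models how a negative errno looks after being
    stored in a __u16 field such as blk_io_trace.error.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    # Values with the top bit set are negative in two's complement.
    return value - 0x10000 if value & 0x8000 else value


def describe(value: int) -> str:
    """Return a readable form of the raw u16 value, e.g. with the errno name."""
    signed = decode_u16_error(value)
    if signed >= 0:
        return f"{value} -> {signed}"
    # errno.errorcode maps positive errno numbers to their symbolic names.
    name = errno.errorcode.get(-signed, "unknown")
    return f"{value} -> {signed} (-{name})"


if __name__ == "__main__":
    for raw in (65514, 65531):
        print(describe(raw))
```

Run against the two bracketed values from the trace, this should map 65514 to
-22 and 65531 to -5, which are EINVAL and EIO on Linux.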
