ping..
On 2026/5/13 17:13, Fengnan Chang wrote:
> dm_poll_bio() is the ->poll_bio() callback for a stacked dm device.
> The caller only knows about the dm queue, so it may decide to do a
> spinning poll if it thinks a single queue is being polled. Passing the
> caller's poll flags unchanged to the mapped clone lets blk_mq_poll()
> spin on a target queue from inside dm_poll_bio().
>
> With io_uring IOPOLL on a dm-stripe target this can keep a task in
>
> dm_poll_bio() -> bio_poll() -> blk_mq_poll()
>
> long enough to trigger an RCU CPU stall, before io_uring gets back to
> io_iopoll_check() and its need_resched() check.
>
> Keep dm's ->poll_bio() bounded by forcing one-shot polling for target
> bios. The caller can invoke dm_poll_bio() again if it wants to keep
> polling, and it also gets a chance to reap completions or reschedule
> between passes.
>
> Fixes: f22ecf9c14c1 ("blk-mq: delete task running check in blk_hctx_poll()")
> Signed-off-by: Fengnan Chang <[email protected]>
> ---
> drivers/md/dm.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index e178fe19973ea..8f44fbbcf3da2 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2098,8 +2098,17 @@ static bool dm_poll_dm_io(struct dm_io *io, struct io_comp_batch *iob,
>  	WARN_ON_ONCE(!dm_tio_is_normal(&io->tio));
>
>  	/* don't poll if the mapped io is done */
> -	if (atomic_read(&io->io_count) > 1)
> -		bio_poll(&io->tio.clone, iob, flags);
> +	if (atomic_read(&io->io_count) > 1) {
> +		/*
> +		 * DM hides the target queues from the upper poller, which may
> +		 * decide it is safe to spin on a single stacked queue. Do not
> +		 * pass that spinning policy down to a target queue: one slow
> +		 * clone could keep the task inside dm_poll_bio() for a long
> +		 * time. Poll target bios once and let the caller decide
> +		 * whether to keep polling, reap completions or reschedule.
> +		 */
> +		bio_poll(&io->tio.clone, iob, flags | BLK_POLL_ONESHOT);
> +	}
>
>  	/* bio_poll holds the last reference */
>  	return atomic_read(&io->io_count) == 1;
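
For anyone skimming the thread, here is a rough user-space model of the
behaviour change. It is only a sketch and not kernel code: every name in it
(MODEL_POLL_ONESHOT, struct model_queue, the model_* functions) is
hypothetical, and the "queue" is just a counter that completes after a fixed
number of poll passes. It assumes the lower-level poll routine spins until
completion unless a one-shot flag is set, which is the property the patch
relies on.

```c
#include <stdbool.h>

/* Hypothetical stand-in for BLK_POLL_ONESHOT; not the kernel's value. */
#define MODEL_POLL_ONESHOT (1u << 0)

/* Toy target queue: the request completes after `remaining` poll passes. */
struct model_queue {
	unsigned int remaining;
};

/* One pass over the queue; returns true once the request has completed. */
static bool model_poll_once(struct model_queue *q)
{
	if (q->remaining > 0)
		q->remaining--;
	return q->remaining == 0;
}

/*
 * Toy lower-level poll: spins until completion unless MODEL_POLL_ONESHOT
 * is set, in which case it makes a single pass. Returns the number of
 * passes made during this call.
 */
static unsigned int model_queue_poll(struct model_queue *q, unsigned int flags)
{
	unsigned int passes = 0;

	do {
		passes++;
		if (model_poll_once(q))
			break;
	} while (!(flags & MODEL_POLL_ONESHOT));

	return passes;
}

/* Toy stacked poll: force one-shot on the lower queue, as the patch does. */
static unsigned int model_stacked_poll(struct model_queue *q, unsigned int flags)
{
	return model_queue_poll(q, flags | MODEL_POLL_ONESHOT);
}

/* Toy caller: re-invokes the stacked poll until the request completes. */
static unsigned int model_calls_until_done(struct model_queue *q)
{
	unsigned int calls = 0;

	while (q->remaining > 0) {
		model_stacked_poll(q, 0);
		calls++;	/* a real caller could reap or reschedule here */
	}
	return calls;
}
```

With remaining = 1000, a single model_queue_poll(&q, 0) call makes all 1000
passes before returning, while model_stacked_poll() returns after one pass
and the caller regains control between passes.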