On Wed, Mar 18, 2026 at 09:21:23AM -0400, Aaron Tomlin wrote:
> On Wed, Mar 18, 2026 at 08:38:20AM +0900, Damien Le Moal wrote:
> > Looks OK to me, but I have some suggestions below.
Hi Damien, Laurence,
Upon reviewing the source code once more, it is apparent that tracking
"active requests" in this particular trace point is redundant: if a
thread is forced to call io_schedule(), the number of in-flight requests
necessarily equals the total number of tags, so printing the count adds
no information. Worse, as implemented, the counter would almost always
print active=0 in the following scenarios:
1. "mq-deadline" Scheduler Starvation: The thread sleeps waiting for a
scheduler tag. Because the request has not been dispatched to
hardware yet, blk_mq_inc_active_requests() was never called.
hctx->nr_active is 0.
2. NVMe hardware starvation, "none" scheduler: The thread sleeps
   waiting for a hardware tag. Because NVMe drives do not share tags,
   blk_mq_inc_active_requests() returns early without doing any
   accounting, so hctx->nr_active remains 0.
3. RAID hardware starvation, "none" scheduler: The thread sleeps
   waiting for a shared hardware tag. Because the hctx is marked
   HCTX_SHARED, the kernel tracks active requests in
   hctx->queue->nr_active_requests_shared_tags; the local
   hctx->nr_active counter is bypassed entirely and remains 0.
Rather than attempting to print the active count, the trace point should
be modified to indicate which pool experienced starvation: the hardware
tag pool or the software scheduler tag pool.
I will submit a follow-up patch.
Kind regards,
--
Aaron Tomlin
