On Wed, Apr 8, 2026 at 11:58 PM Aaron Tomlin <[email protected]> wrote:
>
> On Mon, Apr 06, 2026 at 11:29:38AM +0800, Ming Lei wrote:
> > I don't think there is such an isolation-breaking thing. For iopoll, if
> > applications won't submit polled IO on isolated CPUs, everything is just
> > fine. If they do, IO may be reaped from isolated CPUs; that is just
> > their choice, is anything wrong?
>
> Hi Ming,
>
> Thank you for your follow-up. You make a fair point regarding polling
> queues and application choice; if an application explicitly binds to an
> isolated CPU and submits polled operations, it is indeed actively
> electing to utilise that core and accept the resulting behaviour.
>
> However, the architectural challenge arises from how the kernel handles
> these queues structurally when the application does not explicitly make
> that choice. Because poll queues never utilise interrupts, they are
> completely invisible to the managed interrupt subsystem.
>
> If we were to rely exclusively on the managed_irq flag, the block layer
> would blindly map these non-interrupt-driven polling queues to isolated
> CPUs. If a general background storage operation were then routed to
> that queue, the isolated core would be forced to spin actively in a
> tight loop waiting for the hardware completion.

How can the isolated core be scheduled to run the polling task? Who
triggered it?

> This would completely monopolise the core and destroy any real-time
> isolation guarantees without the user-space application ever having
> requested it.

No. An IOPOLL queue doesn't have an interrupt, and ->poll() is only run
from the submission context. So if you don't submit polled IO on isolated
CPU cores, everything is just fine. This is simpler than irq IO actually.
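To make this concrete, below is a minimal user-space sketch (my own
illustration, not from the series): with IORING_SETUP_IOPOLL, the task
that submits the polled IO is exactly the task that busy-polls for its
completion. It assumes liburing is installed, /dev/nvme0n1 exists with
poll queues configured, and CPU 0 is a housekeeping CPU; the device path
and CPU number are placeholders.

/* Build: gcc -o iopoll-demo iopoll-demo.c -luring */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	cpu_set_t set;
	void *buf;
	int fd;

	/* Pin this task to CPU 0, a housekeeping CPU by assumption. */
	CPU_ZERO(&set);
	CPU_SET(0, &set);
	if (sched_setaffinity(0, sizeof(set), &set))
		return 1;

	/* IOPOLL ring: completions are reaped by polling, no irq at all. */
	if (io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL))
		return 1;

	fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
	if (fd < 0)
		return 1;
	/* O_DIRECT needs an aligned buffer. */
	if (posix_memalign(&buf, 4096, 4096))
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, fd, buf, 4096, 0);
	io_uring_submit(&ring);

	/*
	 * This wait busy-polls the hardware queue from *this* context;
	 * no other CPU is drafted into polling on our behalf.
	 */
	if (io_uring_wait_cqe(&ring, &cqe) == 0)
		io_uring_cqe_seen(&ring, cqe);

	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return 0;
}

Run it pinned as above and the isolated CPUs never see the poll loop;
nothing in the kernel redirects ->poll() to another core.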
> This illustrates precisely why the io_queue flag is a mechanical
> necessity. Its primary objective is to act as a comprehensive block
> layer isolation boundary. It structurally restricts both hardware queue
> placement and managed interrupt affinity strictly to housekeeping CPUs,
> ensuring that no storage queue operations of any kind are mapped to an
> isolated CPU.
>
> To achieve this reliably, this series expands struct irq_affinity to
> incorporate a new CPU mask [1]. This mask is explicitly set to the
> result of blk_mq_online_queue_affinity(). By passing this housekeeping
> mask directly through the interrupt affinity parameters, we ensure that
> the native affinity calculation is strictly bounded to non-isolated
> CPUs from the moment the device probes.
>
> This structural enhancement allows device drivers to seamlessly inherit
> the isolation constraints without requiring bespoke, driver-specific
> logic. A clear example of this application can be seen in the
> modifications to the Broadcom MPI3 Storage Controller [2]. By leveraging
> the expanded struct irq_affinity, the driver guarantees that its queues
> and corresponding managed interrupts are perfectly aligned with the
> system housekeeping configuration, completely avoiding the isolated
> CPUs during allocation.
>
> [1]: https://lore.kernel.org/lkml/[email protected]/
> [2]: https://lore.kernel.org/lkml/[email protected]/
>
> I hope this better illustrates the mechanical necessity of the io_queue
> flag and the corresponding changes to the interrupt affinity structures.
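Just so we are talking about the same thing, my reading of [1] is that
the change has roughly the following shape (a sketch only; the .mask
field name and the blk_mq_online_queue_affinity() helper are assumed
from your description above, not copied from the patch):

#include <linux/interrupt.h>
#include <linux/pci.h>

static int foo_setup_irqs(struct pci_dev *pdev, unsigned int nr_io_queues)
{
	struct irq_affinity affd = {
		.pre_vectors	= 1,	/* e.g. one non-spread admin vector */
		/*
		 * Assumed new field: affinity spreading for all managed
		 * vectors is bounded to the housekeeping mask from the
		 * moment the device probes.
		 */
		.mask		= blk_mq_online_queue_affinity(),
	};

	/* Vectors (and their managed irqs) never land on isolated CPUs. */
	return pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
					      PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
					      &affd);
}

Assuming that is right: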
Can you share one example that managed_irq can't address?

> > > Every logical CPU, including the isolated ones, must logically map
> > > to a hardware context in order to submit input and output requests,
> > > so saying they are completely restricted is indeed stale and
> > > technically inaccurate. The isolation mechanism actually ensures
> > > that the hardware contexts themselves are serviced by the
> > > housekeeping CPUs, while the isolated CPUs are simply mapped onto
> > > these housekeeping queues for submission purposes. I will rewrite
> > > this paragraph to accurately reflect this topology, ensuring it
> > > aligns perfectly with the behaviour introduced in patch 10.
> >
> > I am not sure if the above words are helpful from the administrator's
> > viewpoint about the two kernel parameters.
> >
> > IMO, there are only two differences from this viewpoint:
> >
> > 1) `io_queue` may reduce nr_hw_queues
> >
> > 2) when application submits IO from isolated CPUs, `io_queue` can
> > complete IO from housekeeping CPUs.
>
> Acknowledged.

Are there other major differences besides the two mentioned above?

Thanks,
Ming
