On Mon, Mar 30, 2026 at 06:10:44PM -0400, Aaron Tomlin wrote:
> From: Daniel Wagner <[email protected]>
>
> Extend the capabilities of the generic CPU to hardware queue (hctx)
> mapping code, so it maps houskeeping CPUs and isolated CPUs to the
> hardware queues evenly.
>
> A hctx is only operational when there is at least one online
> housekeeping CPU assigned (aka active_hctx). Thus, check the final
> mapping that there is no hctx which has only offline housekeeing CPU and
> online isolated CPUs.
>
> Example mapping result:
>
>   16 online CPUs
>
>   isolcpus=io_queue,2-3,6-7,12-13
>
> Queue mapping:
>         hctx0: default 0 2
>         hctx1: default 1 3
>         hctx2: default 4 6
>         hctx3: default 5 7
>         hctx4: default 8 12
>         hctx5: default 9 13
>         hctx6: default 10
>         hctx7: default 11
>         hctx8: default 14
>         hctx9: default 15
>
> IRQ mapping:
>         irq 42 affinity 0 effective 0 nvme0q0
>         irq 43 affinity 0 effective 0 nvme0q1
>         irq 44 affinity 1 effective 1 nvme0q2
>         irq 45 affinity 4 effective 4 nvme0q3
>         irq 46 affinity 5 effective 5 nvme0q4
>         irq 47 affinity 8 effective 8 nvme0q5
>         irq 48 affinity 9 effective 9 nvme0q6
>         irq 49 affinity 10 effective 10 nvme0q7
>         irq 50 affinity 11 effective 11 nvme0q8
>         irq 51 affinity 14 effective 14 nvme0q9
>         irq 52 affinity 15 effective 15 nvme0q10
>
> A corner case is when the number of online CPUs and present CPUs
> differ and the driver asks for less queues than online CPUs, e.g.
>
>   8 online CPUs, 16 possible CPUs
>
>   isolcpus=io_queue,2-3,6-7,12-13
>   virtio_blk.num_request_queues=2
>
> Queue mapping:
>         hctx0: default 0 1 2 3 4 5 6 7 8 12 13
>         hctx1: default 9 10 11 14 15
>
> IRQ mapping
>         irq 27 affinity 0 effective 0 virtio0-config
>         irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0
>         irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1
>
> Noteworthy is that for the normal/default configuration (!isoclpus) the
> mapping will change for systems which have non hyperthreading CPUs. The
> main assignment loop will completely rely that group_mask_cpus_evenly to
> do the right thing. The old code would distribute the CPUs linearly over
> the hardware context:
>
> queue mapping for /dev/nvme0n1
>         hctx0: default 0 8
>         hctx1: default 1 9
>         hctx2: default 2 10
>         hctx3: default 3 11
>         hctx4: default 4 12
>         hctx5: default 5 13
>         hctx6: default 6 14
>         hctx7: default 7 15
>
> The assign each hardware context the map generated by the
> group_mask_cpus_evenly function:
>
> queue mapping for /dev/nvme0n1
>         hctx0: default 0 1
>         hctx1: default 2 3
>         hctx2: default 4 5
>         hctx3: default 6 7
>         hctx4: default 8 9
>         hctx5: default 10 11
>         hctx6: default 12 13
>         hctx7: default 14 15
>
> In case of hyperthreading CPUs, the resulting map stays the same.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
>  block/blk-mq-cpumap.c | 177 +++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 158 insertions(+), 19 deletions(-)
>
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 8244ecf87835..3b4fa3b291c9 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -22,7 +22,18 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,
>  {
>  	unsigned int num;
>
> -	num = cpumask_weight(mask);
> +	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> +		const struct cpumask *hk_mask;
> +		struct cpumask avail_mask;
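
This may overflow the kernel stack: struct cpumask holds NR_CPUS bits, so
an on-stack instance grows with the configured CPU count.

Thanks,
Ming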
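
To put a number on the stack concern, here is a minimal userspace sketch
(not kernel code). It computes how many bytes a one-bit-per-CPU bitmap
needs and shows a heap-backed mask as the alternative to an on-stack one.
All names (mask_bytes, struct heap_mask, heap_mask_*) are hypothetical
and exist only for this illustration; the 8192 figure is just an assumed
large NR_CPUS value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names are kernel APIs. */

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap with one bit per CPU, rounded up to whole words. */
static size_t mask_bytes(size_t nr_cpus)
{
	size_t words = (nr_cpus + BITS_PER_WORD - 1) / BITS_PER_WORD;

	return words * sizeof(unsigned long);
}

/* Heap-backed mask: only a pointer and a count live in the caller's frame. */
struct heap_mask {
	size_t nr_cpus;
	unsigned long *bits;
};

/* Returns 0 on success, -1 on allocation failure; callers must check. */
static int heap_mask_alloc(struct heap_mask *m, size_t nr_cpus)
{
	assert(m != NULL && nr_cpus > 0);
	m->bits = calloc(1, mask_bytes(nr_cpus));
	if (!m->bits)
		return -1;
	m->nr_cpus = nr_cpus;
	return 0;
}

/* Mark one CPU as a member of the mask. */
static void heap_mask_set(struct heap_mask *m, size_t cpu)
{
	assert(cpu < m->nr_cpus);
	m->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

/* Returns 1 if the CPU is in the mask, 0 otherwise. */
static int heap_mask_test(const struct heap_mask *m, size_t cpu)
{
	assert(cpu < m->nr_cpus);
	return (m->bits[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL;
}

/* Release the bitmap and clear the pointer to guard against reuse. */
static void heap_mask_free(struct heap_mask *m)
{
	free(m->bits);
	m->bits = NULL;
}
```

With an assumed 8192 CPUs, mask_bytes() gives 1024 bytes for a single
mask, which is a large share of one stack frame. In the kernel the usual
equivalent of the heap-backed variant is cpumask_var_t together with
alloc_cpumask_var() and free_cpumask_var(), which only allocates when
CONFIG_CPUMASK_OFFSTACK is enabled; the allocation can fail, so the
caller needs an error path.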

