On 4/1/26 6:23 PM, Aaron Tomlin wrote:
> From: Daniel Wagner <[email protected]>
>
> Extend the capabilities of the generic CPU to hardware queue (hctx)
> mapping code, so it maps housekeeping CPUs and isolated CPUs to the
> hardware queues evenly.
>
> A hctx is only operational when there is at least one online
> housekeeping CPU assigned (aka active_hctx). Thus, check in the final
> mapping that there is no hctx which has only offline housekeeping CPUs
> and online isolated CPUs.
>
> Example mapping result:
>
>   16 online CPUs
>
>   isolcpus=io_queue,2-3,6-7,12-13
>
> Queue mapping:
>         hctx0: default 0 2
>         hctx1: default 1 3
>         hctx2: default 4 6
>         hctx3: default 5 7
>         hctx4: default 8 12
>         hctx5: default 9 13
>         hctx6: default 10
>         hctx7: default 11
>         hctx8: default 14
>         hctx9: default 15
>
> IRQ mapping:
>         irq 42 affinity 0 effective 0 nvme0q0
>         irq 43 affinity 0 effective 0 nvme0q1
>         irq 44 affinity 1 effective 1 nvme0q2
>         irq 45 affinity 4 effective 4 nvme0q3
>         irq 46 affinity 5 effective 5 nvme0q4
>         irq 47 affinity 8 effective 8 nvme0q5
>         irq 48 affinity 9 effective 9 nvme0q6
>         irq 49 affinity 10 effective 10 nvme0q7
>         irq 50 affinity 11 effective 11 nvme0q8
>         irq 51 affinity 14 effective 14 nvme0q9
>         irq 52 affinity 15 effective 15 nvme0q10
>
> A corner case is when the number of online CPUs and present CPUs
> differ and the driver asks for fewer queues than online CPUs, e.g.
>
>   8 online CPUs, 16 possible CPUs
>
>   isolcpus=io_queue,2-3,6-7,12-13
>   virtio_blk.num_request_queues=2
>
> Queue mapping:
>         hctx0: default 0 1 2 3 4 5 6 7 8 12 13
>         hctx1: default 9 10 11 14 15
>
> IRQ mapping:
>         irq 27 affinity 0 effective 0 virtio0-config
>         irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0
>         irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1
>
> Noteworthy is that for the normal/default configuration (!isolcpus)
> the mapping will change for systems which have non-hyperthreading
> CPUs. The main assignment loop relies completely on
> group_mask_cpus_evenly to do the right thing. The old code would
> distribute the CPUs linearly over the hardware contexts:
>
>   queue mapping for /dev/nvme0n1
>         hctx0: default 0 8
>         hctx1: default 1 9
>         hctx2: default 2 10
>         hctx3: default 3 11
>         hctx4: default 4 12
>         hctx5: default 5 13
>         hctx6: default 6 14
>         hctx7: default 7 15
>
> The new code assigns each hardware context the map generated by the
> group_mask_cpus_evenly function:
>
>   queue mapping for /dev/nvme0n1
>         hctx0: default 0 1
>         hctx1: default 2 3
>         hctx2: default 4 5
>         hctx3: default 6 7
>         hctx4: default 8 9
>         hctx5: default 10 11
>         hctx6: default 12 13
>         hctx7: default 14 15
>
> In case of hyperthreading CPUs, the resulting map stays the same.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> [atomlin: Fixed absolute vs. relative hardware queue index mix-up in
>  blk_mq_map_queues and validation checks; fixed typographical errors.]
> Signed-off-by: Aaron Tomlin <[email protected]>
> ---
>  block/blk-mq-cpumap.c | 175 +++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 157 insertions(+), 18 deletions(-)
>
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 8244ecf87835..8d09af49a142 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -22,7 +22,18 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,
>  {
>  	unsigned int num;
>  
> -	num = cpumask_weight(mask);
> +	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> +		const struct cpumask *hk_mask;
> +		struct cpumask avail_mask;
> +
> +		hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
> +		cpumask_and(&avail_mask, mask, hk_mask);
> +
> +		num = cpumask_weight(&avail_mask);
As Ming Lei said before, struct cpumask can be rather big in size if
NR_CPUS is large, so it is better not to put one on the stack. I would
suggest using cpumask_weight_and() instead, which will eliminate the
need for the local variables.
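
For illustration only, a minimal (untested) sketch of that change,
assuming the same HK_TYPE_IO_QUEUE branch in blk_mq_num_queues() as in
the patch above:

	if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
		/*
		 * cpumask_weight_and() computes the weight of the
		 * intersection of the two masks directly, so neither
		 * the on-stack avail_mask nor the separate
		 * cpumask_and() step is needed.
		 */
		num = cpumask_weight_and(mask,
				housekeeping_cpumask(HK_TYPE_IO_QUEUE));
	else
		num = cpumask_weight(mask);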
Cheers,
Longman

