At present, the managed interrupt spreading algorithm distributes vectors
across all available CPUs within a given node or system. On systems
employing CPU isolation (e.g., "isolcpus=io_queue"), this behaviour
defeats the primary purpose of isolation by routing hardware interrupts
(such as NVMe completion queues) directly to isolated cores.

Update irq_create_affinity_masks() to respect the housekeeping CPU mask.
Introduce irq_spread_hk_filter() to intersect each computed affinity
mask with the HK_TYPE_IO_QUEUE mask, thereby keeping managed interrupts
off isolated CPUs.

To ensure strict isolation whilst guaranteeing a valid routing destination:
1. Fallback mechanism: Should the initial spreading logic assign a
vector exclusively to isolated CPUs (resulting in an empty
intersection), the filter safely falls back to the system's
online housekeeping CPUs.
2. Hotplug safety: The fallback utilises data_race(cpu_online_mask)
instead of allocating a local cpumask snapshot. This circumvents
CONFIG_CPUMASK_OFFSTACK stack bloat hazards on high-core-count
systems. Furthermore, it prevents deadlocks with concurrent CPU
hotplug operations (e.g., during storage driver error recovery)
by eliminating the need to hold the CPU hotplug read lock.
3. Fast-path optimisation: The filtering logic runs only when
HK_TYPE_IO_QUEUE housekeeping is enabled, thereby ensuring zero
overhead for standard configurations.

Signed-off-by: Aaron Tomlin <[email protected]>
---
kernel/irq/affinity.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 076a5ef1e306..dd9e7f5fbdec 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -8,6 +8,24 @@
#include <linux/slab.h>
#include <linux/cpu.h>
#include <linux/group_cpus.h>
+#include <linux/sched/isolation.h>
+
+/**
+ * irq_spread_hk_filter - Restrict an interrupt affinity mask to housekeeping CPUs
+ * @mask: The interrupt affinity mask to filter (in/out)
+ * @hk_mask: The system's housekeeping CPU mask
+ *
+ * Intersects @mask with @hk_mask to keep interrupts off isolated CPUs.
+ * If this intersection is empty (meaning all targeted CPUs were isolated),
+ * it falls back to the online housekeeping CPUs to guarantee a valid
+ * routing destination.
+ */
+static void irq_spread_hk_filter(struct cpumask *mask,
+ const struct cpumask *hk_mask)
+{
+ if (!cpumask_and(mask, mask, hk_mask))
+ cpumask_and(mask, hk_mask, data_race(cpu_online_mask));
+}

static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
{
@@ -27,6 +45,8 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
{
unsigned int affvecs, curvec, usedvecs, i;
struct irq_affinity_desc *masks = NULL;
+ const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
+ bool hk_enabled = housekeeping_enabled(HK_TYPE_IO_QUEUE);

/*
* Determine the number of vectors which need interrupt affinities
@@ -83,8 +103,12 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
return NULL;
}
- for (int j = 0; j < nr_masks; j++)
+ for (int j = 0; j < nr_masks; j++) {
cpumask_copy(&masks[curvec + j].mask, &result[j]);
+ if (hk_enabled)
+ irq_spread_hk_filter(&masks[curvec + j].mask,
+ hk_mask);
+ }
kfree(result);
curvec += nr_masks;
--
2.51.0