When allocating space for load_balance_mask, in sched_init, when CPUMASK_OFFSTACK is set, we've managed to spill over KMALLOC_MAX_SIZE on our 6144 core machine. The patch below breaks up the allocations so that they don't overflow the max alloc size. It also allocates the masks on the the node from which they'll most commonly be accessed, to minimize remote accesses on NUMA machines.
Any input is appreciated! - Alex Signed-off-by: Alex Thorlton <[email protected]> Suggested-by: George Beshers <[email protected]> Cc: George Beshers <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] --- kernel/sched/core.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4499950..8f06655 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6981,9 +6981,6 @@ void __init sched_init(void) #ifdef CONFIG_RT_GROUP_SCHED alloc_size += 2 * nr_cpu_ids * sizeof(void **); #endif -#ifdef CONFIG_CPUMASK_OFFSTACK - alloc_size += num_possible_cpus() * cpumask_size(); -#endif if (alloc_size) { ptr = (unsigned long)kzalloc(alloc_size, GFP_NOWAIT); @@ -7003,12 +7000,10 @@ void __init sched_init(void) ptr += nr_cpu_ids * sizeof(void **); #endif /* CONFIG_RT_GROUP_SCHED */ -#ifdef CONFIG_CPUMASK_OFFSTACK - for_each_possible_cpu(i) { - per_cpu(load_balance_mask, i) = (void *)ptr; - ptr += cpumask_size(); - } -#endif /* CONFIG_CPUMASK_OFFSTACK */ + } + for_each_possible_cpu(i) { + per_cpu(load_balance_mask, i) = kzalloc_node( + cpumask_size(), GFP_KERNEL, cpu_to_node(i)); } init_rt_bandwidth(&def_rt_bandwidth, -- 1.7.12.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

