Currently we set the batch count to max(32, 2*nr_online_cpus), which these days is kind of silly, as systems have gotten much bigger since 2009 when this heuristic was introduced.
Raise the floor from 32 to 256 instead. This noticeably improves certain
io_uring workloads, as io_uring tracks the per-task inflight count using
percpu counters.

Signed-off-by: Jens Axboe <[email protected]>

---

diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 00f666d94486..c3a9af5462ba 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -188,7 +188,7 @@ static int compute_batch_value(unsigned int cpu)
 {
 	int nr = num_online_cpus();
 
-	percpu_counter_batch = max(32, nr*2);
+	percpu_counter_batch = max(256, nr*2);
 	return 0;
 }
 
-- 
Jens Axboe

