> > with kernel functions:
> > var1 = 65536*nr_cpus / (45426 * ilog2(nr_cpus) + 65536)
> > var2 = DIV_ROUND_UP(65536*nr_cpus, 45426 * ilog2(nr_cpus) + 65536)
> > var3 = roundup_pow_of_two(var2)

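For anyone who wants to check the quoted formulas without building a kernel, here is a rough userspace sketch in Python. The helpers below only imitate ilog2(), DIV_ROUND_UP() and roundup_pow_of_two() for positive integers, and the thresholds() wrapper is a hypothetical name used for illustration; none of this is the actual patch code. Note that 45426/65536 is approximately ln(2), so the denominator is a fixed-point form of ln(nr_cpus) + 1 with the logarithm floored to a power of two.

```python
def ilog2(n: int) -> int:
    """Floor of log2(n) for n >= 1 (userspace stand-in for the kernel's ilog2())."""
    return n.bit_length() - 1


def div_round_up(n: int, d: int) -> int:
    """Integer division rounding up (stand-in for DIV_ROUND_UP())."""
    return (n + d - 1) // d


def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two >= n, for n >= 1 (stand-in for roundup_pow_of_two())."""
    return 1 << (n - 1).bit_length()


def thresholds(nr_cpus: int) -> tuple[int, int, int]:
    """Evaluate var1, var2, var3 from the quoted formulas (hypothetical wrapper)."""
    num = 65536 * nr_cpus
    den = 45426 * ilog2(nr_cpus) + 65536  # 16.16 fixed point, 45426/65536 ~= ln(2)
    var1 = num // den                     # truncating division
    var2 = div_round_up(num, den)         # rounded up
    var3 = roundup_pow_of_two(var2)       # rounded up to a power of two
    return var1, var2, var3


if __name__ == "__main__":
    for cpus in (1, 2, 1024):
        print(cpus, thresholds(cpus))
    # 1 (1, 1, 1)
    # 2 (1, 2, 2)
    # 1024 (129, 130, 256)
```

With nr_cpus = 1024, var3 comes out as 256, whereas a plain square root of 1024 gives 32.
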
Consider a 1024-CPU machine with a cpuset-constrained cgroup using only 2 CPUs. Its unavoidable batching error is just 2MB, yet the global threshold imposes 256MB (harmonic-mean) or 32MB (sqrt) of additional error: 128x and 16x overprovisioning, respectively. Both overshoot, but sqrt stays a bit closer to the ideal.

-- 
Regards,
Li Wang

