On Thu, Dec 03, 2020 at 05:14:31PM +0000, Alexey Klimov wrote:
> When a CPU offlined and onlined via device_offline() and device_online()
> the userspace gets uevent notification. If, after receiving uevent,
> userspace executes sched_setaffinity() on some task trying to move it
> to a recently onlined CPU, then it will fail with -EINVAL. Userspace needs
> to wait around 5..30 ms before sched_setaffinity() will succeed for
> recently onlined CPU after receiving uevent.

Right.

>  Unfortunately, the execution time of
> echo 1 > /sys/devices/system/cpu/cpuX/online roughly doubled with this
> change (on my test machine).

Nobody cares, it's hotplug, it's supposed to be slow :-) That is,
we fundamentally shift the work _to_ the hotplug path, so as to keep
everybody else fast.

> The nature of this bug is also described here (with different consequences):
> https://lore.kernel.org/lkml/[email protected]/

Yeah, pesky deadlocks.. someone was going to try again.


>  kernel/cpu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 6ff2578ecf17..f39a27a7f24b 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -15,6 +15,7 @@
>  #include <linux/sched/smt.h>
>  #include <linux/unistd.h>
>  #include <linux/cpu.h>
> +#include <linux/cpuset.h>
>  #include <linux/oom.h>
>  #include <linux/rcupdate.h>
>  #include <linux/export.h>
> @@ -1275,6 +1276,8 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state 
> target)
>       }
>  
>       err = _cpu_up(cpu, 0, target);
> +     if (!err)
> +             cpuset_wait_for_hotplug();
>  out:
>       cpu_maps_update_done();
>       return err;

My only consideration is if doing that flush under cpu_add_remove_lock()
is wise.

Reply via email to