On 26/05/20 17:10, Peter Zijlstra wrote:
> The recent commit: 90b5363acd47 ("sched: Clean up scheduler_ipi()")
> got smp_call_function_single_async() subtly wrong. Even though it will
> return -EBUSY when trying to re-use a csd, that condition is not
> atomic and still requires external serialization.
>
> The change in kick_ilb() got this wrong.
>
> While on first reading kick_ilb() has an atomic test-and-set that
> appears to serialize the use, the matching 'release' is not in the
> right place to actually guarantee this serialization.
>I got confused the first time I read through that, and had the same thoughts as Vincent; reading Frederic's reply and the thread helped. Could we mention the tick softirq vs ilb kick race thing in the changelog? Otherwise the change looks okay, I couldn't convince myself there were more gaps left to fill. Reviewed-by: Valentin Schneider <[email protected]> > Rework the nohz_idle_balance() trigger so that the release is in the > IPI callback and thus guarantees the required serialization for the > CSD. > > Fixes: 90b5363acd47 ("sched: Clean up scheduler_ipi()") > Reported-by: Qian Cai <[email protected]> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

