On 9/24/20 3:18 PM, Vincent Guittot wrote:
> On Thu, 24 Sep 2020 at 08:48, Xunlei Pang <[email protected]> wrote:
>>
>> We've met problems that occasionally tasks with full cpumask
>> (e.g. by putting it into a cpuset or setting to full affinity)
>> were migrated to our isolated cpus in production environment.
>>
>> After some analysis, we found that it is due to the current
>> select_idle_smt() not considering the sched_domain mask.
>>
>> Steps to reproduce on my 31-CPU hyperthreads machine:
>> 1. with boot parameter: "isolcpus=domain,2-31"
>>    (thread lists: 0,16 and 1,17)
>> 2. cgcreate -g cpu:test; cgexec -g cpu:test "test_threads"
>> 3. some threads will be migrated to the isolated cpu16~17.
>>
>> Fix it by checking the valid domain mask in select_idle_smt().
>>
>> Fixes: 10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings())
>> Reported-by: Wetp Zhang <[email protected]>
>> Reviewed-by: Jiang Biao <[email protected]>
>> Signed-off-by: Xunlei Pang <[email protected]>
>
> Reviewed-by: Vincent Guittot <[email protected]>
>

Thanks, Vincent :-)

