On 6/30/26 10:01 AM, Juri Lelli wrote:
Hi Waiman,

On 29/06/26 23:33, Waiman Long wrote:
The nr_deadline_tasks variable in the cpuset structure was introduced by
commit 6c24849f5515 ("sched/cpuset: Keep track of SCHED_DEADLINE task
in cpusets"). It is reported by sashiko [1] that nr_deadline_tasks
can currently be modified by inc_dl_tasks_cs() under rq->lock and
by cpuset_attach() under cpuset_mutex. So if both updates happen
simultaneously, the nr_deadline_tasks variable can be corrupted leading
to incorrect operations down the road.

Fix that by changing its type to atomic_t so that nr_deadline_tasks is
always updated atomically.
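
For illustration only, here is a minimal user-space sketch of the situation described above, not the kernel patch itself. Two writers hold different locks, so the locks do not serialize them against each other; the counter stays consistent only because the update is atomic. The names struct cs_model, lock_a, lock_b, inc_path(), attach_path() and run_model() are hypothetical, and C11 atomic_int stands in for the kernel's atomic_t.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERS 100000

/* Hypothetical stand-in for struct cpuset; only the counter is modeled. */
struct cs_model {
    atomic_int nr_deadline_tasks;
};

static struct cs_model cs;

/* Two unrelated locks: lock_a plays the role of rq->lock and lock_b the
 * role of cpuset_mutex. Holding one does not exclude a holder of the other. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

/* Models the inc_dl_tasks_cs() side: updates the counter under lock_a. */
static void *inc_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock_a);
        atomic_fetch_add(&cs.nr_deadline_tasks, 1);
        pthread_mutex_unlock(&lock_a);
    }
    return NULL;
}

/* Models the cpuset_attach() side: updates the counter under lock_b. */
static void *attach_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock_b);
        atomic_fetch_add(&cs.nr_deadline_tasks, 1);
        pthread_mutex_unlock(&lock_b);
    }
    return NULL;
}

/* Runs both writers concurrently and returns the final counter value. */
int run_model(void)
{
    pthread_t a, b;
    int ret;

    atomic_store(&cs.nr_deadline_tasks, 0);
    ret = pthread_create(&a, NULL, inc_path, NULL);
    assert(ret == 0);
    ret = pthread_create(&b, NULL, attach_path, NULL);
    assert(ret == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&cs.nr_deadline_tasks);
}
```

With a plain int and a non-atomic `cs.nr_deadline_tasks++`, the same program can lose updates, because each increment is a separate load, add and store that the other thread may interleave with.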

[1] https://sashiko.dev/#/patchset/20260626181923.133658-1-longman%40redhat.com

Fixes: 6c24849f5515 ("sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets")
Signed-off-by: Waiman Long <longman@redhat.com>
---
Looks like Sashiko is not yet completely happy with this:

https://sashiko.dev/#/patchset/20260630033344.352702-1-longman%40redhat.com

I actually wondered the same and couldn't convince myself that we don't
have that problem with the window between sched_setscheduler() and
cpuset_attach(). If the issue is confirmed, I'm not sure whether
wait_attach_done_lock() could help here as well. It's kind of a big lock
for the scheduler, but maybe it would only affect DEADLINE tasks, and
only while migrations are ongoing.

Yes, I am aware of that. This patch can only partially close the race window. It doesn't completely eliminate it.

My current thought is for inc_dl_tasks_cs() to check if the in_progress flag is set. If so, it sets another flag for cpuset_attach() to double check the DL data for consistency. Eliminating the race window this way will be rather complicated, so I am postponing it until I have more free time to think about it.
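
For illustration only, the handshake described above might look roughly like the following user-space sketch. Everything here is an assumption: struct cs_flags, the dl_recheck flag and all function names are hypothetical, and C11 atomics stand in for kernel primitives. It shows only the shape of the idea. In particular, an increment that observes in_progress just before the attach side clears it can still set dl_recheck too late to be seen, so the ordering at the end of the attach would still need to be worked out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the relevant cpuset state. */
struct cs_flags {
    atomic_bool in_progress;        /* set while an attach is running */
    atomic_bool dl_recheck;         /* hypothetical: inc path saw an attach */
    atomic_int  nr_deadline_tasks;
};

void cs_flags_init(struct cs_flags *cs)
{
    atomic_init(&cs->in_progress, false);
    atomic_init(&cs->dl_recheck, false);
    atomic_init(&cs->nr_deadline_tasks, 0);
}

/* Attach side: clear any stale request, then mark the attach as running. */
void attach_begin(struct cs_flags *cs)
{
    atomic_store(&cs->dl_recheck, false);
    atomic_store(&cs->in_progress, true);
}

/* Scheduler side: bump the counter, and if an attach is running,
 * ask the attach side to re-validate its DL accounting. */
void inc_dl_tasks_model(struct cs_flags *cs)
{
    atomic_fetch_add(&cs->nr_deadline_tasks, 1);
    if (atomic_load(&cs->in_progress))
        atomic_store(&cs->dl_recheck, true);
}

/* Attach side: returns true if the DL data must be double checked
 * because an increment was observed while the attach was in progress. */
bool attach_end(struct cs_flags *cs)
{
    atomic_store(&cs->in_progress, false);
    return atomic_exchange(&cs->dl_recheck, false);
}
```

In this model, an increment between attach_begin() and attach_end() makes attach_end() return true, while an increment outside that window leaves it false.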

Cheers,
Longman

