On 6/30/26 10:51 PM, Ridong Chen wrote:
On 6/30/2026 11:33 AM, Waiman Long wrote:
The only case where the cgroup_taskset structure requires task migration
to multiple cpusets is when enabling a cpuset controller in cgroup v2
where the newly created child cpusets inherit the same effective CPUs
and memory nodes from the parent. In that case, task migration can happen
directly with no update to the tasks' CPU and memory node assignments and
no further work needed from the cpuset side except updating
nr_deadline_tasks when DL tasks are involved and setting old_mems_allowed
in the child cpusets.
Do that by tracking all the destination cpusets with a new dst_cs_head
singly linked list. The reset_migrate_dl_data() function is integrated
into clear_attach_data() so that it can be used for both source and
destination cpusets.
It is assumed that a given cpuset cannot be both a source and a
destination cpuset. If such a condition happens, or when there are
multiple destination cpusets with CPU or memory node changes, the current
code will not handle it correctly. So it will print a warning and fail
the attach operation in these unexpected cases, as we will have to
enhance the code to support this if such use cases are valid and not
coding errors.
Signed-off-by: Waiman Long <[email protected]>
---
kernel/cgroup/cpuset-internal.h | 1 +
kernel/cgroup/cpuset.c | 115 ++++++++++++++++++++------------
2 files changed, 72 insertions(+), 44 deletions(-)
diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index e7d010661fd3..d1161b0a3d85 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -149,6 +149,7 @@ struct cpuset {
* For linking impacted cpusets during an attach operation.
*/
struct llist_node attach_node;
+ bool attach_source;
/* partition root state */
int partition_root_state;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index b201f4ba18b6..1591d6dca66a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -366,10 +366,12 @@ static struct {
bool cpus_updated;
bool mems_updated;
bool task_work_queued;
+ bool many_dest_cs; /* Have many destination cpusets */
struct cpuset *old_cs; /* Source cpuset */
nodemask_t nodemask_to;
} attach_ctx;
static LLIST_HEAD(src_cs_head);
+static LLIST_HEAD(dst_cs_head);
This looks a lot like the 'struct list_head mg_src_preload_node' and
'struct list_head mg_dst_preload_node' in struct css_set. Is there a
better way to reuse those instead of adding a separate tracking list
here?
The cgroup_mgctx is a cgroup internal data structure which is not
exposed to individual controllers. Sharing it will have some risks if it
is accidentally modified.
Conversion of css_set iteration to cpuset iteration is a bit more
complicated as 2 or more css_sets may point to the same cpuset. So we
still have to track if a cpuset has been visited before.
It is doable, but I doubt it is worth the effort.
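To illustrate the visited-tracking point, here is a minimal userspace
sketch, not kernel code. The types sketch_cpuset and sketch_css_set, the
on_list flag, and both helper functions are hypothetical stand-ins made up
for this example; the real struct cpuset and struct css_set look nothing
like this. It only shows why iterating css_sets still needs a per-cpuset
marker when two css_sets can map to the same cpuset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct cpuset / struct css_set. */
struct sketch_cpuset {
	const char *name;
	bool on_list;               /* already linked on the destination list? */
	struct sketch_cpuset *next; /* singly linked, in the spirit of llist_node */
};

struct sketch_css_set {
	struct sketch_cpuset *cs;   /* several css_sets may share one cpuset */
};

/*
 * Walk the css_sets and push each distinct cpuset onto *head exactly once.
 * Returns the number of distinct cpusets that were linked.
 */
static size_t collect_dst_cpusets(struct sketch_css_set *sets, size_t n,
				  struct sketch_cpuset **head)
{
	size_t distinct = 0;

	assert(head != NULL);
	for (size_t i = 0; i < n; i++) {
		struct sketch_cpuset *cs = sets[i].cs;

		if (cs->on_list)    /* already visited through another css_set */
			continue;
		cs->on_list = true;
		cs->next = *head;
		*head = cs;
		distinct++;
	}
	return distinct;
}

/* Unlink everything and clear the visited flags so the list can be reused. */
static void clear_dst_cpusets(struct sketch_cpuset **head)
{
	assert(head != NULL);
	while (*head) {
		struct sketch_cpuset *cs = *head;

		*head = cs->next;
		cs->next = NULL;
		cs->on_list = false;
	}
}
```

With three css_sets where the first and third point to the same cpuset,
collect_dst_cpusets() links only two entries. The flag plays the role of
the "has this cpuset been visited" state that would still be needed even
if the css_set preload lists were reused.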
Cheers,
Longman