On 2026/1/9 3:04, Michal Koutný wrote:
> Hi.
> 
> On Thu, Jan 01, 2026 at 02:15:58PM -0500, Waiman Long <[email protected]> wrote:
>> Currently, when setting a cpuset's cpuset.cpus to a value that conflicts
>> with the cpuset.cpus/cpuset.cpus.exclusive of a sibling partition,
>> the sibling's partition state becomes invalid. This is overly harsh and
>> is probably not necessary.
>>
>> The cpuset.cpus.exclusive control file, if set, will override the
>> cpuset.cpus of the same cpuset when creating a cpuset partition.
>> So cpuset.cpus has less priority than cpuset.cpus.exclusive in setting up
>> a partition.  However, it cannot override a conflicting cpuset.cpus file
>> in a sibling cpuset and the partition creation process will fail. This
>> is inconsistent.  That will also make using cpuset.cpus.exclusive less
>> valuable as a tool to set up cpuset partitions as the users have to
>> check if such a cpuset.cpus conflict exists or not.
>>
>> Fix these problems by strictly adhering to the setting of the
>> following control files in descending order of priority when setting
>> up a partition.
>>
>>  1. cpuset.cpus.exclusive.effective of a valid partition
>>  2. cpuset.cpus.exclusive
>>  3. cpuset.cpus
> 
> 
>>
>> So once a cpuset.cpus.exclusive is set without failure, it will
>> always be allowed to form a valid partition as long as at least one
>> CPU can be granted from its parent irrespective of the state of the
>> siblings' cpuset.cpus values. Of course, setting cpuset.cpus.exclusive
>> will fail if it conflicts with the cpuset.cpus.exclusive or the
>> cpuset.cpus.exclusive.effective value of a sibling.
> 
> Concept question: 
> When a/b/cpuset.cpus.exclusive ⊂ a/b/cpuset.cpus (proper subset)
> and a/b/cpuset.cpus.partition == root, a/cpuset.cpus.partition == root
> (b is valid partition)
> should a/b/cpuset.cpus.exclusive.effective be equal to cpuset.cpus (as
> all of them happen to be exclusive) or "only" cpuset.cpus.exclusive?
> 

The value of cpuset.cpus does not affect cpuset.cpus.exclusive.effective
when cpuset.cpus.exclusive is set.

So the answer is: only cpuset.cpus.exclusive.

If cpuset.cpus could never be used for exclusive CPU allocation in a
partition, the cpuset.cpus.exclusive and cpuset.cpus.partition settings
would be easier to understand: a cpuset could become a partition only
when cpuset.cpus.exclusive is set, regardless of cpuset.cpus. However,
for historical and compatibility reasons, cpuset.cpus is treated as the
set of exclusive CPUs when cpuset.cpus.exclusive is not set.
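
A rough way to picture this fallback is the toy Python model below. It
is only a sketch of the rule as described in this thread, not the kernel
code: the function names and the parent_avail/sibling_excl parameters
are made up for illustration, and it ignores partition state, hotplug
and the write-time checks on cpuset.cpus.exclusive.

```python
def parse_cpulist(text):
    """Parse a cpulist string such as "0-2,5" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def exclusive_effective(cpus, exclusive, parent_avail, sibling_excl):
    """Toy model of the exclusive CPUs a new partition ends up with.

    cpus          -- contents of cpuset.cpus
    exclusive     -- contents of cpuset.cpus.exclusive ("" if unset)
    parent_avail  -- set of CPUs the parent is able to grant (assumed input)
    sibling_excl  -- set of CPUs siblings already hold exclusively (assumed)
    """
    if exclusive:
        # cpuset.cpus.exclusive is set, so cpuset.cpus plays no role here.
        # A conflict with a sibling would already have been rejected when
        # cpuset.cpus.exclusive was written, so nothing is stripped.
        wanted = parse_cpulist(exclusive)
    else:
        # Compatibility fallback: cpuset.cpus stands in for the exclusive
        # CPUs, minus whatever siblings already claimed.
        wanted = parse_cpulist(cpus) - sibling_excl
    # Only CPUs the parent can grant survive; an empty result means the
    # partition cannot be valid.
    return wanted & parent_avail
```

With cpuset.cpus = "0-3" and cpuset.cpus.exclusive = "0-1" the model
returns {0, 1}. With only cpuset.cpus set, "0-2" for A1 followed by
"2-4" for B1 gives B1 the CPUs {3, 4}, which matches the A1/B1 example
in the quoted commit message.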

>> Partition can still be created by setting only cpuset.cpus without
>> setting cpuset.cpus.exclusive. However, any conflicting CPUs in sibling's
>> cpuset.cpus.exclusive.effective and cpuset.cpus.exclusive values will
>> be removed from its cpuset.cpus.exclusive.effective as long as there
>> is still one or more CPUs left and can be granted from its parent. This
>> CPU stripping is currently done in rm_siblings_excl_cpus().
>>
>> The new code will now try its best to enable the creation of new
>> partitions with only cpuset.cpus set without invalidating existing ones.
> 
> OK. (After I re-learnt benefits of remote partitions or more precisely
> cpuset.cpus.effective.)
> 
>> However it is not guaranteed that all the CPUs requested in cpuset.cpus
>> will be used in the new partition even when all these CPUs can be
>> granted from the parent.
>>
>> This is similar to the fact that cpuset.cpus.effective may not be
>> able to include all the CPUs requested in cpuset.cpus. In this case,
>> the parent may not be able to grant all the exclusive CPUs requested in
>> cpuset.cpus to cpuset.cpus.exclusive.effective if some of them have
>> already been granted to other partitions earlier.
>>
>> With the creation of multiple sibling partitions by setting
>> only cpuset.cpus, this does have the side effect that their exact
>> cpuset.cpus.exclusive.effective settings will depend on the order of
>> partition creation if there are conflicts. Due to the exclusive nature
>> of the CPUs in a partition, it is not easy to make it fair other than
>> the old behavior of invalidating all the conflicting partitions.
>>
>> For example,
>>   # echo "0-2" > A1/cpuset.cpus
>>   # echo "root" > A1/cpuset.cpus.partition
>>   # cat A1/cpuset.cpus.partition
>>   root
>>   # cat A1/cpuset.cpus.exclusive.effective
>>   0-2
>>   # echo "2-4" > B1/cpuset.cpus
>>   # echo "root" > B1/cpuset.cpus.partition
>>   # cat B1/cpuset.cpus.partition
>>   root
>>   # cat B1/cpuset.cpus.exclusive.effective
>>   3-4
>>   # cat B1/cpuset.cpus.effective
>>   3-4
>>
>> For users who want to be sure that they can get most of the CPUs they
>> want,
> 
> Slightly OT but I'd say that users want:
> a) confinement (some cpuset.cpus in leaves)
> b) isolation (cpuset.cpus.exclusive in leaves)
> c) hierarchical organization
>   - confinement generalizes OK
>   - children can only claim what parent allowed
> 
> Conflicting exclusivity configs should be no user's intention or want :-p
> 
> 
>> cpuset.cpus.exclusive should be used instead if they can set
>> it successfully without failure. Setting cpuset.cpus.exclusive will
>> guarantee that sibling conflicts from then onward are no longer possible.
> 
> I think the background idea of the paragraph (shift away from local to
> remote partitions, also mentioned the other day) could be somehow fitted
> into the Documentation/ hunks.
> 
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> ...
>> @@ -2632,6 +2641,9 @@ Cpuset Interface Files
>>  
>>      The root cgroup is always a partition root and its state cannot
>>      be changed.  All other non-root cgroups start out as "member".
>> +    Even though the "cpuset.cpus.exclusive*" control files are not
>> +    present in the root cgroup, they are implicitly the same as
>> +    "cpuset.cpus".
> 
> Even "cpuset.cpus" has CFTYPE_NOT_ON_ROOT, so this formulation might be
> confusing. Maybe it's the same as "cpuset.cpus.effective"?
> 
> Thanks,
> Michal

-- 
Best regards,
Ridong

