Re: [PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-23 Thread Waiman Long
On 04/20/2018 04:23 AM, Mike Galbraith wrote:
> On Thu, 2018-04-19 at 09:46 -0400, Waiman Long wrote:
>> v7:
>>  - Add a root-only cpuset.cpus.isolated control file for CPU isolation.
>>  - Enforce that load_balancing can only be turned off on cpusets with
>>    CPUs from the isolated list.
>>  - Update sched domain generation to allow cpusets with CPUs only
>>    from the isolated CPU list to be in separate root domains.
> I haven't done much, but was able to do a q/d manual build, populate
> and teardown of system/critical sets on my desktop box, and it looked
> ok.  Thanks for getting this aboard the v2 boat.
>
>   -Mike

Thanks for the testing.

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-23 Thread Waiman Long
On 04/23/2018 09:57 AM, Juri Lelli wrote:
> On 23/04/18 15:07, Juri Lelli wrote:
>> Hi Waiman,
>>
>> On 19/04/18 09:46, Waiman Long wrote:
>>> v7:
>>>  - Add a root-only cpuset.cpus.isolated control file for CPU isolation.
>>>  - Enforce that load_balancing can only be turned off on cpusets with
>>>    CPUs from the isolated list.
>>>  - Update sched domain generation to allow cpusets with CPUs only
>>>    from the isolated CPU list to be in separate root domains.
> Guess I'll be adding comments as soon as I stumble on something unclear
> (to me :), hope that's OK (shout if I should do it differently).
>
> The below looked unexpected to me:
>
> root@debian-kvm:/sys/fs/cgroup# cat g1/cpuset.cpus
> 2-3
> root@debian-kvm:/sys/fs/cgroup# cat g1/cpuset.mems
>
> root@debian-kvm:~# echo $$ > /sys/fs/cgroup/g1/cgroup.threads
> root@debian-kvm:/sys/fs/cgroup# cat g1/cgroup.threads
> 2312
>
> So I can add tasks to groups with no mems? Or is this only true in my
> case with a single mem node? Or maybe it's inherited from root group
> (slightly confusing IMHO if that's the case).

No mems means the lookup walks up the parents until it finds one with
non-empty mems. The mems.effective file will show you the actual memory
nodes used.
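
For illustration only, the lookup described above can be imitated from user
space with a small shell helper. This is a sketch, not the kernel's code:
the function name nearest_mems and its two arguments are hypothetical, and
it assumes each cgroup directory may carry a cpuset.mems file. On a real
system, reading cpuset.mems.effective is the direct way to get the answer.

```sh
#!/bin/sh
# Illustration only: a user-space imitation of the parent walk, not the
# kernel implementation. nearest_mems is a hypothetical helper name.
nearest_mems() {
    dir=$1      # cgroup directory to start from
    root=$2     # top of the hierarchy; the walk stops here
    while :; do
        # Command substitution strips the trailing newline, so an empty
        # (or missing) cpuset.mems yields an empty string.
        mems=$(cat "$dir/cpuset.mems" 2>/dev/null)
        if [ -n "$mems" ]; then
            echo "$mems"
            return 0
        fi
        # Give up once the root (or the filesystem top) has been checked.
        case $dir in
            "$root"|/|.) return 1 ;;
        esac
        dir=$(dirname "$dir")
    done
}
```

For example, "nearest_mems /sys/fs/cgroup/g1 /sys/fs/cgroup" would print the
first non-empty mems list found between g1 and the root, if there is one.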

-Longman



Re: [PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-23 Thread Juri Lelli
On 23/04/18 15:07, Juri Lelli wrote:
> Hi Waiman,
> 
> On 19/04/18 09:46, Waiman Long wrote:
> > v7:
> >  - Add a root-only cpuset.cpus.isolated control file for CPU isolation.
> >  - Enforce that load_balancing can only be turned off on cpusets with
> >    CPUs from the isolated list.
> >  - Update sched domain generation to allow cpusets with CPUs only
> >    from the isolated CPU list to be in separate root domains.
> 

Guess I'll be adding comments as soon as I stumble on something unclear
(to me :), hope that's OK (shout if I should do it differently).

The below looked unexpected to me:

root@debian-kvm:/sys/fs/cgroup# cat g1/cpuset.cpus
2-3
root@debian-kvm:/sys/fs/cgroup# cat g1/cpuset.mems

root@debian-kvm:~# echo $$ > /sys/fs/cgroup/g1/cgroup.threads
root@debian-kvm:/sys/fs/cgroup# cat g1/cgroup.threads
2312

So I can add tasks to groups with no mems? Or is this only true in my
case with a single mem node? Or maybe it's inherited from root group
(slightly confusing IMHO if that's the case).


Re: [PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-23 Thread Juri Lelli
Hi Waiman,

On 19/04/18 09:46, Waiman Long wrote:
> v7:
>  - Add a root-only cpuset.cpus.isolated control file for CPU isolation.
>  - Enforce that load_balancing can only be turned off on cpusets with
>    CPUs from the isolated list.
>  - Update sched domain generation to allow cpusets with CPUs only
>    from the isolated CPU list to be in separate root domains.

Just got this while running:

# echo 2-3 > /sys/fs/cgroup/cpuset.cpus.isolated

[ 6679.177826] =============================
[ 6679.178385] WARNING: suspicious RCU usage
[ 6679.178910] 4.16.0-rc6+ #151 Not tainted
[ 6679.179459] -----------------------------
[ 6679.180082] /home/juri/work/kernel/linux/kernel/cgroup/cgroup.c:3826 cgroup_mutex or RCU read lock required!
[ 6679.181402]
[ 6679.181402] other info that might help us debug this:
[ 6679.181402]
[ 6679.182407]
[ 6679.182407] rcu_scheduler_active = 2, debug_locks = 1
[ 6679.183278] 3 locks held by bash/2205:
[ 6679.183785]  #0:  (sb_writers#10){.+.+}, at: [<4e577fb9>] vfs_write+0x18a/0x1b0
[ 6679.184871]  #1:  (>mutex){+.+.}, at: [<5944c83f>] kernfs_fop_write+0xe2/0x1a0
[ 6679.185987]  #2:  (cpuset_mutex){+.+.}, at: [<879bfba0>] cpuset_write_resmask+0x72/0x1560
[ 6679.187112]
[ 6679.187112] stack backtrace:
[ 6679.187612] CPU: 3 PID: 2205 Comm: bash Not tainted 4.16.0-rc6+ #151
[ 6679.188318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
[ 6679.189291] Call Trace:
[ 6679.189581]  dump_stack+0x85/0xc5
[ 6679.189963]  css_next_child+0x90/0xd0
[ 6679.190385]  cpuset_write_resmask+0x46f/0x1560
[ 6679.190885]  ? lock_acquire+0x9f/0x210
[ 6679.191315]  cgroup_file_write+0x94/0x230
[ 6679.191768]  kernfs_fop_write+0x113/0x1a0
[ 6679.192223]  __vfs_write+0x36/0x180
[ 6679.192617]  ? rcu_read_lock_sched_held+0x6b/0x80
[ 6679.193139]  ? rcu_sync_lockdep_assert+0x2e/0x60
[ 6679.193654]  ? __sb_start_write+0x154/0x1f0
[ 6679.194118]  ? __sb_start_write+0x16a/0x1f0
[ 6679.194607]  vfs_write+0xc1/0x1b0
[ 6679.194984]  SyS_write+0x55/0xc0
[ 6679.195365]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 6679.195839]  do_syscall_64+0x79/0x220
[ 6679.196212]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 6679.196729] RIP: 0033:0x7f03183ff780
[ 6679.197138] RSP: 002b:7ffeae336ca8 EFLAGS: 0246 ORIG_RAX: 0001
[ 6679.197866] RAX: ffda RBX: 0004 RCX: 7f03183ff780
[ 6679.198550] RDX: 0004 RSI: 00eaf408 RDI: 0001
[ 6679.199235] RBP: 00eaf408 R08: 000a R09: 7f0318cff700
[ 6679.199928] R10:  R11: 0246 R12: 7f03186b57a0
[ 6679.200615] R13: 0004 R14: 0001 R15: 
[ 6679.201369] rebuild_sched_domains dom 0: 0-1
[ 6679.202196] span: 0-1 (max cpu_capacity = 1024)

Guess we should take either cgroup_mutex or the RCU read lock in the
write path.

Best,

- Juri


[PATCH v7 0/5] cpuset: Enable cpuset controller in default hierarchy

2018-04-19 Thread Waiman Long
v7:
 - Add a root-only cpuset.cpus.isolated control file for CPU isolation.
 - Enforce that load_balancing can only be turned off on cpusets with
   CPUs from the isolated list.
 - Update sched domain generation to allow cpusets with CPUs only
   from the isolated CPU list to be in separate root domains.

v6:
 - Hide cpuset control knobs in root cgroup.
 - Rename effective_cpus and effective_mems to cpus.effective and
   mems.effective respectively.
 - Remove cpuset.flags and add cpuset.sched_load_balance instead
   as the behavior of sched_load_balance has changed and so is
   not a simple flag.
 - Update cgroup-v2.txt accordingly.

v5:
 - Add patch 2 to provide the cpuset.flags control knob for the
   sched_load_balance flag which should be the only feature that is
   essential as a replacement of the "isolcpus" kernel boot parameter.

v4:
 - Further minimize the feature set by removing the flags control knob.

v3:
 - Further trim the additional features down to just memory_migrate.
 - Update Documentation/cgroup-v2.txt.

v6 patch: https://lkml.org/lkml/2018/3/21/530

The purpose of this patchset is to provide a basic set of cpuset
features for cgroup v2. This basic set includes the non-root "cpus",
"mems", "cpus.effective", "mems.effective" and "sched_load_balance"
control files as well as a root-only "cpus.isolated" file.

The root-only "cpus.isolated" file is added to support use cases similar
to the "isolcpus" kernel parameter. CPUs from the isolated list can be
put into child cpusets where "sched_load_balance" can be disabled to
allow finer control of task-cpu mappings of those isolated CPUs.

On the other hand, enabling "sched_load_balance" on a cpuset that
contains only CPUs from the isolated list will allow those CPUs to use
a root domain separate from that of the root cpuset.

This patchset does not exclude the possibility of adding more features
in the future after careful consideration.

Patch 1 enables cpuset in cgroup v2 with cpus, mems and their
effective counterparts.

Patch 2 adds sched_load_balance whose behavior changes in v2 to become
hierarchical and includes an implicit !cpu_exclusive.

Patch 3 adds a new root-only "cpuset.cpus.isolated" control file for
CPU isolation purposes.

Patch 4 adds the limitation that "sched_load_balance" can only be turned
off in a cpuset if all the CPUs in the cpuset are already in the root's
"cpuset.cpus.isolated".

Patch 5 modifies the sched domain generation code to generate separate root
sched domains if all the CPUs in a cpuset come from "cpuset.cpus.isolated".

In other words, all the CPUs that need to be isolated or in separate
root domains have to be put into the "cpuset.cpus.isolated" first. Then
child cpusets can be created to partition those isolated CPUs into
either separate root domains with "sched_load_balance" on or really
isolated CPUs with "sched_load_balance" off. Load balancing cannot
be turned off at root.
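
The sequence described above can be sketched as a small shell function.
This is only an illustration under assumptions: the control file names are
the ones proposed in this v7 posting and may differ in other kernel
versions, while the CGROOT variable, the isolate_cpus function name and the
child group name are hypothetical.

```sh
#!/bin/sh
# Sketch only. File names follow this cover letter (v7); CGROOT,
# isolate_cpus and the group name are hypothetical.
CGROOT=${CGROOT:-/sys/fs/cgroup}

isolate_cpus() {
    cpus=$1     # CPU list, e.g. "2-3"
    group=$2    # name of the child cgroup to create

    # 1. Put the CPUs on the root-only isolated list first.
    echo "$cpus" > "$CGROOT/cpuset.cpus.isolated" || return 1

    # 2. Enable the cpuset controller for child cgroups.
    echo "+cpuset" > "$CGROOT/cgroup.subtree_control" || return 1

    # 3. Create a child cpuset that contains only isolated CPUs.
    mkdir -p "$CGROOT/$group" || return 1
    echo "$cpus" > "$CGROOT/$group/cpuset.cpus" || return 1

    # 4. Turn load balancing off for truly isolated CPUs. Leaving it on
    #    would instead give the CPUs their own root domain.
    echo 0 > "$CGROOT/$group/cpuset.sched_load_balance" || return 1
}
```

Run as root on a kernel carrying these patches, "isolate_cpus 2-3 rt" would
isolate CPUs 2-3 in a child cpuset named rt. Step 4 is expected to fail if
any CPU in the child is missing from the root's isolated list.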

Waiman Long (5):
  cpuset: Enable cpuset controller in default hierarchy
  cpuset: Add cpuset.sched_load_balance to v2
  cpuset: Add a root-only cpus.isolated v2 control file
  cpuset: Restrict load balancing off cpus to subset of cpus.isolated
  cpuset: Make generate_sched_domains() recognize isolated_cpus

 Documentation/cgroup-v2.txt | 138 -
 kernel/cgroup/cpuset.c  | 287 +---
 2 files changed, 404 insertions(+), 21 deletions(-)

-- 
1.8.3.1
