[BUG] cpuset: lockdep warning

2014-06-29 Thread Li Zefan
Hi Tejun,

In this lockdep warning kernfs and workqueue are involved, so I'm not sure
what's happening here.

This was triggered when tasks were being moved to parent cpuset due to hotplug.
The kernel is 3.16-rc1, with no modification.

localhost:/ # mount -t cgroup -o cpuset xxx /cpuset
localhost:/ # mkdir /cpuset/tmp
localhost:/ # echo 1 > /cpuset/tmp/cpuset.cpus
localhost:/ # echo 0 > /cpuset/tmp/cpuset.mems
localhost:/ # echo $$ > /cpuset/tmp/tasks
localhost:/ # echo 0 > /sys/devices/system/cpu/cpu1/online


[ 1810.292243] ======================================================
[ 1810.292251] [ INFO: possible circular locking dependency detected ]
[ 1810.292259] 3.16.0-rc1-0.1-default+ #7 Not tainted
[ 1810.292266] -------------------------------------------------------
[ 1810.292273] kworker/1:0/32649 is trying to acquire lock:
[ 1810.292280]  (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
[ 1810.292300]
[ 1810.292300] but task is already holding lock:
[ 1810.292309]  (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
[ 1810.292327]
[ 1810.292327] which lock already depends on the new lock.
[ 1810.292327]
[ 1810.292339]
[ 1810.292339] the existing dependency chain (in reverse order) is:
[ 1810.292348]
[ 1810.292348] -> #2 (cpuset_hotplug_work){+.+...}:
[ 1810.292360]        [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
[ 1810.292371]        [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
[ 1810.292380]        [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
[ 1810.292389]        [<ffffffff810862b9>] flush_work+0x39/0x90
[ 1810.292398]        [<ffffffff811158b1>] cpuset_write_resmask+0x51/0x120
[ 1810.292409]        [<ffffffff8110cc39>] cgroup_file_write+0x49/0x1f0
[ 1810.292419]        [<ffffffff81286c7d>] kernfs_fop_write+0xfd/0x190
[ 1810.292431]        [<ffffffff81204a15>] vfs_write+0xe5/0x190
[ 1810.292443]        [<ffffffff8120545c>] SyS_write+0x5c/0xc0
[ 1810.292452]        [<ffffffff815acb92>] system_call_fastpath+0x16/0x1b
[ 1810.292464]
[ 1810.292464] -> #1 (s_active#175){.+}:
[ 1810.292476]        [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
[ 1810.292486]        [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
[ 1810.292495]        [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
[ 1810.292504]        [<ffffffff812848eb>] kernfs_drain+0x13b/0x1c0
[ 1810.292513]        [<ffffffff81285418>] __kernfs_remove+0xc8/0x220
[ 1810.292523]        [<ffffffff812855c0>] kernfs_remove_by_name_ns+0x50/0xb0
[ 1810.292533]        [<ffffffff8110802e>] cgroup_addrm_files+0x16e/0x290
[ 1810.292543]        [<ffffffff811081bd>] cgroup_clear_dir+0x6d/0xa0
[ 1810.292552]        [<ffffffff8110c30f>] rebind_subsystems+0x10f/0x350
[ 1810.292562]        [<ffffffff8110f2cf>] cgroup_setup_root+0x1bf/0x290
[ 1810.292571]        [<ffffffff8110f4c3>] cgroup_mount+0x123/0x3d0
[ 1810.292581]        [<ffffffff81208b7d>] mount_fs+0x4d/0x1a0
[ 1810.292591]        [<ffffffff8122b439>] vfs_kern_mount+0x79/0x160
[ 1810.292602]        [<ffffffff8122be69>] do_new_mount+0xd9/0x200
[ 1810.292611]        [<ffffffff8122cadc>] do_mount+0x1dc/0x220
[ 1810.292621]        [<ffffffff8122cbdc>] SyS_mount+0xbc/0xe0
[ 1810.292630]        [<ffffffff815acb92>] system_call_fastpath+0x16/0x1b
[ 1810.292640]
[ 1810.292640] -> #0 (cgroup_mutex){+.+.+.}:
[ 1810.292651]        [<ffffffff810c481e>] check_prev_add+0x43e/0x4b0
[ 1810.292660]        [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
[ 1810.292669]        [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
[ 1810.292678]        [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
[ 1810.292687]        [<ffffffff815aa13f>] mutex_lock_nested+0x6f/0x380
[ 1810.292697]        [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
[ 1810.292707]        [<ffffffff811129c0>] hotplug_update_tasks_insane+0x110/0x1d0
[ 1810.292718]        [<ffffffff81112bbd>] cpuset_hotplug_update_tasks+0x13d/0x180
[ 1810.292729]        [<ffffffff811148ec>] cpuset_hotplug_workfn+0x18c/0x630
[ 1810.292739]        [<ffffffff810854d4>] process_one_work+0x254/0x520
[ 1810.292748]        [<ffffffff810875dd>] worker_thread+0x13d/0x3d0
[ 1810.292758]        [<ffffffff8108e0c8>] kthread+0xf8/0x100
[ 1810.292768]        [<ffffffff815acaec>] ret_from_fork+0x7c/0xb0
[ 1810.292778]
[ 1810.292778] other info that might help us debug this:
[ 1810.292778]
[ 1810.292789] Chain exists of:
[ 1810.292789]   cgroup_mutex --> s_active#175 --> cpuset_hotplug_work
[ 1810.292789]
[ 1810.292807]  Possible unsafe locking scenario:
[ 1810.292807]
[ 1810.292816]        CPU0                    CPU1
[ 1810.292822]        ----                    ----
[ 1810.292827]   lock(cpuset_hotplug_work);
[ 1810.292835]                                lock(s_active#175);
[ 1810.292845]                                lock(cpuset_hotplug_work);
[ 1810.292855]   lock(cgroup_mutex);
[ 1810.292862]
[ 1810.292862]  *** DEADLOCK ***
[ 1810.292862]
[ 1810.292872] 2 locks held by kworker/1:0/32649:
[ 1810.292878]  #0:  ("events"){.+.+.+}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
[ 1810.292895]  #1:  (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
[ 1810.292911]
[ 1810.292911] stack backtrace:
[ 1810.292920] CPU: 1 PID: 32649 Comm: kworker/1:0 Not tainted 3.16.0-rc1-0.1-default+ #7
[ 1810.292929] Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011
[ 1810.292943] Workqueue: events cpuset_hotplug_workfn
[ 1810.292951]  824b01e0 8800afdd3918 815a5f78 
8800afdd3958
[ 1810.292964]  810c263f 1d1fa490 8800afdd3978 
88061d1fa490
[ 1810.292976]   88061d1fad08 88061d1fad40 
8800afdd39f8
[ 1810.292989] Call Trace:
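
The splat boils down to a three-edge cycle in the lock dependency graph: the mount path drains the kernfs s_active reference while holding cgroup_mutex, the cpuset.cpus write handler flushes cpuset_hotplug_work while holding s_active, and the hotplug work item itself takes cgroup_mutex. As a toy illustration of the check lockdep performs (not kernel code; lock and function names are taken from the trace above), a depth-first search over those three recorded dependencies finds the cycle:

```python
# Toy model of lockdep's dependency-graph check, using the three
# dependencies reported in the trace above.  An edge A -> B means
# "B was acquired (or flushed/drained) while A was held".
deps = {
    "cgroup_mutex": ["s_active#175"],          # rebind_subsystems -> kernfs_drain
    "s_active#175": ["cpuset_hotplug_work"],   # cpuset_write_resmask -> flush_work
    "cpuset_hotplug_work": ["cgroup_mutex"],   # cpuset_hotplug_workfn -> cgroup_transfer_tasks
}

def find_cycle(graph):
    """Return one dependency cycle as a list of lock names, or None."""
    for start in graph:
        path = []
        seen = set()

        def dfs(node):
            # Reaching the start node again (on a non-empty path) closes a cycle.
            if node == start and path:
                return True
            if node in seen:
                return False
            seen.add(node)
            for nxt in graph.get(node, ()):
                path.append(nxt)
                if dfs(nxt):
                    return True
                path.pop()
            return False

        if dfs(start):
            return [start] + path
    return None

print(find_cycle(deps))
# -> ['cgroup_mutex', 's_active#175', 'cpuset_hotplug_work', 'cgroup_mutex']
```

Lockdep does essentially this (plus tracking read/write and irq-context states) whenever it observes a new lock ordering, which is why the warning fires even though the four-step interleaving shown above never actually deadlocked on this boot.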
