Re: [PATCH net-next 02/17] net: dsa: mv88e6xxx: fix circular lock in PPU work
On 06/03/2016 09:44 AM, Andrew Lunn wrote:
> From: Vivien Didelot
>
> Lock debugging shows that there is a possible circular lock in the PPU
> work code. Switch the lock order of smi_mutex and ppu_mutex to fix this.
>
> Here's the full trace:
>
> [    4.341325] ======================================================
> [    4.347519] [ INFO: possible circular locking dependency detected ]
> [    4.353800] 4.6.0 #4 Not tainted
> [    4.357039] -------------------------------------------------------
> [    4.363315] kworker/0:1/328 is trying to acquire lock:
> [    4.368463]  (&ps->smi_mutex){+.+.+.}, at: [<8049c758>] mv88e6xxx_reg_read+0x30/0x54
> [    4.376313]
> [    4.376313] but task is already holding lock:
> [    4.382160]  (&ps->ppu_mutex){+.+...}, at: [<8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
> [    4.390772]
> [    4.390772] which lock already depends on the new lock.
> [    4.390772]
> [    4.398963]
> [    4.398963] the existing dependency chain (in reverse order) is:
> [    4.406461]
> [    4.406461] -> #1 (&ps->ppu_mutex){+.+...}:
> [    4.410897]        [<806d86bc>] mutex_lock_nested+0x54/0x360
> [    4.416606]        [<8049a800>] mv88e6xxx_ppu_access_get+0x28/0x100
> [    4.422906]        [<8049b778>] mv88e6xxx_phy_read+0x90/0xdc
> [    4.428599]        [<806a4534>] dsa_slave_phy_read+0x3c/0x40
> [    4.434300]        [<804943ec>] mdiobus_read+0x68/0x80
> [    4.439481]        [<804939d4>] get_phy_device+0x58/0x1d8
> [    4.444914]        [<80493ed0>] mdiobus_scan+0x24/0xf4
> [    4.450078]        [<8049409c>] __mdiobus_register+0xfc/0x1ac
> [    4.455857]        [<806a40b0>] dsa_probe+0x860/0xca8
> [    4.460934]        [<8043246c>] platform_drv_probe+0x5c/0xc0
> [    4.466627]        [<804305a0>] driver_probe_device+0x118/0x450
> [    4.472589]        [<80430b00>] __device_attach_driver+0xac/0x128
> [    4.478724]        [<8042e350>] bus_for_each_drv+0x74/0xa8
> [    4.484235]        [<804302d8>] __device_attach+0xc4/0x154
> [    4.489755]        [<80430cec>] device_initial_probe+0x1c/0x20
> [    4.495612]        [<8042f620>] bus_probe_device+0x98/0xa0
> [    4.501123]        [<8042fbd0>] deferred_probe_work_func+0x4c/0xd4
> [    4.507328]        [<8013a794>] process_one_work+0x1a8/0x604
> [    4.513030]        [<8013ac54>] worker_thread+0x64/0x528
> [    4.518367]        [<801409e8>] kthread+0xec/0x100
> [    4.523201]        [<80108f30>] ret_from_fork+0x14/0x24
> [    4.528462]
> [    4.528462] -> #0 (&ps->smi_mutex){+.+.+.}:
> [    4.532895]        [<8015ad5c>] lock_acquire+0xb4/0x1dc
> [    4.538154]        [<806d86bc>] mutex_lock_nested+0x54/0x360
> [    4.543856]        [<8049c758>] mv88e6xxx_reg_read+0x30/0x54
> [    4.549549]        [<8049cad8>] mv88e6xxx_ppu_reenable_work+0x40/0xd4
> [    4.556022]        [<8013a794>] process_one_work+0x1a8/0x604
> [    4.561707]        [<8013ac54>] worker_thread+0x64/0x528
> [    4.567053]        [<801409e8>] kthread+0xec/0x100
> [    4.571878]        [<80108f30>] ret_from_fork+0x14/0x24
> [    4.577139]
> [    4.577139] other info that might help us debug this:
> [    4.577139]
> [    4.585159]  Possible unsafe locking scenario:
> [    4.585159]
> [    4.591093]        CPU0                    CPU1
> [    4.595631]        ----                    ----
> [    4.600169]   lock(&ps->ppu_mutex);
> [    4.603693]                                lock(&ps->smi_mutex);
> [    4.609742]                                lock(&ps->ppu_mutex);
> [    4.615790]   lock(&ps->smi_mutex);
> [    4.619314]
> [    4.619314]  *** DEADLOCK ***
> [    4.619314]
> [    4.625256] 3 locks held by kworker/0:1/328:
> [    4.629537]  #0:  ("events"){.+.+..}, at: [<8013a704>] process_one_work+0x118/0x604
> [    4.637288]  #1:  ((&ps->ppu_work)){+.+...}, at: [<8013a704>] process_one_work+0x118/0x604
> [    4.645653]  #2:  (&ps->ppu_mutex){+.+...}, at: [<8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
> [    4.654714]
> [    4.654714] stack backtrace:
> [    4.659098] CPU: 0 PID: 328 Comm: kworker/0:1 Not tainted 4.6.0 #4
> [    4.665286] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
> [    4.671748] Workqueue: events mv88e6xxx_ppu_reenable_work
> [    4.677174] Backtrace:
> [    4.679674] [<8010d354>] (dump_backtrace) from [<8010d5a0>] (show_stack+0x20/0x24)
> [    4.687252]  r6:80fb3c88 r5:80fb3c88 r4:80fb4728 r3:0002
> [    4.693003] [<8010d580>] (show_stack) from [<803b45e8>] (dump_stack+0x24/0x28)
> [    4.700246] [<803b45c4>] (dump_stack) from [<80157398>] (print_circular_bug+0x208/0x32c)
> [    4.708361] [<80157190>] (print_circular_bug) from [<8015a630>] (__lock_acquire+0x185c/0x1b80)
> [    4.716982]  r10:9ec22a00
[PATCH net-next 02/17] net: dsa: mv88e6xxx: fix circular lock in PPU work
From: Vivien Didelot

Lock debugging shows that there is a possible circular lock in the PPU
work code. Switch the lock order of smi_mutex and ppu_mutex to fix this.

Here's the full trace:

[    4.341325] ======================================================
[    4.347519] [ INFO: possible circular locking dependency detected ]
[    4.353800] 4.6.0 #4 Not tainted
[    4.357039] -------------------------------------------------------
[    4.363315] kworker/0:1/328 is trying to acquire lock:
[    4.368463]  (&ps->smi_mutex){+.+.+.}, at: [<8049c758>] mv88e6xxx_reg_read+0x30/0x54
[    4.376313]
[    4.376313] but task is already holding lock:
[    4.382160]  (&ps->ppu_mutex){+.+...}, at: [<8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
[    4.390772]
[    4.390772] which lock already depends on the new lock.
[    4.390772]
[    4.398963]
[    4.398963] the existing dependency chain (in reverse order) is:
[    4.406461]
[    4.406461] -> #1 (&ps->ppu_mutex){+.+...}:
[    4.410897]        [<806d86bc>] mutex_lock_nested+0x54/0x360
[    4.416606]        [<8049a800>] mv88e6xxx_ppu_access_get+0x28/0x100
[    4.422906]        [<8049b778>] mv88e6xxx_phy_read+0x90/0xdc
[    4.428599]        [<806a4534>] dsa_slave_phy_read+0x3c/0x40
[    4.434300]        [<804943ec>] mdiobus_read+0x68/0x80
[    4.439481]        [<804939d4>] get_phy_device+0x58/0x1d8
[    4.444914]        [<80493ed0>] mdiobus_scan+0x24/0xf4
[    4.450078]        [<8049409c>] __mdiobus_register+0xfc/0x1ac
[    4.455857]        [<806a40b0>] dsa_probe+0x860/0xca8
[    4.460934]        [<8043246c>] platform_drv_probe+0x5c/0xc0
[    4.466627]        [<804305a0>] driver_probe_device+0x118/0x450
[    4.472589]        [<80430b00>] __device_attach_driver+0xac/0x128
[    4.478724]        [<8042e350>] bus_for_each_drv+0x74/0xa8
[    4.484235]        [<804302d8>] __device_attach+0xc4/0x154
[    4.489755]        [<80430cec>] device_initial_probe+0x1c/0x20
[    4.495612]        [<8042f620>] bus_probe_device+0x98/0xa0
[    4.501123]        [<8042fbd0>] deferred_probe_work_func+0x4c/0xd4
[    4.507328]        [<8013a794>] process_one_work+0x1a8/0x604
[    4.513030]        [<8013ac54>] worker_thread+0x64/0x528
[    4.518367]        [<801409e8>] kthread+0xec/0x100
[    4.523201]        [<80108f30>] ret_from_fork+0x14/0x24
[    4.528462]
[    4.528462] -> #0 (&ps->smi_mutex){+.+.+.}:
[    4.532895]        [<8015ad5c>] lock_acquire+0xb4/0x1dc
[    4.538154]        [<806d86bc>] mutex_lock_nested+0x54/0x360
[    4.543856]        [<8049c758>] mv88e6xxx_reg_read+0x30/0x54
[    4.549549]        [<8049cad8>] mv88e6xxx_ppu_reenable_work+0x40/0xd4
[    4.556022]        [<8013a794>] process_one_work+0x1a8/0x604
[    4.561707]        [<8013ac54>] worker_thread+0x64/0x528
[    4.567053]        [<801409e8>] kthread+0xec/0x100
[    4.571878]        [<80108f30>] ret_from_fork+0x14/0x24
[    4.577139]
[    4.577139] other info that might help us debug this:
[    4.577139]
[    4.585159]  Possible unsafe locking scenario:
[    4.585159]
[    4.591093]        CPU0                    CPU1
[    4.595631]        ----                    ----
[    4.600169]   lock(&ps->ppu_mutex);
[    4.603693]                                lock(&ps->smi_mutex);
[    4.609742]                                lock(&ps->ppu_mutex);
[    4.615790]   lock(&ps->smi_mutex);
[    4.619314]
[    4.619314]  *** DEADLOCK ***
[    4.619314]
[    4.625256] 3 locks held by kworker/0:1/328:
[    4.629537]  #0:  ("events"){.+.+..}, at: [<8013a704>] process_one_work+0x118/0x604
[    4.637288]  #1:  ((&ps->ppu_work)){+.+...}, at: [<8013a704>] process_one_work+0x118/0x604
[    4.645653]  #2:  (&ps->ppu_mutex){+.+...}, at: [<8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
[    4.654714]
[    4.654714] stack backtrace:
[    4.659098] CPU: 0 PID: 328 Comm: kworker/0:1 Not tainted 4.6.0 #4
[    4.665286] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
[    4.671748] Workqueue: events mv88e6xxx_ppu_reenable_work
[    4.677174] Backtrace:
[    4.679674] [<8010d354>] (dump_backtrace) from [<8010d5a0>] (show_stack+0x20/0x24)
[    4.687252]  r6:80fb3c88 r5:80fb3c88 r4:80fb4728 r3:0002
[    4.693003] [<8010d580>] (show_stack) from [<803b45e8>] (dump_stack+0x24/0x28)
[    4.700246] [<803b45c4>] (dump_stack) from [<80157398>] (print_circular_bug+0x208/0x32c)
[    4.708361] [<80157190>] (print_circular_bug) from [<8015a630>] (__lock_acquire+0x185c/0x1b80)
[    4.716982]  r10:9ec22a00 r9:0060 r8:8164b6bc r7:0040 r6:0003 r5:8163a5b4
[    4.724905]  r4:0003 r3:9ec22de8
[    4.728537] [<80158dd4>] (__lock_acquire) from [<8015ad5c>] (lock_acquire+0xb4/0x1dc)
[    4.736378]  r10:6013