nouveau_fan_update: possible circular locking dependency detected
On Thu, Mar 13, 2014 at 09:38:45AM -0400, Ilia Mirkin wrote:
> On Sun, Mar 9, 2014 at 10:51 AM, Marcin Slusarz wrote:
> > [ 326.168487] ==
> > [ 326.168491] [ INFO: possible circular locking dependency detected ]
> > [ 326.168496] 3.13.6 #1270 Not tainted
> > [ 326.168500] ---
> > [ 326.168504] ldconfig/22297 is trying to acquire lock:
> > [ 326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [] nouveau_fan_update+0xeb/0x252 [nouveau]
> > [ 326.168551]
> > but task is already holding lock:
> > [ 326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
> > [ 326.168587]
> > which lock already depends on the new lock.
> >
> > (...)
>
> Marcin, how reproducible is this? What hardware was this on? If it's
> reasonably reproducible perhaps it makes sense to file a bug in the
> fd.o tracker?

It has happened only once so far (but I don't use this machine every day), on the first boot of the 3.13 kernel. At that time the machine was quite hot (it was rebuilding the whole system (Gentoo) *and* the CPU fan was dusty), so that probably affected the GPU temperature.

It's an NVA8 card (dmesg attached).

Marcin
-- next part --
[0.00] Linux version 3.13.6 (marcin at joi) (gcc version 4.7.3 (Gentoo 4.7.3-r1 p1.4, pie-0.5.5) ) #1274 SMP PREEMPT Thu Mar 13 22:55:41 CET 2014
[0.00] Command line: BOOT_IMAGE=/boot/kernel-3.13.6 root=UUID=a55f9cc0-8726-4a17-9198-a153da676c85 netconsole=3 at 192.168.1.123/eth0, at 192.168.1.102/00:06:5b:6a:a5:74
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e4c00-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xbf77] usable
[0.00] BIOS-e820: [mem 0xbf78-0xbf797fff] ACPI data
[0.00] BIOS-e820: [mem 0xbf798000-0xbf7dbfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xbf7dc000-0xbfff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xffe0-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x0001bfff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.5 present.
[0.00] DMI: System manufacturer System Product Name/P6T SE, BIOS 0603 09/02/2009
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] No AGP bridge found
[0.00] e820: last_pfn = 0x1c max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-B uncachable
[0.00] C-E3FFF write-protect
[0.00] E4000-EBFFF write-through
[0.00] EC000-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 1C000 mask FC000 uncachable
[0.00] 1 base 0 mask E write-back
[0.00] 2 base 0C000 mask FC000 uncachable
[0.00] 3 base 0BF80 mask FFF80 uncachable
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.00] 7 disabled
[0.00] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[0.00] original variable MTRRs
[0.00] reg 0, base: 7GB, range: 1GB, type UC
[0.00] reg 1, base: 0GB, range: 8GB, type WB
[0.00] reg 2, base: 3GB, range: 1GB, type UC
[0.00] reg 3, base: 3064MB, range: 8MB, type UC
[0.00] total RAM covered: 6136M
[0.00] Found optimal setting for mtrr clean up
[0.00] gran_size: 64K chunk_size: 16M num_reg: 5 lose cover RAM: 0G
[0.00] New variable MTRRs
[0.00] reg 0, base: 0GB, range: 2GB, type WB
[0.00] reg 1, base: 2GB, range: 1GB, type WB
[0.00] reg 2, base: 3064MB, range: 8MB, type UC
[0.00] reg 3, base: 4GB, range: 2GB, type WB
[0.00] reg 4, base: 6GB, range: 1GB, type WB
[0.00] e820: update [mem 0xbf80-0x] usable ==> reserved
[0.00] e820: last_pfn = 0xbf780 max_arch_pfn = 0x4
[0.00] Base memory trampoline at [88099000] 99000 size 24576
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] [mem 0x-0x000f] page 4k
[0.00] BRK [0x027db000, 0x027dbfff] PGTABLE
[0.00] BRK [0x027dc000, 0x027dcfff] PGTABLE
[0.00] BRK [0x027dd000, 0x027ddfff] PGTABLE
[0.00] init_memory_mapping: [mem 0x1bfe0-0x1bfff
nouveau_fan_update: possible circular locking dependency detected
On 13/03/2014 14:38, Ilia Mirkin wrote:
> On Sun, Mar 9, 2014 at 10:51 AM, Marcin Slusarz wrote:
>> [ 326.168487] ==
>> [ 326.168491] [ INFO: possible circular locking dependency detected ]
>> [ 326.168496] 3.13.6 #1270 Not tainted
>> [ 326.168500] ---
>> [ 326.168504] ldconfig/22297 is trying to acquire lock:
>> [ 326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [] nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.168551]
>> but task is already holding lock:
>> [ 326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.168587]
>> which lock already depends on the new lock.
>>
>> [ 326.168592]
>> the existing dependency chain (in reverse order) is:
>> [ 326.168596]
>> -> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
>> [ 326.168606]   [] lock_acquire+0xce/0x117
>> [ 326.168615]   [] _raw_spin_lock_irqsave+0x3f/0x51
>> [ 326.168623]   [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.168651]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.168679]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
>> [ 326.168708]   [] nouveau_fan_update+0x234/0x252 [nouveau]
>> [ 326.168735]   [] nouveau_fan_alarm+0x15/0x17 [nouveau]
>> [ 326.168763]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.168790]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
>> [ 326.168817]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>> [ 326.168838]   [] handle_irq_event_percpu+0x5c/0x1dc
>> [ 326.168846]   [] handle_irq_event+0x3c/0x5c
>> [ 326.168852]   [] handle_edge_irq+0xc4/0xeb
>> [ 326.168860]   [] handle_irq+0x120/0x12d
>> [ 326.168868]   [] do_IRQ+0x48/0xaf
>> [ 326.168873]   [] ret_from_intr+0x0/0x13
>> [ 326.168881]   [] arch_cpu_idle+0x13/0x1d
>> [ 326.168887]   [] cpu_startup_entry+0x140/0x218
>> [ 326.168895]   [] start_secondary+0x1bf/0x1c4
>> [ 326.168902]
>> -> #0 (&(&priv->fan->lock)->rlock){-.-...}:
>> [ 326.168913]   [] __lock_acquire+0x10be/0x182b
>> [ 326.168920]   [] lock_acquire+0xce/0x117
>> [ 326.168924]   [] _raw_spin_lock_irqsave+0x3f/0x51
>> [ 326.168931]   [] nouveau_fan_update+0xeb/0x252 [nouveau]
>> [ 326.168958]   [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
>> [ 326.168984]   [] nouveau_therm_update+0x303/0x312 [nouveau]
>> [ 326.169011]   [] nouveau_therm_alarm+0x13/0x15 [nouveau]
>> [ 326.169038]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169059]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
>> [ 326.169079]   [] alarm_timer_callback+0x15e/0x179 [nouveau]
>> [ 326.169101]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>> [ 326.169121]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
>> [ 326.169142]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>> [ 326.169160]   [] handle_irq_event_percpu+0x5c/0x1dc
>> [ 326.169165]   [] handle_irq_event+0x3c/0x5c
>> [ 326.169170]   [] handle_edge_irq+0xc4/0xeb
>> [ 326.169175]   [] handle_irq+0x120/0x12d
>> [ 326.169179]   [] do_IRQ+0x48/0xaf
>> [ 326.169183]   [] ret_from_intr+0x0/0x13
>> [ 326.169189]
>> other info that might help us debug this:
>>
>> [ 326.169193]  Possible unsafe locking scenario:
>>
>> [ 326.169195]        CPU0                    CPU1
>> [ 326.169197]
>> [ 326.169199]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
>> [ 326.169205]                                lock(&(&priv->fan->lock)->rlock);
>> [ 326.169211]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
>> [ 326.169216]   lock(&(&priv->fan->lock)->rlock);
>> [ 326.169221]
>>  *** DEADLOCK ***
>>
>> [ 326.169225] 1 lock held by ldconfig/22297:
>> [ 326.169229]  #0: (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
>> [ 326.169253]
>> stack backtrace:
>> [ 326.169258] CPU: 7 PID: 22297 Comm: ldconfig Not tainted 3.13.6 #1270
>> [ 326.169260] Hardware name: System manufacturer System Product Name/P6T SE, BIOS 0603 09/02/2009
>> [ 326.169264]  90fb6360 8801bfdc3a38 9059e369 0006
>> [ 326.169273]  90fb61b0 8801bfdc3a88 905998cf 0002
>> [ 326.169282]  8800b148dbe0 0001 8800b148e1e0 0001
>> [ 326.169342] Call Trace:
>> [ 326.169344]  [] dump_stack+0x4e/0x71
>> [ 326.169352]  [] print_circular_bug+0x2ad/0x2be
>> [ 326.169356]  [] __lock_acquire+0x10be/0x182b
>> [ 326.169360]  [] ? check_irq_usage+0x99/0xab
>> [ 326.169365]  [] lock_acquire+0xce/0x117
>> [ 326.169384]  []
nouveau_fan_update: possible circular locking dependency detected
On Sun, Mar 9, 2014 at 10:51 AM, Marcin Slusarz wrote:
> [ 326.168487] ==
> [ 326.168491] [ INFO: possible circular locking dependency detected ]
> [ 326.168496] 3.13.6 #1270 Not tainted
> [ 326.168500] ---
> [ 326.168504] ldconfig/22297 is trying to acquire lock:
> [ 326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [] nouveau_fan_update+0xeb/0x252 [nouveau]
> [ 326.168551]
> but task is already holding lock:
> [ 326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
> [ 326.168587]
> which lock already depends on the new lock.
>
> [ 326.168592]
> the existing dependency chain (in reverse order) is:
> [ 326.168596]
> -> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
> [ 326.168606]   [] lock_acquire+0xce/0x117
> [ 326.168615]   [] _raw_spin_lock_irqsave+0x3f/0x51
> [ 326.168623]   [] alarm_timer_callback+0xf1/0x179 [nouveau]
> [ 326.168651]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [ 326.168679]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
> [ 326.168708]   [] nouveau_fan_update+0x234/0x252 [nouveau]
> [ 326.168735]   [] nouveau_fan_alarm+0x15/0x17 [nouveau]
> [ 326.168763]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [ 326.168790]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
> [ 326.168817]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
> [ 326.168838]   [] handle_irq_event_percpu+0x5c/0x1dc
> [ 326.168846]   [] handle_irq_event+0x3c/0x5c
> [ 326.168852]   [] handle_edge_irq+0xc4/0xeb
> [ 326.168860]   [] handle_irq+0x120/0x12d
> [ 326.168868]   [] do_IRQ+0x48/0xaf
> [ 326.168873]   [] ret_from_intr+0x0/0x13
> [ 326.168881]   [] arch_cpu_idle+0x13/0x1d
> [ 326.168887]   [] cpu_startup_entry+0x140/0x218
> [ 326.168895]   [] start_secondary+0x1bf/0x1c4
> [ 326.168902]
> -> #0 (&(&priv->fan->lock)->rlock){-.-...}:
> [ 326.168913]   [] __lock_acquire+0x10be/0x182b
> [ 326.168920]   [] lock_acquire+0xce/0x117
> [ 326.168924]   [] _raw_spin_lock_irqsave+0x3f/0x51
> [ 326.168931]   [] nouveau_fan_update+0xeb/0x252 [nouveau]
> [ 326.168958]   [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
> [ 326.168984]   [] nouveau_therm_update+0x303/0x312 [nouveau]
> [ 326.169011]   [] nouveau_therm_alarm+0x13/0x15 [nouveau]
> [ 326.169038]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [ 326.169059]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
> [ 326.169079]   [] alarm_timer_callback+0x15e/0x179 [nouveau]
> [ 326.169101]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [ 326.169121]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
> [ 326.169142]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
> [ 326.169160]   [] handle_irq_event_percpu+0x5c/0x1dc
> [ 326.169165]   [] handle_irq_event+0x3c/0x5c
> [ 326.169170]   [] handle_edge_irq+0xc4/0xeb
> [ 326.169175]   [] handle_irq+0x120/0x12d
> [ 326.169179]   [] do_IRQ+0x48/0xaf
> [ 326.169183]   [] ret_from_intr+0x0/0x13
> [ 326.169189]
> other info that might help us debug this:
>
> [ 326.169193]  Possible unsafe locking scenario:
>
> [ 326.169195]        CPU0                    CPU1
> [ 326.169197]
> [ 326.169199]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
> [ 326.169205]                                lock(&(&priv->fan->lock)->rlock);
> [ 326.169211]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
> [ 326.169216]   lock(&(&priv->fan->lock)->rlock);
> [ 326.169221]
>  *** DEADLOCK ***
>
> [ 326.169225] 1 lock held by ldconfig/22297:
> [ 326.169229]  #0: (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
> [ 326.169253]
> stack backtrace:
> [ 326.169258] CPU: 7 PID: 22297 Comm: ldconfig Not tainted 3.13.6 #1270
> [ 326.169260] Hardware name: System manufacturer System Product Name/P6T SE, BIOS 0603 09/02/2009
> [ 326.169264]  90fb6360 8801bfdc3a38 9059e369 0006
> [ 326.169273]  90fb61b0 8801bfdc3a88 905998cf 0002
> [ 326.169282]  8800b148dbe0 0001 8800b148e1e0 0001
> [ 326.169342] Call Trace:
> [ 326.169344]  [] dump_stack+0x4e/0x71
> [ 326.169352]  [] print_circular_bug+0x2ad/0x2be
> [ 326.169356]  [] __lock_acquire+0x10be/0x182b
> [ 326.169360]  [] ? check_irq_usage+0x99/0xab
> [ 326.169365]  [] lock_acquire+0xce/0x117
> [ 326.169384]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
> [ 326.169388]  [] _raw_spin_lock_irqsave+0x3f/0x51
> [ 326.169407]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
> [ 326
nouveau_fan_update: possible circular locking dependency detected
[ 326.168487] ==
[ 326.168491] [ INFO: possible circular locking dependency detected ]
[ 326.168496] 3.13.6 #1270 Not tainted
[ 326.168500] ---
[ 326.168504] ldconfig/22297 is trying to acquire lock:
[ 326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [] nouveau_fan_update+0xeb/0x252 [nouveau]
[ 326.168551]
but task is already holding lock:
[ 326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
[ 326.168587]
which lock already depends on the new lock.

[ 326.168592]
the existing dependency chain (in reverse order) is:
[ 326.168596]
-> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
[ 326.168606]   [] lock_acquire+0xce/0x117
[ 326.168615]   [] _raw_spin_lock_irqsave+0x3f/0x51
[ 326.168623]   [] alarm_timer_callback+0xf1/0x179 [nouveau]
[ 326.168651]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
[ 326.168679]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
[ 326.168708]   [] nouveau_fan_update+0x234/0x252 [nouveau]
[ 326.168735]   [] nouveau_fan_alarm+0x15/0x17 [nouveau]
[ 326.168763]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
[ 326.168790]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
[ 326.168817]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
[ 326.168838]   [] handle_irq_event_percpu+0x5c/0x1dc
[ 326.168846]   [] handle_irq_event+0x3c/0x5c
[ 326.168852]   [] handle_edge_irq+0xc4/0xeb
[ 326.168860]   [] handle_irq+0x120/0x12d
[ 326.168868]   [] do_IRQ+0x48/0xaf
[ 326.168873]   [] ret_from_intr+0x0/0x13
[ 326.168881]   [] arch_cpu_idle+0x13/0x1d
[ 326.168887]   [] cpu_startup_entry+0x140/0x218
[ 326.168895]   [] start_secondary+0x1bf/0x1c4
[ 326.168902]
-> #0 (&(&priv->fan->lock)->rlock){-.-...}:
[ 326.168913]   [] __lock_acquire+0x10be/0x182b
[ 326.168920]   [] lock_acquire+0xce/0x117
[ 326.168924]   [] _raw_spin_lock_irqsave+0x3f/0x51
[ 326.168931]   [] nouveau_fan_update+0xeb/0x252 [nouveau]
[ 326.168958]   [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
[ 326.168984]   [] nouveau_therm_update+0x303/0x312 [nouveau]
[ 326.169011]   [] nouveau_therm_alarm+0x13/0x15 [nouveau]
[ 326.169038]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
[ 326.169059]   [] nv04_timer_alarm+0xb5/0xbe [nouveau]
[ 326.169079]   [] alarm_timer_callback+0x15e/0x179 [nouveau]
[ 326.169101]   [] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
[ 326.169121]   [] nv04_timer_intr+0x5b/0x13c [nouveau]
[ 326.169142]   [] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
[ 326.169160]   [] handle_irq_event_percpu+0x5c/0x1dc
[ 326.169165]   [] handle_irq_event+0x3c/0x5c
[ 326.169170]   [] handle_edge_irq+0xc4/0xeb
[ 326.169175]   [] handle_irq+0x120/0x12d
[ 326.169179]   [] do_IRQ+0x48/0xaf
[ 326.169183]   [] ret_from_intr+0x0/0x13
[ 326.169189]
other info that might help us debug this:

[ 326.169193]  Possible unsafe locking scenario:

[ 326.169195]        CPU0                    CPU1
[ 326.169197]
[ 326.169199]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
[ 326.169205]                                lock(&(&priv->fan->lock)->rlock);
[ 326.169211]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
[ 326.169216]   lock(&(&priv->fan->lock)->rlock);
[ 326.169221]
 *** DEADLOCK ***

[ 326.169225] 1 lock held by ldconfig/22297:
[ 326.169229]  #0: (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [] alarm_timer_callback+0xf1/0x179 [nouveau]
[ 326.169253]
stack backtrace:
[ 326.169258] CPU: 7 PID: 22297 Comm: ldconfig Not tainted 3.13.6 #1270
[ 326.169260] Hardware name: System manufacturer System Product Name/P6T SE, BIOS 0603 09/02/2009
[ 326.169264]  90fb6360 8801bfdc3a38 9059e369 0006
[ 326.169273]  90fb61b0 8801bfdc3a88 905998cf 0002
[ 326.169282]  8800b148dbe0 0001 8800b148e1e0 0001
[ 326.169342] Call Trace:
[ 326.169344]  [] dump_stack+0x4e/0x71
[ 326.169352]  [] print_circular_bug+0x2ad/0x2be
[ 326.169356]  [] __lock_acquire+0x10be/0x182b
[ 326.169360]  [] ? check_irq_usage+0x99/0xab
[ 326.169365]  [] lock_acquire+0xce/0x117
[ 326.169384]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
[ 326.169388]  [] _raw_spin_lock_irqsave+0x3f/0x51
[ 326.169407]  [] ? nouveau_fan_update+0xeb/0x252 [nouveau]
[ 326.169426]  [] ? nv04_timer_alarm_trigger+0x18d/0x1cb [nouveau]
[ 326.169445]  [] nouveau_fan_update+0xeb/0x252 [nouveau]
[ 326.169465]  [] nouveau_therm_fan_set+0x14/0x16 [nouveau]
[ 326.169483]  [] nouveau_therm_update+0x303/0x312 [nouveau]
[ 326.169502]  [] nouveau_therm_alarm+0x