Public bug reported:

Hello

I have an Ubuntu Core 16 device (RPi CM3) which sometimes (at random)
loses its Ethernet connection.

Core snap is 16-2.40
Kernel snap is pi2-kernel 4.4.0-1120.129

I have now managed to capture the serial console output, and it seems
the device crashes completely; it neither recovers nor fully resets.

[18037.433255] INFO: rcu_sched self-detected stall on CPU
[18037.441257] INFO: rcu_sched detected stalls on CPUs/tasks:
[18037.441264] 1-...: (3708966 ticks this GP) idle=78d/140000000000001/0 softirq=7574/7589 fqs=1588
[18037.441270] (detected by 3, t=4495427 jiffies, g=2096, c=2095, q=35614)
[18037.441272] Task dump for CPU 1:
[18037.441278] manager-control R running      0  2310   2308 0x00000082
[18037.441285] rcu_sched kthread starved for 4493705 jiffies! g2096 c2095 f0x2 s3 ->state=0x0
[18037.720278] 1-...: (3708966 ticks this GP) idle=78d/140000000000001/0 softirq=7574/7589 fqs=1588
[18037.782050] (t=4495512 jiffies g=2096 c=2095 q=35615)
[18037.813311] rcu_sched kthread starved for 4493798 jiffies! g2096 c2095 f0x2 s3 ->state=0x0
[18037.873341] Task dump for CPU 1:
[18037.901730] manager-control R running      0  2310   2308 0x00000082
[18037.933017] [<80112554>] (unwind_backtrace) from [<8010d7dc>] (show_stack+0x20/0x24)
[18037.990378] [<8010d7dc>] (show_stack) from [<801571bc>] (sched_show_task+0xb8/0x110)
[18038.048268] [<801571bc>] (sched_show_task) from [<80159a00>] (dump_cpu_task+0x48/0x4c)
[18038.107812] [<80159a00>] (dump_cpu_task) from [<801919b8>] (rcu_dump_cpu_stacks+0x9c/0xd4)
[18038.168554] [<801919b8>] (rcu_dump_cpu_stacks) from [<80195fec>] (rcu_check_callbacks+0x5c0/0x8bc)
[18038.230663] [<80195fec>] (rcu_check_callbacks) from [<8019c3cc>] (update_process_times+0x4c/0x74)
[18038.294469] [<8019c3cc>] (update_process_times) from [<801b04dc>] (tick_sched_handle+0x64/0x70)
[18038.358337] [<801b04dc>] (tick_sched_handle) from [<801b0550>] (tick_sched_timer+0x68/0xbc)
[18038.423493] [<801b0550>] (tick_sched_timer) from [<8019d1b0>] (__hrtimer_run_queues+0x188/0x364)
[18038.490259] [<8019d1b0>] (__hrtimer_run_queues) from [<8019db4c>] (hrtimer_interrupt+0xd8/0x244)
[18038.557715] [<8019db4c>] (hrtimer_interrupt) from [<80743e5c>] (arch_timer_handler_phys+0x40/0x48)
[18038.626525] [<80743e5c>] (arch_timer_handler_phys) from [<8018bd2c>] (handle_percpu_devid_irq+0x80/0x194)
[18038.696876] [<8018bd2c>] (handle_percpu_devid_irq) from [<80186f64>] (generic_handle_irq+0x34/0x44)
[18038.768565] [<80186f64>] (generic_handle_irq) from [<80187270>] (__handle_domain_irq+0x6c/0xc4)
[18038.840084] [<80187270>] (__handle_domain_irq) from [<801096f4>] (handle_IRQ+0x28/0x2c)
[18038.910902] [<801096f4>] (handle_IRQ) from [<801015e4>] (bcm2836_arm_irqchip_handle_irq+0xb8/0xbc)
[18038.982964] [<801015e4>] (bcm2836_arm_irqchip_handle_irq) from [<808a3678>] (__irq_svc+0x58/0x78)
[18039.055005] Exception stack(0xaea2b9d0 to 0xaea2ba18)
[18039.091313] b9c0:                                     b9e3e01c 00000000 00000025 00000024
[18039.161335] b9e0: ae800040 ffffffff aea2bae4 00011000 aea2bae4 80e0355c 000b0000 aea2ba2c
[18039.230628] ba00: aea2ba30 aea2ba20 8027a628 808a2c30 200f0013 ffffffff
[18039.267788] [<808a3678>] (__irq_svc) from [<808a2c30>] (_raw_spin_lock+0x40/0x54)
[18039.335854] [<808a2c30>] (_raw_spin_lock) from [<8027a628>] (unmap_single_vma+0x1e8/0x64c)
[18039.404789] [<8027a628>] (unmap_single_vma) from [<8027bab4>] (unmap_vmas+0x64/0x78)
[18039.473309] [<8027bab4>] (unmap_vmas) from [<80282e38>] (exit_mmap+0x110/0x214)
[18039.511274] [<80282e38>] (exit_mmap) from [<80122264>] (mmput+0x6c/0x138)
[18039.548375] [<80122264>] (mmput) from [<80128a74>] (do_exit+0x32c/0xb38)
[18039.585148] [<80128a74>] (do_exit) from [<8010db5c>] (die+0x37c/0x38c)
[18039.621494] [<8010db5c>] (die) from [<8011bee0>] (__do_kernel_fault.part.0+0x74/0x1f4)
[18039.688470] [<8011bee0>] (__do_kernel_fault.part.0) from [<808a40d8>] (do_page_fault+0x244/0x3c4)
[18039.756059] [<808a40d8>] (do_page_fault) from [<80101284>] (do_DataAbort+0x58/0xe8)
[18039.822179] [<80101284>] (do_DataAbort) from [<808a35e4>] (__dabt_svc+0x44/0x80)
[18039.887651] Exception stack(0xaea2bd00 to 0xaea2bd48)
[18039.921312] bd00: b9f55fc0 00000037 00000038 00000000 b9036000 ad5f32a0 b9f55fc0 00017000
[18039.987397] bd20: ae80005c b8af3eec 80e0354c aea2bd6c aea2bd70 aea2bd50 802855e8 802a4a5c
[18040.054541] bd40: 60010113 ffffffff
[18040.086842] [<808a35e4>] (__dabt_svc) from [<802a4a5c>] (mem_cgroup_begin_page_stat+0x94/0xa0)
[18040.153350] [<802a4a5c>] (mem_cgroup_begin_page_stat) from [<802855e8>] (page_add_file_rmap+0x1c/0xa4)
[18040.220784] [<802855e8>] (page_add_file_rmap) from [<8027c2d4>] (do_set_pte+0xec/0x100)
[18040.287422] [<8027c2d4>] (do_set_pte) from [<802498e0>] (filemap_map_pages+0x27c/0x298)

and

[18064.097260] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [manager-control:2310]
[18064.163238] Modules linked in: cfg80211 nls_ascii bcm2835_wdt bcm2835_gpiomem spi_bcm2835 uio_pdrv_genirq uio i2c_bcm2708
[18064.233969] CPU: 1 PID: 2310 Comm: manager-control Tainted: G      D      L  4.4.0-1120-raspi2 #129-Ubuntu
[18064.302713] Hardware name: BCM2709
[18064.334894] task: b83f1440 ti: aea2a000 task.ti: aea2a000
[18064.368692] PC is at _raw_spin_lock+0x40/0x54
[18064.400880] LR is at unmap_single_vma+0x1e8/0x64c
[18064.432726] pc : [<808a2c30>]    lr : [<8027a628>]    psr: 200f0013
[18064.432726] sp : aea2ba20  ip : aea2ba30  fp : aea2ba2c
[18064.497639] r10: 000b0000  r9 : 80e0355c  r8 : aea2bae4
[18064.528646] r7 : 00011000  r6 : aea2bae4  r5 : ffffffff  r4 : ae800040
[18064.560481] r3 : 00000024  r2 : 00000025  r1 : 00000000  r0 : b9e3e01c
[18064.591720] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[18064.623483] Control: 10c5383d  Table: 301ec06a  DAC: 00000051
[18064.653583] CPU: 1 PID: 2310 Comm: manager-control Tainted: G      D      L  4.4.0-1120-raspi2 #129-Ubuntu
[18064.711017] Hardware name: BCM2709
[18064.737944] [<80112554>] (unwind_backtrace) from [<8010d7dc>] (show_stack+0x20/0x24)
[18064.793410] [<8010d7dc>] (show_stack) from [<804be668>] (dump_stack+0xc8/0x10c)
[18064.825162] [<804be668>] (dump_stack) from [<80109a78>] (show_regs+0x1c/0x20)
[18064.856288] [<80109a78>] (show_regs) from [<801ed21c>] (watchdog_timer_fn+0x258/0x2c0)
[18064.911793] [<801ed21c>] (watchdog_timer_fn) from [<8019d1b0>] (__hrtimer_run_queues+0x188/0x364)
[18064.968537] [<8019d1b0>] (__hrtimer_run_queues) from [<8019db4c>] (hrtimer_interrupt+0xd8/0x244)
[18065.025759] [<8019db4c>] (hrtimer_interrupt) from [<80743e5c>] (arch_timer_handler_phys+0x40/0x48)
[18065.084353] [<80743e5c>] (arch_timer_handler_phys) from [<8018bd2c>] (handle_percpu_devid_irq+0x80/0x194)
[18065.145475] [<8018bd2c>] (handle_percpu_devid_irq) from [<80186f64>] (generic_handle_irq+0x34/0x44)
[18065.208195] [<80186f64>] (generic_handle_irq) from [<80187270>] (__handle_domain_irq+0x6c/0xc4)
[18065.272665] [<80187270>] (__handle_domain_irq) from [<801096f4>] (handle_IRQ+0x28/0x2c)
[18065.338842] [<801096f4>] (handle_IRQ) from [<801015e4>] (bcm2836_arm_irqchip_handle_irq+0xb8/0xbc)
[18065.407526] [<801015e4>] (bcm2836_arm_irqchip_handle_irq) from [<808a3678>] (__irq_svc+0x58/0x78)
[18065.477520] Exception stack(0xaea2b9d0 to 0xaea2ba18)
[18065.513738] b9c0:                                     b9e3e01c 00000000 00000025 00000024
[18065.583538] b9e0: ae800040 ffffffff aea2bae4 00011000 aea2bae4 80e0355c 000b0000 aea2ba2c

I was told on the snapcraft forum to report this bug here:
https://forum.snapcraft.io/t/watchdog-soft-lockup/13375
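
In case it helps anyone reading the capture above, here is a minimal
sketch of how a raw serial capture like this could be tidied before
analysis. It assumes the input contains only kernel log text, that each
record starts with a "[seconds.microseconds]" timestamp, and that the
terminal cursor-positioning sequences show up either with a real ESC
byte or as a literal "e" (as in "e[12;17H"). The regular expressions and
the names clean_log and call_chain are illustrative assumptions, not
part of any kernel tooling.

```python
import re

# Cursor-positioning residue such as "e[12;17H" (assumption: the ESC byte
# may have been rendered as a literal "e" in the capture).
CURSOR_RE = re.compile(r"(?:\x1b|e)\[\d+;\d+H")
# A kernel log record starts with a "[seconds.micros]" timestamp.
STAMP_RE = re.compile(r"^\[\s*\d+\.\d{6}\]")
# One ARM unwinder frame: "(callee) from [<addr>] (caller+0xoff/0xsize)".
FRAME_RE = re.compile(
    r"\(([\w.]+)\) from \[<[0-9a-f]+>\] \(([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def clean_log(text):
    """Strip cursor sequences and rejoin records wrapped across lines."""
    records = []
    for line in text.splitlines():
        line = CURSOR_RE.sub("", line).strip()
        if not line:
            continue
        if STAMP_RE.match(line) or not records:
            # A timestamp marks the start of a new record.
            records.append(line)
        else:
            # No timestamp: treat the line as a wrapped continuation.
            records[-1] += " " + line
    return records


def call_chain(records):
    """Return (callee, caller) pairs for every backtrace frame found."""
    chain = []
    for record in records:
        match = FRAME_RE.search(record)
        if match:
            chain.append((match.group(1), match.group(2)))
    return chain
```

Feeding the capture through clean_log and then call_chain gives the
backtrace as a flat list of (callee, caller) pairs, which makes it
easier to see which function each frame was called from.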

** Affects: linux-raspi2 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-raspi2 in Ubuntu.
https://bugs.launchpad.net/bugs/1845178

Title:
  watchdog bug: soft lockup

Status in linux-raspi2 package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-raspi2/+bug/1845178/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
