[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
Summary...

For Ubuntu Bionic, dpkg triggers for systemd (237-3ubuntu10.39) might have caused systemd to hang:

[ 363.776878] wait_for_completion+0xba/0x140
[ 363.776890] __flush_work+0x15b/0x210
[ 363.776901] flush_delayed_work+0x41/0x50
[ 363.776908] fsnotify_wait_marks_destroyed+0x15/0x20
[ 363.776912] fsnotify_destroy_group+0x48/0xd0
[ 363.776917] inotify_release+0x1e/0x50
[ 363.776923] __fput+0xea/0x220
[ 363.776929] fput+0xe/0x10
[ 363.776935] task_work_run+0x9d/0xc0
[ 363.776942] exit_to_usermode_loop+0xc0/0xd0
[ 363.776947] do_syscall_64+0x121/0x130
[ 363.776954] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

and

[ 364.050206] wait_for_completion+0xba/0x140
[ 364.050238] __synchronize_srcu.part.13+0x85/0xb0
[ 364.050248] synchronize_srcu+0x66/0xe0
[ 364.050256] fsnotify_mark_destroy_workfn+0x7b/0xe0
[ 364.050262] process_one_work+0x1de/0x420
[ 364.050267] worker_thread+0x228/0x410
[ 364.050272] kthread+0x121/0x140

and

[ 364.326985] wait_for_completion+0xba/0x140
[ 364.326988] __synchronize_srcu.part.13+0x85/0xb0
[ 364.326993] synchronize_srcu+0x66/0xe0
[ 364.326995] ? synchronize_srcu+0x66/0xe0
[ 364.326996] fsnotify_connector_destroy_workfn+0x4a/0x80
[ 364.326998] process_one_work+0x1de/0x420
[ 364.326999] worker_thread+0x253/0x410
[ 364.327001] kthread+0x121/0x140

All stack traces seem to come from the "fsnotify" subsystem, waiting on delayed work (completion) for fsnotify marks destruction after an inotify_release() was called. Completion did not happen within the past 2 minutes.

Without a kernel dump it is hard to tell whether the completion was still going to happen (the kthread merely being overloaded with scheduled work and/or the marks group destruction) or whether there was a deadlock on the completion due to a kernel bug.

If this is reproducible, I think that having a kernel dump would help identify the issue.

I'm letting the kernel team handle this and marking all other issues as dealt with, per previous comments.
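The observation that every trace passes through fsnotify can be checked mechanically. The following is only a minimal sketch, assuming traces in the flattened "[ timestamp] symbol+0xoff/0xlen" form quoted above; the helper names (`frames`, `mentions`) and the regular expression are illustrative and not part of any kernel tooling. Frames the kernel marked with "?" (unreliable) are skipped.

```python
import re

# One frame as printed in the traces above, e.g.
# "[ 363.776890] __flush_work+0x15b/0x210", optionally marked "? ".
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def frames(trace):
    """Return the function names of the reliable (non-'?') frames, in order."""
    return [m.group(2) for m in FRAME_RE.finditer(trace) if not m.group(1)]

def mentions(trace, prefix):
    """True if any reliable frame's function name starts with `prefix`."""
    return any(name.startswith(prefix) for name in frames(trace))

# The three traces quoted in this comment, copied as-is.
TRACES = [
    """[ 363.776878] wait_for_completion+0xba/0x140
    [ 363.776890] __flush_work+0x15b/0x210
    [ 363.776901] flush_delayed_work+0x41/0x50
    [ 363.776908] fsnotify_wait_marks_destroyed+0x15/0x20
    [ 363.776912] fsnotify_destroy_group+0x48/0xd0
    [ 363.776917] inotify_release+0x1e/0x50""",
    """[ 364.050206] wait_for_completion+0xba/0x140
    [ 364.050238] __synchronize_srcu.part.13+0x85/0xb0
    [ 364.050248] synchronize_srcu+0x66/0xe0
    [ 364.050256] fsnotify_mark_destroy_workfn+0x7b/0xe0
    [ 364.050262] process_one_work+0x1de/0x420""",
    """[ 364.326985] wait_for_completion+0xba/0x140
    [ 364.326988] __synchronize_srcu.part.13+0x85/0xb0
    [ 364.326993] synchronize_srcu+0x66/0xe0
    [ 364.326995] ? synchronize_srcu+0x66/0xe0
    [ 364.326996] fsnotify_connector_destroy_workfn+0x4a/0x80
    [ 364.326998] process_one_work+0x1de/0x420""",
]

for i, trace in enumerate(TRACES, 1):
    names = frames(trace)
    fsnotify_frames = [n for n in names if n.startswith("fsnotify_")]
    print(f"trace {i}: blocked in {names[0]}, fsnotify frames: {fsnotify_frames}")
```

Such a check only confirms where the tasks are blocked; it cannot distinguish a slow completion from a deadlock, which is why a kernel dump is still needed.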
** No longer affects: ipmitool (Ubuntu)

** Changed in: maas
   Status: New => Invalid

** Summary changed:
- commissioning fails due to hung tasks setting up ipmitool
+ [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-completion

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: linux (Ubuntu)

** Project changed: linux => linux (Ubuntu)

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872021

Title:
  [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-completion

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1872021/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
Just re-deployed the machine to check that it works with the same kernel, and it does:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
$ uname -r
4.15.0-96-generic
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
Thanks, I could commission and deploy the machine after disabling the BMC configuration step. After the machine booted, I manually installed ipmitool and ran 'ipmitool lan print' successfully.

After that, I released the machine and commissioned it again with the options "Skip configuring supported BMC controllers with a MAAS generated username and password" and "Allow SSH access and prevent machine powering off".

I ssh'ed into the machine and ran 'apt-get install ipmitool'. The command hung for 2 minutes and then finally finished:

https://pastebin.canonical.com/p/n2YFPFt5nY/

I opened another SSH session, which took ~30s to open, and took the output from dmesg:

https://pastebin.canonical.com/p/PkQXCy4gP4/

This was using the following config:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
$ uname -r
4.15.0-96-generic

The machine is still in this state and I can perform more tests.
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
MAAS instructs cloud-init to install ipmitool during commissioning. If you select "Skip configuring supported BMC controllers with a MAAS generated username and password" and "Allow SSH access and prevent machine powering off", ipmitool won't be installed and the machine will be left on after commissioning/testing to allow for debugging. You can also boot into rescue mode on a failed host.

** Changed in: ipmitool (Ubuntu)
   Status: Incomplete => New
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
@maas team: can the hang in ipmitool be isolated somehow for debugging?

** Changed in: ipmitool (Ubuntu)
   Status: New => Incomplete
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
** Also affects: linux
   Importance: Undecided
   Status: New
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
Hi,

It still gives the same behavior. I ssh'd into the machine and syslog shows the following:

Apr 10 09:00:34 machine-name kernel: [ 78.597093] x86/PAT: ipmi-locate:2583 map pfn expected mapping type uncached-minus for [mem 0xbde1d000-0xbde1dfff], got write-back
Apr 10 09:00:34 machine-name kernel: [ 78.758144] x86/PAT: ipmi-locate:2583 map pfn expected mapping type uncached-minus for [mem 0xbde1c000-0xbde1cfff], got write-back
Apr 10 09:01:08 machine-name systemd[1]: Starting Stop ureadahead data collection...
Apr 10 09:01:08 machine-name systemd[1]: Stopping Read required files in advance...
Apr 10 09:01:08 machine-name systemd[1]: Started Stop ureadahead data collection.
Apr 10 09:01:25 machine-name kernel: [ 129.353796] INFO: rcu_sched detected stalls on CPUs/tasks:
Apr 10 09:01:25 machine-name kernel: [ 129.429260] 0-...!: (0 ticks this GP) idle=c18/0/0 softirq=2098/2098 fqs=0
Apr 10 09:01:25 machine-name kernel: [ 129.524289] (detected by 6, t=15044 jiffies, g=1622, c=1621, q=11120)
Apr 10 09:01:25 machine-name kernel: [ 129.613039] Sending NMI from CPU 6 to CPUs 0:
Apr 10 09:01:25 machine-name kernel: [ 129.613089] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0x12/0x20
Apr 10 09:01:25 machine-name kernel: [ 129.614057] rcu_sched kthread starved for 15067 jiffies! g1622 c1621 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
Apr 10 09:01:25 machine-name kernel: [ 129.755325] rcu_sched I0 9 2 0x8000
Apr 10 09:01:25 machine-name kernel: [ 129.755332] Call Trace:
Apr 10 09:01:25 machine-name kernel: [ 129.755350] __schedule+0x24e/0x880
Apr 10 09:01:25 machine-name kernel: [ 129.755357] ? __switch_to_asm+0x41/0x70
Apr 10 09:01:25 machine-name kernel: [ 129.755363] schedule+0x2c/0x80
Apr 10 09:01:25 machine-name kernel: [ 129.755368] schedule_timeout+0x15d/0x350
Apr 10 09:01:25 machine-name kernel: [ 129.755375] ? __next_timer_interrupt+0xe0/0xe0
Apr 10 09:01:25 machine-name kernel: [ 129.755382] rcu_gp_kthread+0x53a/0x980
Apr 10 09:01:25 machine-name kernel: [ 129.755390] kthread+0x121/0x140
Apr 10 09:01:25 machine-name kernel: [ 129.755394] ? rcu_note_context_switch+0x150/0x150
Apr 10 09:01:25 machine-name kernel: [ 129.755399] ? kthread_create_worker_on_cpu+0x70/0x70
Apr 10 09:01:25 machine-name kernel: [ 129.755404] ret_from_fork+0x22/0x40

** Changed in: maas
   Status: Incomplete => New
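To put the "starved for 15067 jiffies" figure from the syslog excerpt in this message into wall-clock terms, the jiffies count can be divided by the kernel tick rate. This is a rough sketch only: the tick rate is not stated anywhere in this thread, so `ASSUMED_HZ = 250` is an assumption about the running kernel's CONFIG_HZ, and the function name and regular expression are illustrative.

```python
import re

# Assumption: CONFIG_HZ of the running kernel. Not stated in this thread;
# check the kernel config on the affected machine before relying on it.
ASSUMED_HZ = 250

# Matches the "kthread starved" line as it appears in the syslog excerpt.
STALL_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+rcu_sched kthread starved for (\d+) jiffies"
)

def starved_seconds(line, hz=ASSUMED_HZ):
    """Return (uptime_seconds, starved_seconds) for a stall line, else None."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return float(m.group(1)), int(m.group(2)) / hz

line = (
    "Apr 10 09:01:25 machine-name kernel: [ 129.614057] rcu_sched kthread "
    "starved for 15067 jiffies! g1622 c1621 f0x0 RCU_GP_WAIT_FQS(3) "
    "->state=0x402 ->cpu=0"
)
result = starved_seconds(line)
print(result)
```

Under that assumed tick rate the kthread would have been starved for roughly a minute, i.e. for most of the machine's uptime at that point.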
[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool
** Also affects: ipmitool (Ubuntu)
   Importance: Undecided
   Status: New