[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-22 Thread Rafael David Tinoco
Summary...

For Ubuntu Bionic, dpkg triggers for systemd (237-3ubuntu10.39) might
have caused systemd to hang:

[  363.776878]  wait_for_completion+0xba/0x140
[  363.776890]  __flush_work+0x15b/0x210
[  363.776901]  flush_delayed_work+0x41/0x50
[  363.776908]  fsnotify_wait_marks_destroyed+0x15/0x20
[  363.776912]  fsnotify_destroy_group+0x48/0xd0
[  363.776917]  inotify_release+0x1e/0x50
[  363.776923]  __fput+0xea/0x220
[  363.776929]  fput+0xe/0x10
[  363.776935]  task_work_run+0x9d/0xc0
[  363.776942]  exit_to_usermode_loop+0xc0/0xd0
[  363.776947]  do_syscall_64+0x121/0x130
[  363.776954]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

and

[  364.050206]  wait_for_completion+0xba/0x140
[  364.050238]  __synchronize_srcu.part.13+0x85/0xb0
[  364.050248]  synchronize_srcu+0x66/0xe0
[  364.050256]  fsnotify_mark_destroy_workfn+0x7b/0xe0
[  364.050262]  process_one_work+0x1de/0x420
[  364.050267]  worker_thread+0x228/0x410
[  364.050272]  kthread+0x121/0x140

and

[  364.326985]  wait_for_completion+0xba/0x140
[  364.326988]  __synchronize_srcu.part.13+0x85/0xb0
[  364.326993]  synchronize_srcu+0x66/0xe0
[  364.326995]  ? synchronize_srcu+0x66/0xe0
[  364.326996]  fsnotify_connector_destroy_workfn+0x4a/0x80
[  364.326998]  process_one_work+0x1de/0x420
[  364.326999]  worker_thread+0x253/0x410
[  364.327001]  kthread+0x121/0x140

All stack traces seem to come from the "fsnotify" subsystem, waiting on
delayed work (completion) for fsnotify marks destruction after an
inotify_release() was called. Completion did not happen for the past 2
minutes. Without a kernel dump it is hard to tell whether completion was
still on its way - with the kthread overloaded doing scheduled work
and/or the marks group destruction - or whether there was a deadlock on
the completion due to a kernel bug.
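
As a rough aid for anyone triaging similar dmesg output, the sketch below
pulls the function names out of a hung-task stack trace and lists the frames
belonging to a given subsystem. It is only an illustration: the helper names
(parse_frames, functions_matching) are made up for this example, and the
regular expression assumes the frame layout shown in the traces above.

```python
import re

# One stack frame as printed in dmesg, for example
#   [  363.776912]  fsnotify_destroy_group+0x48/0xd0
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(text):
    """Return a list of (timestamp, function, reliable) for each frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            reliable = m.group("unreliable") is None
            frames.append((float(m.group("ts")), m.group("func"), reliable))
    return frames


def functions_matching(frames, needle):
    """Reliable frames whose function name contains needle, in trace order."""
    return [func for _, func, reliable in frames if reliable and needle in func]


# First trace from the summary above, copied verbatim.
SAMPLE = """\
[  363.776878]  wait_for_completion+0xba/0x140
[  363.776890]  __flush_work+0x15b/0x210
[  363.776901]  flush_delayed_work+0x41/0x50
[  363.776908]  fsnotify_wait_marks_destroyed+0x15/0x20
[  363.776912]  fsnotify_destroy_group+0x48/0xd0
[  363.776917]  inotify_release+0x1e/0x50
[  363.776923]  __fput+0xea/0x220
[  363.776929]  fput+0xe/0x10
[  363.776935]  task_work_run+0x9d/0xc0
[  363.776942]  exit_to_usermode_loop+0xc0/0xd0
[  363.776947]  do_syscall_64+0x121/0x130
[  363.776954]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print("innermost frame:", frames[0][1])
    print("fsnotify frames:", functions_matching(frames, "fsnotify"))
```

Running the same helpers over the other two traces should show whether they
also pass through fsnotify work functions, which is the pattern described
above.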

If this is reproducible, I think that having a kernel dump would help
identify the issue. I'm letting the kernel team handle this and
marking all other issues as dealt with, per previous comments.


** No longer affects: ipmitool (Ubuntu)

** Changed in: maas
   Status: New => Invalid

** Summary changed:

- commissioning fails due to hung tasks setting up ipmitool
+ [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-completion

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: linux (Ubuntu)

** Project changed: linux => linux (Ubuntu)

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872021

Title:
  [Ubuntu][Bionic] systemd caused kernel to hang on fsnotify wait-on-
  completion

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1872021/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-21 Thread Laurent Sesques
Just re-deployed the machine to check that it worked with the same kernel, and 
it does:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
$ uname -r
4.15.0-96-generic

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-20 Thread Laurent Sesques
Thanks, I could commission and deploy the machine after disabling the BMC 
configuration step.
After the machine booted, I manually installed ipmitool and ran 'ipmitool lan 
print' successfully.

After that, I released the machine and commissioned it again with the options 
"Skip configuring supported BMC controllers with a MAAS generated username and 
password" and "Allow SSH access and prevent machine powering off".
I ssh'd into the machine and ran 'apt-get install ipmitool'.
The command hung for 2 minutes, and then finally finished: 
https://pastebin.canonical.com/p/n2YFPFt5nY/
I opened another SSH session, which took ~30s to open, and took the output from 
dmesg:
https://pastebin.canonical.com/p/PkQXCy4gP4/

This was using the following config:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic
$ uname -r
4.15.0-96-generic

The machine is still in this state and I can perform more tests.

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-17 Thread Lee Trager
MAAS instructs cloud-init to install ipmitool during commissioning. If
you select "Skip configuring supported BMC controllers with a MAAS
generated username and password" and "Allow SSH access and prevent
machine powering off", ipmitool won't be installed and the machine will
be left on after commissioning/testing to allow for debugging. You can
also boot into rescue mode on a failed host.

** Changed in: ipmitool (Ubuntu)
   Status: Incomplete => New

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-16 Thread Christian Ehrhardt 
@maas team: can the hang in ipmitool be isolated somehow for debugging?

** Changed in: ipmitool (Ubuntu)
   Status: New => Incomplete

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-10 Thread Lee Trager
** Also affects: linux
   Importance: Undecided
   Status: New

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-10 Thread Laurent Sesques
Hi,

It still gives the same behavior.
I ssh'd into the machine and syslog shows the following:

Apr 10 09:00:34 machine-name kernel: [   78.597093] x86/PAT: ipmi-locate:2583 map pfn expected mapping type uncached-minus for [mem 0xbde1d000-0xbde1dfff], got write-back
Apr 10 09:00:34 machine-name kernel: [   78.758144] x86/PAT: ipmi-locate:2583 map pfn expected mapping type uncached-minus for [mem 0xbde1c000-0xbde1cfff], got write-back
Apr 10 09:01:08 machine-name systemd[1]: Starting Stop ureadahead data collection...
Apr 10 09:01:08 machine-name systemd[1]: Stopping Read required files in advance...
Apr 10 09:01:08 machine-name systemd[1]: Started Stop ureadahead data collection.
Apr 10 09:01:25 machine-name kernel: [  129.353796] INFO: rcu_sched detected stalls on CPUs/tasks:
Apr 10 09:01:25 machine-name kernel: [  129.429260] 0-...!: (0 ticks this GP) idle=c18/0/0 softirq=2098/2098 fqs=0
Apr 10 09:01:25 machine-name kernel: [  129.524289] (detected by 6, t=15044 jiffies, g=1622, c=1621, q=11120)
Apr 10 09:01:25 machine-name kernel: [  129.613039] Sending NMI from CPU 6 to CPUs 0:
Apr 10 09:01:25 machine-name kernel: [  129.613089] NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0x12/0x20
Apr 10 09:01:25 machine-name kernel: [  129.614057] rcu_sched kthread starved for 15067 jiffies! g1622 c1621 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
Apr 10 09:01:25 machine-name kernel: [  129.755325] rcu_sched   I0 9  2 0x8000
Apr 10 09:01:25 machine-name kernel: [  129.755332] Call Trace:
Apr 10 09:01:25 machine-name kernel: [  129.755350]  __schedule+0x24e/0x880
Apr 10 09:01:25 machine-name kernel: [  129.755357]  ? __switch_to_asm+0x41/0x70
Apr 10 09:01:25 machine-name kernel: [  129.755363]  schedule+0x2c/0x80
Apr 10 09:01:25 machine-name kernel: [  129.755368]  schedule_timeout+0x15d/0x350
Apr 10 09:01:25 machine-name kernel: [  129.755375]  ? __next_timer_interrupt+0xe0/0xe0
Apr 10 09:01:25 machine-name kernel: [  129.755382]  rcu_gp_kthread+0x53a/0x980
Apr 10 09:01:25 machine-name kernel: [  129.755390]  kthread+0x121/0x140
Apr 10 09:01:25 machine-name kernel: [  129.755394]  ? rcu_note_context_switch+0x150/0x150
Apr 10 09:01:25 machine-name kernel: [  129.755399]  ? kthread_create_worker_on_cpu+0x70/0x70
Apr 10 09:01:25 machine-name kernel: [  129.755404]  ret_from_fork+0x22/0x40
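
To put the jiffies figures in the RCU stall messages into wall-clock terms,
here is a small sketch that extracts them and converts them to seconds. The
tick rate is an assumption: the log does not state CONFIG_HZ, so 250 is used
as a placeholder and should be checked against /boot/config-$(uname -r) on
the affected machine. The function names are made up for this example.

```python
import re

# Assumption: the running kernel was built with CONFIG_HZ=250. This value is
# NOT taken from the log; verify it on the affected machine before relying
# on the converted numbers.
ASSUMED_HZ = 250

# \s+ between words so the patterns also match syslog lines that were
# wrapped by a mail client.
DETECTED_RE = re.compile(r"\(detected\s+by\s+(\d+),\s+t=(\d+)\s+jiffies")
STARVED_RE = re.compile(r"kthread\s+starved\s+for\s+(\d+)\s+jiffies")


def jiffies_to_seconds(jiffies, hz=ASSUMED_HZ):
    """Convert a jiffies count to seconds for a given tick rate."""
    return jiffies / hz


def stall_summary(log_text, hz=ASSUMED_HZ):
    """Collect the stall durations reported in an rcu_sched stall warning."""
    summary = {}
    m = DETECTED_RE.search(log_text)
    if m:
        summary["detecting_cpu"] = int(m.group(1))
        summary["stall_seconds"] = jiffies_to_seconds(int(m.group(2)), hz)
    m = STARVED_RE.search(log_text)
    if m:
        summary["starved_seconds"] = jiffies_to_seconds(int(m.group(1)), hz)
    return summary


# The two relevant kernel messages from the syslog excerpt above.
SAMPLE = """\
[  129.524289] (detected by 6, t=15044 jiffies, g=1622, c=1621, q=11120)
[  129.614057] rcu_sched kthread starved for 15067 jiffies! g1622 c1621 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
"""

if __name__ == "__main__":
    for key, value in stall_summary(SAMPLE).items():
        print(key, value)
```

Under the HZ=250 assumption both figures come out at roughly 60 seconds; with
a different tick rate the numbers scale accordingly.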

** Changed in: maas
   Status: Incomplete => New

[Bug 1872021] Re: commissioning fails due to hung tasks setting up ipmitool

2020-04-10 Thread Alberto Donato
** Also affects: ipmitool (Ubuntu)
   Importance: Undecided
   Status: New
