Here at Mozilla, we have 200 servers running on an HP Moonshot system, all
with the same hardware configuration and Ubuntu 16.04.2. The OS is not up to
date; we use it as it was released. We use a program to test Firefox
source code, and after each test we reboot the servers using
/sbin/reboot. After a while (between 24 and 48 hours, during which ~6
reboots/h are made), all 200 servers randomly get stuck at reboot
(see the ILO capture), and to bring them back we have to power cycle each
of them.

On one of the beta servers, we made the changes below, enabled debug logging,
and set cron to reboot the server after 5-10 min; however, the reboot freeze
is still present:
- upgraded the OS to Ubuntu 16.04.5 with the latest packages;
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=bios";
- used GRUB_CMDLINE_LINUX_DEFAULT="acpi=off";
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=force";
- upgraded the kernel to v4.15 (the main one from Ubuntu's repo);
- upgraded the kernel to v4.20 from https://kernel.ubuntu.com/~kernel-ppa/mainline/;
- we are now testing the reboot with 4.20.3 from the above repo and working to
update systemd.
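
A minimal sketch of how one of the GRUB_CMDLINE_LINUX_DEFAULT values listed
above can be set, assuming the stock Ubuntu layout where the variable lives in
/etc/default/grub and assuming each value is tried on its own. The helper name
set_cmdline_default and the sample file contents are hypothetical and only
serve as an illustration:

```python
import re


def set_cmdline_default(grub_text: str, params: str) -> str:
    """Return grub_text with GRUB_CMDLINE_LINUX_DEFAULT set to params.

    grub_text is the content of a GRUB defaults file (assumed to be
    /etc/default/grub); params is the kernel command line to use,
    for example "reboot=bios".
    """
    new_line = f'GRUB_CMDLINE_LINUX_DEFAULT="{params}"'
    pattern = re.compile(r"^GRUB_CMDLINE_LINUX_DEFAULT=.*$", re.MULTILINE)
    if pattern.search(grub_text):
        # Replace every existing assignment; the lambda keeps the
        # replacement literal (no backslash processing).
        return pattern.sub(lambda _match: new_line, grub_text)
    # No assignment yet: append one, keeping the file newline-terminated.
    separator = "" if (not grub_text or grub_text.endswith("\n")) else "\n"
    return grub_text + separator + new_line + "\n"


if __name__ == "__main__":
    # Hypothetical sample content, not taken from the affected servers.
    sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    for option in ("reboot=bios", "acpi=off", "reboot=force"):
        print(set_cmdline_default(sample, option))
```

After the result is written back to /etc/default/grub, update-grub has to be
run and the machine rebooted for the change to take effect; cat /proc/cmdline
shows the parameters the running kernel was actually booted with.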

Attached you can find:
- debug log for kernel 4.4.0-66-generic #87-Ubuntu: shutdown-debuglogkernel-4.4.txt
- debug log for kernel 4.15: shutdown-log-kernel4-15.txt
- debug log for kernel 4.20: shutdown-log-kernel420.txt
- ILO capture of the freeze: ILO-reboot-freeze.PNG

Please check all these logs and the capture and let us know of a solution. Thanks.

** Attachment added: "UbuntuBug.zip"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1783499/+attachment/5230309/+files/UbuntuBug.zip

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1783499

Title:
  systemd: Failed to send signal

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dbus/+bug/1783499/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
