Here at Mozilla, we have 200 servers running on an HP Moonshot system. All have the same hardware configuration and run Ubuntu 16.04.2. The OS is not up to date; we use it as it was released. We use a program to test Firefox source code, and after each test we reboot the servers using /sbin/reboot. After a while (between 24 and 48 h, during which ~6 reboots/h are made), all 200 servers randomly get stuck at reboot (see the ILO capture), and to bring them back we have to power cycle each of them.
On one of the beta servers, we made the changes below, enabled debug, and set cron to reboot the server after 5-10 min; however, the reboot freeze is still present:
- upgraded the OS to Ubuntu 16.04.5 with the latest packages;
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=bios";
- used GRUB_CMDLINE_LINUX_DEFAULT="acpi=off";
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=force";
- upgraded the kernel to v4.15 (the main one from Ubuntu's repo);
- upgraded the kernel to v4.20 from https://kernel.ubuntu.com/~kernel-ppa/mainline/;
- we are now testing the reboot with 4.20.3 from the above repo and working to update systemd.

Attached you can find the debug logs for:
- kernel 4.4.0-66-generic #87-Ubuntu: shutdown-debuglogkernel-4.4.txt
- kernel 4.15: shutdown-log-kernel4-15.txt
- kernel 4.20: shutdown-log-kernel420.txt
- ILO capture of the freeze: ILO-reboot-freeze.PNG

Please check these logs and the capture and let us know of a solution.

Thanks.

** Attachment added: "UbuntuBug.zip"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1783499/+attachment/5230309/+files/UbuntuBug.zip

https://bugs.launchpad.net/bugs/1783499
Title: systemd: Failed to send signal
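
Because each GRUB_CMDLINE_LINUX_DEFAULT value only takes effect once the GRUB configuration has been regenerated and the machine rebooted, it may help to confirm which parameter was actually active for a given test run. The following is a minimal sketch of such a check, not part of our test setup: the function name `active_tested_params` and the `TESTED_PARAMS` tuple are hypothetical, and it assumes the parameters appear as whole tokens in /proc/cmdline.

```python
# Sketch: report which of the kernel parameters tried in this report
# are present on the command line of the running kernel.
from pathlib import Path
from typing import List

# Parameters tried on the beta server, one per boot, as listed in this report.
TESTED_PARAMS = ("reboot=bios", "reboot=force", "acpi=off")


def active_tested_params(cmdline: str) -> List[str]:
    """Return the tested parameters that appear as whole tokens in cmdline."""
    tokens = cmdline.split()
    return [param for param in TESTED_PARAMS if param in tokens]


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    cmdline = Path("/proc/cmdline").read_text()
    print("kernel command line:", cmdline.strip())
    print("tested parameters active:", active_tested_params(cmdline) or "none")
```

Running this once after each boot, next to the shutdown debug log, would tie each log file to the parameter that was really in effect.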
