[Touch-packages] [Bug 1750937] Re: 4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04) by an insufficient compiler!
I manage 300+ machines that run OpenAFS, which, like the nvidia driver, has a kernel module that is built by dkms. I also manage dozens of nvidia GPU servers where users have sudo access and can install anything they want. Here is a snippet of what I found. Note that this is for 16.04 systems, but 14.04 systems running the 4.4.0-116 kernel will have similar problems.

Short story: if your machine is not using the Ubuntu-supplied gcc, you will have issues with the afs and nvidia kernel modules, or with any other dkms-built kernel module. Longer story below.

NOTE! This problem affects at least openafs, nvidia, VirtualBox, or any dkms-built module. I am going to forward this info to gr...@cs.unc.edu.

This started with the latest Ubuntu 4.4.0-116 kernel version. Looking through that bug and testing took me hours. The short story is that the machines having issues with the openafs.ko module are the ones that have the Ubuntu toolchain PPA, whose gcc compiler suite does not support the "retpoline" feature that was recently added to mitigate the Spectre security issue. The nvidia module will also have issues. The machines using the Ubuntu-supplied gcc compiler are the ones that are not having issues. But host olympia was a special case.

The compiler that works, using "gcc -v":

    gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)

The ones that don't work, like host bvisionserver8:

    gcc version 5.4.1 20160904

You can use "apt-cache policy gcc" to show what repo the compiler comes from. WARNING: /usr/bin/gcc is a link to /usr/bin/gcc-5, and the gcc package is a meta package, so you need to query gcc-5. If you query gcc, it shows as coming from the standard Ubuntu repo, even though /usr/bin/gcc-5 is coming from the toolchain repo.
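The repo check described above could be scripted along these lines. This is only a sketch: the function names are made up for illustration, the sample text is copied from the "apt-cache policy gcc-5" output on the hosts in this comment, and the parser assumes that source lines under a version start with a numeric priority while version lines do not.

```python
# Sketch: decide whether the installed gcc-5 comes from a PPA by reading
# the text printed by "apt-cache policy gcc-5". Helper names are hypothetical.

GOOD_POLICY = """\
gcc-5:
  Installed: 5.4.0-6ubuntu1~16.04.9
  Candidate: 5.4.0-6ubuntu1~16.04.9
  Version table:
 *** 5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.3.1-14ubuntu2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
"""

BAD_POLICY = """\
gcc-5:
  Installed: 5.4.1-2ubuntu1~16.04
  Candidate: 5.4.1-2ubuntu1~16.04
  Version table:
 *** 5.4.1-2ubuntu1~16.04 500
        500 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
     5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
"""


def installed_sources(policy_text):
    """Return the locations listed under the installed ('***') version."""
    sources = []
    in_installed = False
    for line in policy_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "***":
            # The version marked *** is the one currently installed.
            in_installed = True
            continue
        if in_installed:
            # Assumption: source lines look like "<priority> <location> ...".
            if tokens[0].isdigit() and len(tokens) >= 2:
                sources.append(tokens[1])
            else:
                break  # next version entry reached
    return sources


def comes_from_ppa(policy_text):
    """True if any source of the installed version is on ppa.launchpad.net."""
    return any("ppa.launchpad.net" in s for s in installed_sources(policy_text))


if __name__ == "__main__":
    print("good host uses PPA gcc-5:", comes_from_ppa(GOOD_POLICY))
    print("bad host uses PPA gcc-5:", comes_from_ppa(BAD_POLICY))
```

On a live machine the text would come from running "apt-cache policy gcc-5" (for example via subprocess.run) rather than from the pasted samples.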
A good gcc-5 shows:

    classroom:55% apt-cache policy gcc-5
    gcc-5:
      Installed: 5.4.0-6ubuntu1~16.04.9
      Candidate: 5.4.0-6ubuntu1~16.04.9
      Version table:
     *** 5.4.0-6ubuntu1~16.04.9 500
            500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
            500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
            100 /var/lib/dpkg/status
         5.3.1-14ubuntu2 500
            500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

The bad compilers show:

    bvisionserver8:/> apt-cache policy gcc-5
    gcc-5:
      Installed: 5.4.1-2ubuntu1~16.04
      Candidate: 5.4.1-2ubuntu1~16.04
      Version table:
     *** 5.4.1-2ubuntu1~16.04 500
            500 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu xenial/main amd64 Packages
            100 /var/lib/dpkg/status
         5.4.0-6ubuntu1~16.04.9 500
            500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
            500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
         5.3.1-14ubuntu2 500
            500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

And you can see that the /etc/apt/sources.list.d/ubuntu-toolchain-r-ubuntu-test-xenial.list repo is configured on those machines.

On a good machine, modinfo openafs shows that retpoline is turned on in the vermagic: line:

    classroom:56% modinfo openafs
    filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
    license:        http://www.openafs.org/dl/license10.html
    srcversion:     4E1BEB8CE16072EF8E64542
    depends:
    vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline

And it is not turned on on a bad machine:

    bvisionserver8:/> modinfo openafs
    filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
    license:        http://www.openafs.org/dl/license10.html
    srcversion:     66044F5DC18AA3288DB22FF
    depends:
    vermagic:       4.4.0-116-generic SMP mod_unload modversions

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to xorg in Ubuntu.
https://bugs.launchpad.net/bugs/1750937

Title:
  4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04) by an insufficient compiler!

Status in gcc:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-384 package in Ubuntu:
  Confirmed
Status in xorg package in Ubuntu:
  Confirmed

Bug description:
  Running fine with nvidia-384 until this kernel update came along. When booted into the new kernel, got super low resolution and nvidia-settings was missing most of its functionality - could not change resolution. Rebooted into 4.4.0-112 kernel and all was well.

  The root cause of the problem has been found to be installing the -116 kernel without a sufficiently updated version of gcc. In my case, my system received the gcc update AFTER the kernel update. Uninstalling the -116 kernel and reinstalling it with the updated version of gcc solved the problem for me.

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: xorg 1:7.7+1ubuntu8.1
  ProcVersionSignature: Ubuntu 4.4.0-112.135~14.04.1-generic 4.4.98
  Uname: Linux 4.4.0-112-generic x86_64
  NonfreeKernelModules:
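The vermagic comparison from the comment above (the "retpoline" flag present on a good host, missing on a bad one) can be checked mechanically. The following is a minimal sketch with hypothetical helper names; the sample text is the modinfo output quoted in that comment.

```python
# Sketch: report whether a module was built with retpoline support, based
# on the "vermagic:" line of "modinfo <module>" output. Names are hypothetical.

GOOD_MODINFO = """\
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     4E1BEB8CE16072EF8E64542
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline
"""

BAD_MODINFO = """\
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     66044F5DC18AA3288DB22FF
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions
"""


def vermagic_flags(modinfo_text):
    """Return the whitespace-separated tokens of the vermagic line."""
    for line in modinfo_text.splitlines():
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vermagic":
            return value.split()
    return []


def built_with_retpoline(modinfo_text):
    """True if the vermagic line carries the 'retpoline' flag."""
    return "retpoline" in vermagic_flags(modinfo_text)


if __name__ == "__main__":
    print("good host:", built_with_retpoline(GOOD_MODINFO))
    print("bad host:", built_with_retpoline(BAD_MODINFO))
```

A module rebuilt by dkms with a retpoline-capable gcc should then report the flag, which matches the fix described in the bug description (reinstalling the -116 kernel after gcc was updated).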
[Touch-packages] [Bug 1577596] Re: ntpd not started when using ntpdate
*** This bug is a duplicate of bug 1575572 ***
    https://bugs.launchpad.net/bugs/1575572

Another observation: the ntpdate command is really slow on Ubuntu 14.04 and 16.04. On average it takes about 6.1 seconds to run (I am running ntpdate after boot). Our Red Hat 6.8 machines take about 0.1 seconds. We manage 300+ Ubuntu 14.04 and 16.04 systems, and I checked ntpdate on several of them. I still can't figure out why most machines work but some consistently fail to start ntpd. Even on the ones that do start ntpd, ntpdate still takes 6+ seconds to run, and we have our own in-house stratum 1 time servers on the same internal network.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/1577596

Title:
  ntpd not started when using ntpdate

Status in init-system-helpers package in Ubuntu:
  Confirmed
Status in ntp package in Ubuntu:
  Confirmed

Bug description:
  After updating from 14.04 to 16.04 on a number of my systems, ntpd no longer starts at boot on any of those systems.

  `systemctl status ntp` shows:

  ntp.service - LSB: Start NTP daemon
     Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-sysv-generator(8)

  May 02 19:10:14 host systemd[1]: Stopped LSB: Start NTP daemon.
  May 02 19:10:17 host systemd[1]: Stopped LSB: Start NTP daemon.

  Manually starting it using `systemctl start ntp` works fine. However, systemd does not seem to want to start it automatically at boot time.

  As best as I can tell based on trial and error, there is something special about the combination of the service being named "ntp.service" and the service depending on network.target. However, I haven't been able to identify exactly what is causing this.
  If I copy the init script to any other name, everything works fine:

  cp /etc/init.d/ntp /etc/init.d/ntpd
  Edit /etc/init.d/ntpd and change "Provides: ntp" to "Provides: ntpd"
  systemctl enable ntpd
  # After a reboot, ntpd.service is started, but ntp.service is not.

  If I remove "$network" from the "# Required-Start: $network $remote_fs $syslog" line in /etc/init.d/ntp, then systemd starts it automatically ... But of course it is started before the network comes up, so it fails.

  If I replace /etc/init.d/ntp with a file containing only the following, systemd won't try to start it automatically at boot:

  #!/bin/sh
  ### BEGIN INIT INFO
  # Provides: ntp
  # Required-Start: $network
  # Required-Stop: $network
  # Default-Start: 2 3 4 5
  # Default-Stop: 1
  # Short-Description: Start NTP daemon
  ### END INIT INFO
  echo "script was run" >> /ntp.log

  If I rename that same dummy script to /etc/init.d/ntp2, it is started automatically at boot.

  However, grepping the systemd source code and my systemd config files for ntp doesn't seem to find anything that might cause this behavior:

  /etc/systemd# grep -iR ntp *
  timesyncd.conf:#NTP=
  timesyncd.conf:#FallbackNTP=ntp.ubuntu.com

  /lib/systemd# grep -R ntp *
  system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/ntpd
  system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/openntpd
  Binary file systemd-networkd matches
  Binary file systemd-timedated matches
  Binary file systemd-timesyncd matches

  What else can I do to debug this further?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/init-system-helpers/+bug/1577596/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
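The experiments in the bug description all vary two fields of the LSB header ("Provides" and "Required-Start"), so when comparing init scripts across machines it may help to extract that header mechanically. The following is a sketch only: parse_lsb_header is a hypothetical helper, it ignores continuation lines, and it does not reproduce what systemd-sysv-generator does with the header. The sample is the dummy script from the bug description.

```python
# Sketch: pull the fields out of the "### BEGIN INIT INFO" block of a
# SysV init script. parse_lsb_header is a hypothetical helper name.

DUMMY_SCRIPT = """\
#!/bin/sh
### BEGIN INIT INFO
# Provides: ntp
# Required-Start: $network
# Required-Stop: $network
# Default-Start: 2 3 4 5
# Default-Stop: 1
# Short-Description: Start NTP daemon
### END INIT INFO
echo "script was run" >> /ntp.log
"""


def parse_lsb_header(script_text):
    """Map each LSB header keyword to the list of words in its value."""
    fields = {}
    in_block = False
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped == "### BEGIN INIT INFO":
            in_block = True
            continue
        if stripped == "### END INIT INFO":
            break
        if in_block and stripped.startswith("#") and ":" in stripped:
            # "# Provides: ntp" -> key "Provides", value ["ntp"]
            key, _, value = stripped.lstrip("#").partition(":")
            fields[key.strip()] = value.split()
    return fields


if __name__ == "__main__":
    header = parse_lsb_header(DUMMY_SCRIPT)
    print("Provides:", header.get("Provides"))
    print("Required-Start:", header.get("Required-Start"))
```

Running this over /etc/init.d/ntp on a working and a failing host would show at a glance whether the two headers differ in the fields the experiments above point at.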
[Touch-packages] [Bug 1577596] Re: ntpd not started when using ntpdate
*** This bug is a duplicate of bug 1575572 ***
    https://bugs.launchpad.net/bugs/1575572

I am seeing this on some but not all systems. I can only start it manually after first doing a stop:

    root@tophat:~# ps -ef | grep ntp
    root      2380  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp
    root@tophat:~# systemctl start ntp
    root@tophat:~# ps -ef | grep ntp
    root      2384  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp
    root@tophat:~# systemctl stop ntp
    root@tophat:~# systemctl start ntp
    root@tophat:~# ps -ef | grep ntp
    ntp       2414     1  0 11:51 ?        00:00:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 111:116
    root      2416  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp