[Touch-packages] [Bug 1750937] Re: 4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04) by an insufficient compiler!

2018-03-03 Thread John Sopko
I manage 300+ machines that run openafs, which, like the nvidia driver,
has a kernel module that has to be built by dkms. I also manage dozens
of nvidia gpu servers where users have sudo access and can install
anything they want. Here is a snippet of what I found. Note this is for
16.04 systems, but 14.04 systems running the 4.4.0-116 kernel will have
similar problems:

Short story: if your machine is not using the Ubuntu-supplied gcc, you
will have issues with the afs and nvidia kernel modules, or any other
dkms-built kernel module. Longer story below.

NOTE! This problem affects at least openafs, nvidia, VirtualBox, and
any other dkms-built module. I am going to forward this info to
gr...@cs.unc.edu. This started with the latest Ubuntu 4.4.0-116
kernel version.

Looking through that bug and testing took me hours. The short story is
that the machines having issues with the openafs.ko module are the ones
with the Ubuntu toolchain ppa configured, whose gcc compiler suite does
not support the "retpoline" feature that was recently added to mitigate
the Spectre security issue. The nvidia module will have the same issue.

The machines using the Ubuntu supplied gcc compiler are the ones that
are not having issues. But, host olympia was a special case.

The compiler that works, as reported by "gcc -v":

gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)

The ones that don't work, like host bvisionserver8, report:

gcc version 5.4.1 20160904

You can use "apt-cache policy" to show which repo the compiler comes
from. WARNING: /usr/bin/gcc is a link to /usr/bin/gcc-5 and the gcc
package is only a meta package, so you need to query gcc-5. If you
query gcc, it shows as coming from the standard Ubuntu repo even though
/usr/bin/gcc-5 comes from the toolchain repo.

A good gcc-5 shows:


classroom:55% apt-cache policy gcc-5
gcc-5:
  Installed: 5.4.0-6ubuntu1~16.04.9
  Candidate: 5.4.0-6ubuntu1~16.04.9
  Version table:
 *** 5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.3.1-14ubuntu2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages


The bad compilers show:

bvisionserver8:/> apt-cache policy gcc-5
gcc-5:
  Installed: 5.4.1-2ubuntu1~16.04
  Candidate: 5.4.1-2ubuntu1~16.04
  Version table:
 *** 5.4.1-2ubuntu1~16.04 500
        500 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
     5.4.0-6ubuntu1~16.04.9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
     5.3.1-14ubuntu2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

And you can see that the
/etc/apt/sources.list.d/ubuntu-toolchain-r-ubuntu-test-xenial.list
repo is configured on those machines.
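
The check above can be scripted across many hosts. Below is a minimal
sketch, assuming Python 3 and that you feed it the text printed by
"apt-cache policy gcc-5". The function names installed_origins and
from_toolchain_ppa are made up for illustration; they are not part of
apt. It only reads the entry marked with "***" (the installed version)
and reports the repository URLs listed under it.

```python
import re


def installed_origins(policy_text):
    """Return (installed_version, [repo URLs]) from `apt-cache policy` output.

    Only the version entry flagged with "***" (the installed one) is read.
    """
    installed = None
    origins = []
    in_installed_entry = False
    for line in policy_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Installed:"):
            installed = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("***"):
            in_installed_entry = True
        elif in_installed_entry:
            url = re.match(r"\d+\s+(https?://\S+)", stripped)
            if url:
                origins.append(url.group(1))
            elif re.fullmatch(r"\S+\s+\d+", stripped):
                # "<version> <priority>" starts the next version entry.
                in_installed_entry = False
    return installed, origins


def from_toolchain_ppa(origins):
    """True if any origin of the installed package is a Launchpad PPA."""
    return any("ppa.launchpad.net" in url for url in origins)
```

On a host like bvisionserver8 the installed entry lists only the
ppa.launchpad.net URL, so from_toolchain_ppa would return True there.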


On a good machine modinfo openafs shows that retpoline is turned on in
the vermagic: line:

classroom:56% modinfo openafs
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     4E1BEB8CE16072EF8E64542
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline

And it is not turned on on a bad machine:

bvisionserver8:/> modinfo openafs
filename:       /lib/modules/4.4.0-116-generic/updates/dkms/openafs.ko
license:        http://www.openafs.org/dl/license10.html
srcversion:     66044F5DC18AA3288DB22FF
depends:
vermagic:       4.4.0-116-generic SMP mod_unload modversions
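
The vermagic comparison can be automated as well. This is a minimal
sketch, assuming Python 3 and that the input is the text printed by
"modinfo openafs" (or any other dkms-built module). The helper names
vermagic_flags and built_with_retpoline are hypothetical, chosen only
for this example.

```python
def vermagic_flags(modinfo_text):
    """Return the whitespace-separated tokens of the vermagic: line."""
    for line in modinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vermagic":
            return value.split()
    return []  # no vermagic line found


def built_with_retpoline(modinfo_text):
    """True if the module's vermagic string carries the retpoline flag."""
    return "retpoline" in vermagic_flags(modinfo_text)
```

A module for which built_with_retpoline returns False on a 4.4.0-116
kernel was most likely built by a compiler without retpoline support.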

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to xorg in Ubuntu.
https://bugs.launchpad.net/bugs/1750937

Title:
  4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and
  16.04) by an insufficient compiler!

Status in gcc:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-384 package in Ubuntu:
  Confirmed
Status in xorg package in Ubuntu:
  Confirmed

Bug description:
  Running fine with nvidia-384 until this kernel update came along.
  When booted into the new kernel, got super low resolution and nvidia-
  settings was missing most of its functionality - could not change
  resolution.

  Rebooted into 4.4.0-112 kernel and all was well.

  The root cause of the problem has been found to be installing the -116
  kernel without a sufficiently updated version of gcc.  In my case, my
  system received the gcc update AFTER the kernel update.

  Uninstalling the -116 kernel and reinstalling it with the updated
  version of gcc solved the problem for me.

  
  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: xorg 1:7.7+1ubuntu8.1
  ProcVersionSignature: Ubuntu 4.4.0-112.135~14.04.1-generic 4.4.98
  Uname: Linux 4.4.0-112-generic x86_64
  NonfreeKernelModules:

[Touch-packages] [Bug 1577596] Re: ntpd not started when using ntpdate

2016-07-01 Thread John Sopko
*** This bug is a duplicate of bug 1575572 ***
https://bugs.launchpad.net/bugs/1575572

Another observation: the ntpdate command is really slow on Ubuntu 14.04
and 16.04. I am running ntpdate after boot, and on average it takes
about 6.1 seconds to run. Our Red Hat 6.8 machines take about 0.1
seconds. We manage 300+ Ubuntu 14.04 and 16.04 systems and I checked
ntpdate on several of them. I still can't figure out why most machines
work but some consistently fail to start ntpd. Even on the ones that do
start ntpd, ntpdate still takes 6+ seconds to run, and we have our own
in-house stratum 1 time servers on the same internal network.
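
For anyone who wants to compare timings on their own hosts, here is a
minimal sketch, assuming Python 3. The function name time_command is
made up for this example, and the server name in the comment is a
placeholder, not one of our servers. It reports the average wall-clock
time of a command over a few runs.

```python
import subprocess
import time


def time_command(argv, runs=3):
    """Return the average wall-clock seconds taken by argv over `runs` runs.

    Raises FileNotFoundError if the command is not installed.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


# Example call (placeholder server; -q only queries and does not set the clock):
# time_command(["ntpdate", "-q", "ntp.example.org"])
```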

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/1577596

Title:
  ntpd not started when using ntpdate

Status in init-system-helpers package in Ubuntu:
  Confirmed
Status in ntp package in Ubuntu:
  Confirmed

Bug description:
  After updating from 14.04 to 16.04 on a number of my systems, ntpd no
  longer starts at boot on any of those systems.

  `systemctl status ntp` shows:
  ntp.service - LSB: Start NTP daemon
     Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-sysv-generator(8)
  May 02 19:10:14 host systemd[1]: Stopped LSB: Start NTP daemon.
  May 02 19:10:17 host systemd[1]: Stopped LSB: Start NTP daemon.

  Manually starting it using `systemctl start ntp` works fine.  However,
  systemd does not seem to want to start it automatically at boot time.


  As best as I can tell based on trial and error, there is something
  special about the combination of the service being named "ntp.service"
  and the service depending on network.target.  However, I haven't been
  able to identify exactly what is causing this.

  If I copy the init script to any other name, everything works fine:
  cp /etc/init.d/ntp /etc/init.d/ntpd
  Edit /etc/init.d/ntpd and change "Provides: ntp" to "Provides: ntpd"
  systemctl enable ntpd
  # After a reboot, ntpd.service is started, but ntp.service is not.

  If I remove "$network" from the "# Required-Start: $network $remote_fs
  $syslog" line in /etc/init.d/ntp, then systemd starts it automatically
  ... But of course it is started before the network comes up, so it
  fails.

  If I replace /etc/init.d/ntp with a file containing only the following,
  systemd won't try to start it automatically at boot:
  #!/bin/sh
  ### BEGIN INIT INFO
  # Provides: ntp
  # Required-Start: $network
  # Required-Stop: $network
  # Default-Start: 2 3 4 5
  # Default-Stop: 1
  # Short-Description: Start NTP daemon
  ### END INIT INFO
  echo "script was run" >> /ntp.log

  If I rename that same dummy script to /etc/init.d/ntp2, it is started
  automatically at boot.

  However, grepping the systemd source code and my systemd config files for
  ntp doesn't seem to find anything that might cause this behavior:
  /etc/systemd# grep -iR ntp *
  timesyncd.conf:#NTP=
  timesyncd.conf:#FallbackNTP=ntp.ubuntu.com
  /lib/systemd# grep -R ntp *
  system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/ntpd
  system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/openntpd
  Binary file systemd-networkd matches
  Binary file systemd-timedated matches
  Binary file systemd-timesyncd matches

  What else can I do to debug this further?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/init-system-helpers/+bug/1577596/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 1577596] Re: ntpd not started when using ntpdate

2016-06-30 Thread John Sopko
*** This bug is a duplicate of bug 1575572 ***
https://bugs.launchpad.net/bugs/1575572

I am seeing this on some but not all systems. I can only manually start
after first doing a stop:

root@tophat:~# ps -ef | grep ntp
root      2380  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp
root@tophat:~# systemctl start ntp
root@tophat:~# ps -ef | grep ntp
root      2384  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp
root@tophat:~# systemctl stop ntp
root@tophat:~# systemctl start ntp
root@tophat:~# ps -ef | grep ntp
ntp       2414     1  0 11:51 ?        00:00:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 111:116
root      2416  2365  0 11:51 pts/8    00:00:00 grep --color=auto ntp
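
To check many hosts for this state, the ps test can be scripted. This is
a minimal sketch, assuming Python 3 and the standard eight-column
"ps -ef" layout (UID PID PPID C STIME TTY TIME CMD). The function name
ntpd_running is hypothetical. It looks at the command column only, so
the "grep ... ntp" line itself never counts as a match.

```python
import os


def ntpd_running(ps_output):
    """True if any process in `ps -ef` output has ntpd as its executable."""
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # 7 fixed columns, then the full command
        if len(fields) < 8:
            continue
        executable = fields[7].split()[0]
        if os.path.basename(executable) == "ntpd":
            return True
    return False
```

A host where "systemctl start ntp" has been run but ntpd_running still
returns False is in the state shown above and needs the stop/start
sequence.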
