Public bug reported: TLDR:
attempting to run ntp in standalone instead of dhcp driven somehow leads to logind/dbus pam timeouts and throw errors to auth.log and results in pam logins via console/ssh/su to be delayed by 25s for said timeout but otherwise work. at first I thought I had a regression on 1591411 but this is different. I do not understand why this happened but presume it is because nm has some weird side effect on network-pre.target so disabling the two ntp units didn't work as I intended. I have since run into 1823098 which has the same starting issue I had which is to run ntp in standalone mode with a provided /etc/ntp.conf that isn't modified by dhcp. I am filing this bug in case somebody else travels down this road, hits these log messages and wonders wtf broke. the ntp issues: - there should be a knob in /etc/default/ntp to specify which form of ntp you want (none/standalone/dhcp/netplan/etc) - vaguely using netplan 'use-ntp' shouldn't be the only way to handle this and even that's locked into a bool instead of a selection since it has sync semantics - dhcp shouldn't assume 'ntp-servers' is requested without checking for it. there's no /etc/default/dhcp that I can see so you have to modify /etc/dhcp/dhclient.conf directly as it is in the request block and not a useful hook. perhaps there's an intended way to do this but I didn't run into it thus far. open to suggestions. meanwhile my solution was simply to disable the two units and no longer be a network-pre.target. it seemed to work for 2days and then I started seeing these on pam auths: === DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS" $ uptime #all users are just xterm login shells for me in X 23:34:24 up 34 days, 2:55, 31 users, load average: 0.55, 0.44, 0.38 $ dpkg -l|egrep 'systemd |dbus-user-session'|awk '{print $1,$2,$3,$4}'|column -t ii dbus-user-session 1.12.2-1ubuntu1.2 amd64 ii libsystemd0:amd64 237-3ubuntu10.42 amd64 ii libsystemd0:i386 237-3ubuntu10.42 i386 ii python3-systemd 234-1build1 amd64 ii systemd 237-3ubuntu10.42 amd64 $ systemd --version systemd 237 +PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid $ ps laxwww|grep dbus-[d] 4 106 1021 1 20 0 52740 3548 - Ss ? 13:45 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 0 1000 1839 1712 20 0 51580 2160 ep_pol Ss ? 1:37 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 0 1000 2286 2273 20 0 50052 984 ep_pol S ? 0:16 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3 ===8<: trigger1 $ date;sudo su - build -c /bin/date Thu Feb 18 22:54:40 PST 2021 [sudo] password for luser: Thu Feb 18 22:55:06 PST 2021 ===8<: trigger1 auth.log Feb 18 22:54:10 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Feb 18 22:54:10 bahx sshd[22024]: pam_systemd(sshd:session): Failed to create session: Connection timed out Feb 18 22:54:10 bahx systemd-logind[1030]: Failed to start session scope session-941.scope: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Feb 18 22:54:10 bahx sshd[22090]: Received disconnect from 127.0.0.1 port 50456:11: disconnected by user Feb 18 22:54:10 bahx sshd[22090]: Disconnected from user luser 127.0.0.1 port 50456 Feb 18 22:54:10 bahx sshd[22024]: pam_unix(sshd:session): session closed for user luser Feb 18 22:54:41 bahx sudo: luser : TTY=pts/26 ; PWD=/home/luser ; USER=root ; COMMAND=/bin/su - build -c /bin/date Feb 18 22:54:41 bahx sudo: pam_unix(sudo:session): session opened for user root by luser(uid=0) Feb 18 22:54:41 bahx su[22111]: Successful su for build by root Feb 18 22:54:41 bahx su[22111]: + /dev/pts/26 root:build Feb 18 22:54:41 bahx su[22111]: pam_unix(su:session): session opened for user build by luser(uid=0) Feb 18 22:55:06 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Feb 18 22:55:06 bahx su[22111]: pam_systemd(su:session): Failed to create session: Connection timed out Feb 18 22:55:06 bahx systemd-logind[1030]: Failed to start user slice user-20000.slice, ignoring: Connection timed out (org.freedesktop.DBus.Error.Timeout) Feb 18 22:55:06 bahx su[22111]: pam_unix(su:session): session closed for user build Feb 18 22:55:06 bahx sudo: pam_unix(sudo:session): session closed for user root ===8<: trigger2 $ date;ssh localhost date Thu Feb 18 22:55:52 PST 2021 generic banner luser@localhost's password: Thu Feb 18 22:56:18 PST 2021 ===8<: trigger2 auth.log Feb 18 22:55:53 bahx sshd[22135]: Accepted password for luser from 127.0.0.1 port 50608 ssh2 Feb 18 22:55:53 bahx sshd[22135]: pam_unix(sshd:session): session opened for user luser by (uid=0) Feb 18 22:55:56 bahx systemd-logind[1030]: Failed to start session scope session-2.scope: Connection timed out Feb 18 22:55:56 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Feb 18 22:56:12 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedesktop.nm_dispatcher': timed out (service_start_timeout=25000ms) Feb 18 22:56:18 bahx sshd[22135]: pam_systemd(sshd:session): Failed to create session: Connection timed out Feb 18 22:56:18 bahx sshd[22202]: Received disconnect from 127.0.0.1 port 50608:11: disconnected by user Feb 18 22:56:18 bahx sshd[22202]: Disconnected from user luser 127.0.0.1 port 50608 Feb 18 22:56:18 bahx sshd[22135]: pam_unix(sshd:session): session closed for user luser Feb 18 22:56:21 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) Feb 18 22:56:21 bahx systemd-logind[1030]: Failed to stop user service: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) === all other references to these log messages end up with a generic restart logind/daemon reexec/reload and ignore the cause so I didn't repeat them blindly. instead i reverted and reenabled/started the two units I disabled/stopped and voila pam worked and no 25s delay on login. === $ for i in /var/log/auth.log*;do echo $i|egrep -q gz;(test $? -eq 0&&gzip -dc $i||cat $i)|egrep -i systemd1;done|awk 'BEGIN{IFS=":"}{print $1,$2}'|sort|uniq -c 2 Feb 15 277 Feb 17 424 Feb 18 #mostly me triaging ssh/sudo su $ for i in /var/log/auth.log*;do echo $i|egrep -q gz\$;(test $? -eq 0&&gzip -dc $i||cat $i)|egrep -i systemd1|awk '{print "'$i'",$0}';done|egrep 'Feb 15' /var/log/auth.log Feb 15 06:51:05 bahx polkitd(authority=local): Operator of unix-session:c2 FAILED to authenticate to gain authorization for action org.freedesktop.systemd1.manage-units for system-bus-name::1.12149 [systemctl start ntp.service] (owned by unix-user:luser) /var/log/auth.log Feb 15 07:06:43 bahx polkitd(authority=local): Operator of unix-session:c2 FAILED to authenticate to gain authorization for action org.freedesktop.systemd1.manage-units for system-bus-name::1.12173 [systemctl reset-failed ntp-systemd-netif.service] (owned by unix-user:luser) === this was me attempting to shut them down in a xterm without sudo and canceling the auth dialog and redoing them with sudo. the units disabled/stopped: ntp-systemd-netif.service ntp-systemd-netif.path because of: === $ systemctl list-units --failed --all UNIT LOAD ACTIVE SUB DESCRIPTION ● ntp-systemd-netif.service loaded failed failed ntp-systemd-netif.service with: sudo systemctl reset-failed ntp-systemd-netif.service === at this point I have disabled/stopped the units again and pam is still happy. I'll let it run to see if it returns. i did change one thing though since then. because the failure of the unit was because the target script hook for ntp was removed I replaced the ntp hook with a simple exit 0 bash script so it doesn't trigger failure. perhaps that will be enough to prevent this cascade? time will tell in any event something this minor in a single daemon shouldn't break pam across the system I should think. that's rather a violent overreach methinks. whether this classifies under nm, dbus, systemd-logind, or ntp is the real mystery. nm barked first it seems. there's some nuance in ntp's systemd config not aligning with nm I guess? why it took 2 days to trigger I assume is like the previous issue requiring enough usage or SOMAXCONN or something like it before 'it' happens. still tbd. i triggered another nonsudo auth dialog to try to reproduce but seems fine still ** Affects: ubuntu Importance: Undecided Status: New -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1916226 Title: pam login 25s timeout delay due to tickling ntp systemd units To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+bug/1916226/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs