** Changed in: sssd (Ubuntu Xenial)
Assignee: (unassigned) => Karl Stenerud (kstenerud)
** Description changed:
+ [Impact]
+
+ sssd does not check whether an existing PID file is stale when it starts.
+ As a result, a PID file left behind by a crashed sssd process prevents
+ sssd from starting again until the file is removed manually.
+
+ [Test Case]
+
+ $ lxc launch ubuntu:xenial tester
+ $ lxc exec tester bash
+
+ # apt update
+ # apt dist-upgrade -y
+ # apt install -y sssd
+ # echo "[nss]
+ filter_groups = root
+ filter_users = root
+ reconnection_retries = 3
+
+ [pam]
+ reconnection_retries = 3
+
+ [sssd]
+ config_file_version = 2
+ reconnection_retries = 3
+ sbus_timeout = 30
+ services = nss, pam
+ domains = europe.example.com,asia.example.com
+
+ [domain/europe.example.com]
+ #With this as false, a simple "getent passwd" for testing won't work. You must do getent passwd [email protected]
+ enumerate = false
+ cache_credentials = true
+
+ id_provider = ldap
+ access_provider = ldap
+ auth_provider = krb5
+ chpass_provider = krb5
+
+ ldap_uri = ldaps://dc1.europe.example.com,ldaps://dc2.europe.example.com
+ ldap_search_base = dc=europe,dc=example,dc=com
+ ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
+
+ #This parameter requires that the DC present a completely validated certificate chain. If you're testing or don't care, use 'allow' or 'never'.
+ ldap_tls_reqcert = demand
+
+ krb5_realm = EUROPE.EXAMPLE.COM
+ dns_discovery_domain = EUROPE.EXAMPLE.COM
+
+ ldap_schema = rfc2307bis
+ ldap_access_order = expire
+ ldap_account_expire_policy = ad
+ ldap_force_upper_case_realm = true
+
+ ldap_user_search_base = dc=europe,dc=example,dc=com
+ ldap_group_search_base = dc=europe,dc=example,dc=com
+ ldap_user_object_class = user
+ ldap_user_name = sAMAccountName
+ ldap_user_fullname = displayName
+ ldap_user_home_directory = unixHomeDirectory
+ ldap_user_principal = userPrincipalName
+ ldap_group_object_class = group
+ ldap_group_name = sAMAccountName
+
+ #Bind credentials
+ ldap_default_bind_dn = cn=europe-ldap-reader,cn=Users,dc=europe,dc=example,dc=com
+ ldap_default_authtok = secret
+
+ [domain/asia.example.com]
+ #With this as false, a simple "getent passwd" for testing won't work. You must do getent passwd [email protected]
+ enumerate = false
+ cache_credentials = true
+
+ id_provider = ldap
+ access_provider = ldap
+ auth_provider = krb5
+ chpass_provider = krb5
+
+ ldap_uri = ldaps://dc1.asia.example.com,ldaps://dc2.asia.example.com
+ ldap_search_base = dc=asia,dc=example,dc=com
+ ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
+
+ #This parameter requires that the DC present a completely validated certificate chain. If you're testing or don't care, use 'allow' or 'never'.
+ ldap_tls_reqcert = demand
+
+ krb5_realm = ASIA.EXAMPLE.COM
+ dns_discovery_domain = ASIA.EXAMPLE.COM
+
+ ldap_schema = rfc2307bis
+ ldap_access_order = expire
+ ldap_account_expire_policy = ad
+ ldap_force_upper_case_realm = true
+
+ ldap_user_search_base = dc=asia,dc=example,dc=com
+ ldap_group_search_base = dc=asia,dc=example,dc=com
+ ldap_user_object_class = user
+ ldap_user_name = sAMAccountName
+ ldap_user_fullname = displayName
+ ldap_user_home_directory = unixHomeDirectory
+ ldap_user_principal = userPrincipalName
+ ldap_group_object_class = group
+ ldap_group_name = sAMAccountName
+
+ #Bind credentials
+ ldap_default_bind_dn = cn=asia-ldap-reader,cn=Users,dc=asia,dc=example,dc=com
+ ldap_default_authtok = secret" >/etc/sssd/sssd.conf
+ # chmod 600 /etc/sssd/sssd.conf
+ # service sssd start
+ # pkill -KILL -F /var/run/sssd.pid
+ # service sssd start
+ # journalctl -xe
+ Oct 30 10:25:46 xtest sssd[7110]: SSSD is already running
+
+ [Regression Potential]
+
+ The change would be to check whether the PID recorded in the PID file
+ still belongs to a running process, which shouldn't cause regressions.
+
+ [Original Description]
+
After having crashed, sssd will not start, because the old PIDfile is still
present. The fact that the process does not exist any more does not cause the
PIDfile to be cleaned up.
This bug is already known, but not fixed, upstream:
https://pagure.io/SSSD/sssd/issue/3528
(also contains instructions for reproduction).
In our environment, with hundreds of computers running Ubuntu, the
'solution' brought forth in that discussion, to investigate and handle
the issue manually, is not a serious option.
So I propose that we make systemd handle the PIDfile in case of a crash.
With the attached one-line patch applied, systemd will clean up the
PIDfile after a crash. That way, sssd doesn't have to make assumptions
about namespaces, but the package still handles the issue.
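The attached patch is not reproduced in this message. As a rough sketch of the approach, and assuming the one-line change sets PIDFile= in the [Service] section of the unit (systemd removes a file configured this way once the service has shut down, if it still exists), a local drop-in with the same effect could be written as follows. The helper name write_pidfile_dropin and the file name pidfile.conf are made up for illustration.

```shell
#!/bin/sh
# Sketch only: write a systemd drop-in that sets PIDFile= for a service.
# write_pidfile_dropin and pidfile.conf are hypothetical names; the real
# patch attached to the bug may differ.

write_pidfile_dropin() {
    dropin_dir=$1   # e.g. /etc/systemd/system/sssd.service.d
    pidfile=$2      # e.g. /var/run/sssd.pid

    # Create the drop-in directory if it does not exist yet.
    mkdir -p "$dropin_dir" || return 1

    # A drop-in only needs the section header and the setting to override.
    printf '[Service]\nPIDFile=%s\n' "$pidfile" > "$dropin_dir/pidfile.conf"
}

# Example use (as root), followed by a reload so systemd picks it up:
#   write_pidfile_dropin /etc/systemd/system/sssd.service.d /var/run/sssd.pid
#   systemctl daemon-reload
```

A drop-in leaves the packaged unit file untouched, so it can be removed again once a fixed package is installed.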
Mandatory data:
Ubuntu version:
- Ubuntu 16.04.4 LTS
+ Ubuntu 16.04.4 LTS
Package version:
- apt-cache policy $(dpkg -S /lib/systemd/system/sssd.service )
- sssd-common: Installed: 1.13.4-1ubuntu1.11
+ apt-cache policy $(dpkg -S /lib/systemd/system/sssd.service )
+ sssd-common: Installed: 1.13.4-1ubuntu1.11
What I expect to happen:
After
- kill -9 $(cat /var/run/sssd.pid)
+ kill -9 $(cat /var/run/sssd.pid)
the command
- systemctl start sssd
+ systemctl start sssd
results in a running sssd.
What happens instead:
No sssd is running. Only after
- rm /var/run/sssd.pid
- systemctl start sssd
+ rm /var/run/sssd.pid
+ systemctl start sssd
does it run again.
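For illustration, the manual workaround and the check described under [Regression Potential] can be combined into a small stale-PID-file test. This is a minimal sketch, not the sssd fix itself: the function names is_stale_pidfile and remove_if_stale are invented here, and the check assumes it runs as root in the same PID namespace as sssd.

```shell
#!/bin/sh
# Sketch only: detect and remove a stale PID file.
# is_stale_pidfile and remove_if_stale are hypothetical helper names.

# Returns 0 if the PID file exists but names no running process.
is_stale_pidfile() {
    pidfile=$1

    # No PID file at all: nothing to clean up.
    [ -f "$pidfile" ] || return 1

    # Read the file and drop surrounding whitespace and newlines.
    pid=$(tr -d '[:space:]' < "$pidfile")

    # Empty or non-numeric content cannot name a live process.
    case $pid in
        ''|*[!0-9]*) return 0 ;;
    esac

    # kill -0 delivers no signal; it only reports whether the PID exists.
    # Without root, a process owned by another user also fails this test.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Removes the PID file only when it is stale.
remove_if_stale() {
    if is_stale_pidfile "$1"; then
        rm -f "$1"
    fi
}

# Example use (as root):
#   remove_if_stale /var/run/sssd.pid
#   systemctl start sssd
```

A bare existence check cannot tell whether a reused PID belongs to sssd or to an unrelated process, and it gives misleading answers across PID namespaces, which is the concern raised in the description above.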
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1777860
Title:
Sssd doesn't clean up PIDfile after crash