[Bug 1777860] Re: Sssd doesn't clean up PIDfile after crash

Karl Stenerud Tue, 30 Oct 2018 03:52:26 -0700

** Changed in: sssd (Ubuntu Xenial)
     Assignee: (unassigned) => Karl Stenerud (kstenerud)


** Description changed:

+ [Impact]
+ 
+ sssd doesn't check whether an existing PIDfile is still valid when
+ starting. As a result, a PIDfile left behind by a crashed sssd process
+ prevents it from launching again until the PIDfile is manually removed.
+ 
+ [Test Case]
+ 
+ $ lxc launch ubuntu:xenial tester
+ $ lxc exec tester bash
+ 
+ # apt update
+ # apt dist-upgrade -y
+ # apt install -y sssd
+ # echo "[nss]
+ filter_groups = root
+ filter_users = root
+ reconnection_retries = 3
+ 
+ [pam]
+ reconnection_retries = 3
+ 
+ [sssd]
+ config_file_version = 2
+ reconnection_retries = 3
+ sbus_timeout = 30
+ services = nss, pam
+ domains = europe.example.com,asia.example.com
+ 
+ [domain/europe.example.com]
+ #With this as false, a simple "getent passwd" for testing won't work. You must do getent passwd [email protected]
+ enumerate = false
+ cache_credentials = true
+ 
+ id_provider = ldap
+ access_provider = ldap
+ auth_provider = krb5
+ chpass_provider = krb5
+ 
+ ldap_uri = ldaps://dc1.europe.example.com,ldaps://dc2.europe.example.com
+ ldap_search_base = dc=europe,dc=example,dc=com
+ ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
+ 
+ #This parameter requires that the DC present a completely validated certificate chain. If you're testing or don't care, use 'allow' or 'never'.
+ ldap_tls_reqcert = demand
+ 
+ krb5_realm = EUROPE.EXAMPLE.COM
+ dns_discovery_domain = EUROPE.EXAMPLE.COM
+ 
+ ldap_schema = rfc2307bis
+ ldap_access_order = expire
+ ldap_account_expire_policy = ad
+ ldap_force_upper_case_realm = true
+ 
+ ldap_user_search_base = dc=europe,dc=example,dc=com
+ ldap_group_search_base = dc=europe,dc=example,dc=com
+ ldap_user_object_class = user
+ ldap_user_name = sAMAccountName
+ ldap_user_fullname = displayName
+ ldap_user_home_directory = unixHomeDirectory
+ ldap_user_principal = userPrincipalName
+ ldap_group_object_class = group
+ ldap_group_name = sAMAccountName
+ 
+ #Bind credentials
+ ldap_default_bind_dn = cn=europe-ldap-reader,cn=Users,dc=europe,dc=example,dc=com
+ ldap_default_authtok = secret
+ 
+ [domain/asia.example.com]
+ #With this as false, a simple "getent passwd" for testing won't work. You must do getent passwd [email protected]
+ enumerate = false
+ cache_credentials = true
+ 
+ id_provider = ldap
+ access_provider = ldap
+ auth_provider = krb5
+ chpass_provider = krb5
+ 
+ ldap_uri = ldaps://dc1.asia.example.com,ldaps://dc2.asia.example.com
+ ldap_search_base = dc=asia,dc=example,dc=com
+ ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
+ 
+ #This parameter requires that the DC present a completely validated certificate chain. If you're testing or don't care, use 'allow' or 'never'.
+ ldap_tls_reqcert = demand
+ 
+ krb5_realm = ASIA.EXAMPLE.COM
+ dns_discovery_domain = ASIA.EXAMPLE.COM
+ 
+ ldap_schema = rfc2307bis
+ ldap_access_order = expire
+ ldap_account_expire_policy = ad
+ ldap_force_upper_case_realm = true
+ 
+ ldap_user_search_base = dc=asia,dc=example,dc=com
+ ldap_group_search_base = dc=asia,dc=example,dc=com
+ ldap_user_object_class = user
+ ldap_user_name = sAMAccountName
+ ldap_user_fullname = displayName
+ ldap_user_home_directory = unixHomeDirectory
+ ldap_user_principal = userPrincipalName
+ ldap_group_object_class = group
+ ldap_group_name = sAMAccountName
+ 
+ #Bind credentials
+ ldap_default_bind_dn = cn=asia-ldap-reader,cn=Users,dc=asia,dc=example,dc=com
+ ldap_default_authtok = secret" >/etc/sssd/sssd.conf
+ # chmod 600 /etc/sssd/sssd.conf
+ # service sssd start
+ # pkill -KILL -F /var/run/sssd.pid
+ # service sssd start
+ # journalctl -xe
+ Oct 30 10:25:46 xtest sssd[7110]: SSSD is already running
+ 
+ [Regression Potential]
+ 
+ The change would be to check if the pid in the file is still active,
+ which shouldn't cause regressions.
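
The liveness check described above can be sketched in shell. This is only an illustration, not the actual sssd change (the patch itself is not part of this message); the function name `pidfile_is_stale` is hypothetical. It relies on `kill -0`, which delivers no signal and only reports whether the process can be signalled.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the PID file exists but does not
# name a running process (i.e. it is stale), 1 otherwise.
# Assumption: run as root, as an init script would be. For an
# unprivileged caller, a permission error from kill would also be
# reported as "stale".
pidfile_is_stale() {
    pidfile=$1

    # No PID file at all: nothing stale to clean up.
    [ -f "$pidfile" ] || return 1

    # Only the first line is expected to hold the PID.
    pid=$(head -n 1 "$pidfile" 2>/dev/null)

    # Empty or non-numeric content cannot name a live process.
    case $pid in
        ''|*[!0-9]*) return 0 ;;
    esac

    # kill -0 sends no signal; it only tests whether the PID exists.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

A caller could run `pidfile_is_stale /var/run/sssd.pid && rm -f /var/run/sssd.pid` before starting the service. A bare PID check cannot tell whether a live PID has been reused by an unrelated process, so a more careful check would also compare the process name.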
+ 
+ [Original Description]
+ 
  After having crashed, sssd will not start, because the old PIDfile is still
  present. The fact that the process does not exist any more does not cause the
  PIDfile to be cleaned up.
  This bug is already known, but not fixed, upstream:
  https://pagure.io/SSSD/sssd/issue/3528
  (also contains instructions for reproduction).
  
  In our environment, with hundreds of computers running Ubuntu, the
  'solution' brought forth in that discussion, to investigate and handle
  the issue manually, is not a serious option.
  
  So I propose that we make systemd handle the PIDfile in case of a crash.
  With the attached one-line patch applied, systemd will clean up the
  PIDfile after a crash. That way, sssd doesn't have to make assumptions
  about namespaces, but the package still handles the issue.
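
The attached patch is not reproduced in this notification, so the following is only a guess at the general approach. systemd's `PIDFile=` setting in a unit's `[Service]` section names the file holding the main PID, and the systemd.service documentation states that systemd removes that file after the service has shut down if it still exists. The sketch below writes such a setting as a drop-in. The helper name `write_pidfile_dropin` and the file name `pidfile.conf` are hypothetical; the path `/var/run/sssd.pid` comes from the report.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in declaring a service's PID file.
#   $1 = drop-in directory, e.g. /etc/systemd/system/sssd.service.d
#   $2 = PID file path,     e.g. /var/run/sssd.pid
write_pidfile_dropin() {
    dropin_dir=$1
    pidfile=$2

    # Create the drop-in directory if it does not exist yet.
    mkdir -p "$dropin_dir" || return 1

    # A drop-in only needs the section header and the overriding setting.
    printf '[Service]\nPIDFile=%s\n' "$pidfile" > "$dropin_dir/pidfile.conf"
}
```

As root, this would be called as `write_pidfile_dropin /etc/systemd/system/sssd.service.d /var/run/sssd.pid`, followed by `systemctl daemon-reload`. Whether this matches the one-line patch attached to the bug is an assumption.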
  
  Mandatory data:
  
  Ubuntu version:
-   Ubuntu 16.04.4 LTS
+   Ubuntu 16.04.4 LTS
  
  Package version:
-   apt-cache policy $(dpkg -S /lib/systemd/system/sssd.service )
-    sssd-common: Installed: 1.13.4-1ubuntu1.11
+   apt-cache policy $(dpkg -S /lib/systemd/system/sssd.service )
+    sssd-common: Installed: 1.13.4-1ubuntu1.11
  
  What I expect to happen:
  After
-   kill -9 $(cat /var/run/sssd.pid)
+   kill -9 $(cat /var/run/sssd.pid)
  the command
-   systemctl start sssd
+   systemctl start sssd
  results in a running sssd.
  
  What happens instead:
  No sssd is running. Only after
-   rm /var/run/sssd.pid
-   systemctl start sssd
+   rm /var/run/sssd.pid
+   systemctl start sssd
  does it run again.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1777860

Title:
  Sssd doesn't clean up PIDfile after crash

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1777860/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
