** Summary changed:

- ad_use_ldaps error could not start tls encryption
+ ldap_install_tls occasionally fails due to watchdog timeout when using ad_use_ldaps with tls

** Description changed:

- New sssd.conf variable ad_use_ldaps not working. On starting sssd it
- errors with "sssd[be[13765]: Could not start TLS encryption. (unknown
- error code)"
+ [Impact]
- # lsb_release -rd
- Description: Ubuntu 18.04.5 LTS
- Release: 18.04
- Note: problem also seen with Ubuntu 20.04.2
- # apt-cache policy sssd | grep Installed
- Installed: 1.16.1-1ubuntu1.7
+ If you enable ad_use_ldaps in your sssd config and have sssd
+ configured to use TLS instead of the regular GSS-SPNEGO or GSSAPI
+ encryption, then on a slow AD server or a busy network the watchdog
+ can time out the call to ldap_install_tls() before it completes. The
+ TLS handshake then fails and sssd cannot connect to the AD server.
- Expectation
- Adding ad_use_ldaps to a working AD integrated /etc/sssd/sssd.conf to use port 636 instead of port 389 due ADV 190023. Reference https://bugs.launchpad.net/ubuntu/focal/+source/sssd/+bug/1868703/
+ If you set debug_level to 4 or higher, you will see the following in
+ sssd_ldap_server.log:
- Problem
- Added a working Public root CA cert to the common ca-certificate (/etc/ssl/ca-certificates) and /etc/ldap/ldap.conf has following set:
- TLS_CACERT /etc/ssl/certs/ca-certificates.crt
- An ldapsearch using the above certificate bundle against LDAPS is successful:
-
- # openssl s_client -connect company-ad-server.company.com:636
- CONNECTED(00000005)
- # ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
- ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
- SASL/GSSAPI authentication started
- SASL username: superduperu...@company.com
- SASL SSF: 0
- filter: (sAMAccountName=superduperuser)
- requesting: All userApplication attributes
- <snip>
- # Duperuser\2C Super ADM, Users, Admin, company.com
- dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
- <snip>
-
- sssd.conf is configured with:
- [sssd]
- domains = company.com
- config_file_version = 2
- services = nss, pam
-
- [domain/company.com]
- ad_domain = company.com
- krb5_realm = company.com
- realmd_tags = manages-system joined-with-adcli
- cache_credentials = True
- id_provider = ad
- krb5_store_password_if_offline = True
- default_shell = /bin/bash
- use_fully_qualified_names = True
- fallback_homedir = /home/%u@%d
- ldap_id_mapping = True
- ad_use_ldaps = True
- ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
- auth_provider = ad
- access_provider = simple
- simple_allow_groups = linux-admins
-
- Stopping sssd, clearing sssd cache, starting sssd returns following error:
- sssd[be[13765]: Could not start TLS encryption. (unknown error code)
-
- Setting debug_level = 4 (or higher) returns following around this unknown error:
  [set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
  [be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
  [ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
  [ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
  [sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
  [sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
  [sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
  [sss_ldap_init_state_destructor] (0x0400): closing socket [18]
  [sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
  [fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
  [fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working'
+
+ ldapsearch with ldaps will work correctly in the same environment:
+
+ # openssl s_client -connect company-ad-server.company.com:636
+ CONNECTED(00000005)
+ # ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
+ ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
+ SASL/GSSAPI authentication started
+ SASL username: superduperu...@company.com
+ SASL SSF: 0
+ filter: (sAMAccountName=superduperuser)
+ requesting: All userApplication attributes
+ <snip>
+ # Duperuser\2C Super ADM, Users, Admin, company.com
+ dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
+ <snip>
+
+ A workaround is to simply try again, since this is a race condition, and
+ you might beat the watchdog on subsequent retries. Otherwise, disable
+ ad_use_ldaps until a fix is available.
+
+ [Testcase]
+
+ You will need a Windows Server 2019 machine with Active Directory
+ installed and configured, and some users created in Active Directory.
+
+ On the Ubuntu client, join the AD domain using realm. You will need to
+ import the AD certificate too.
+
+ When importing the TLS certificate, you can add it to /etc/ssl/ca-certificates, and edit /etc/ldap/ldap.conf and set:
+ TLS_CACERT /etc/ssl/certs/ca-certificates.crt
+
+ Edit /etc/sssd/sssd.conf and ensure that ldap_tls_cacert is set
+ correctly to "ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt", and
+ enable "ad_use_ldaps = True".
+
+ Then restart sssd with:
+
+ $ sudo systemctl restart sssd.service
+
+ If you have a slow server or busy network, the watchdog can kill the
+ call to ldap_install_tls() before it completes, and sssd will fail to
+ start. You may need several attempts to reproduce. Just keep restarting
+ sssd.service.
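The "keep restarting" step in the test case can be scripted. The following is a minimal sketch and is not part of the bug report: the log path, the settle delay, the REPRO_ATTEMPTS switch and the helper names are all assumptions. It presumes debug_level is set as described above so that the failure is logged, and reading the sssd log typically requires root.

```shell
#!/bin/sh
# Sketch: restart sssd until the ldap_install_tls failure appears in the log.
# LOG_FILE, SETTLE_SECONDS and REPRO_ATTEMPTS are assumptions, not taken from
# the bug report; adjust LOG_FILE to your domain's backend log.

LOG_FILE="${LOG_FILE:-/var/log/sssd/sssd_company.com.log}"
SETTLE_SECONDS="${SETTLE_SECONDS:-10}"

# Succeeds if the text on stdin contains the failure message quoted in the report.
has_tls_failure() {
    grep -q 'ldap_install_tls failed'
}

# Prints the lines of file $1 that come after line number $2.
lines_after() {
    tail -n "+$(( $2 + 1 ))" "$1"
}

# Restarts sssd up to $1 times and stops at the first logged failure.
reproduce() {
    attempts="$1"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Remember how long the log was so only new lines are inspected.
        before=$(wc -l < "$LOG_FILE")
        sudo systemctl restart sssd.service
        # Give the backend time to attempt the LDAPS connection.
        sleep "$SETTLE_SECONDS"
        if lines_after "$LOG_FILE" "$before" | has_tls_failure; then
            echo "attempt $i: ldap_install_tls failure reproduced"
            return 0
        fi
        echo "attempt $i: no failure logged"
        i=$((i + 1))
    done
    return 1
}

# Only touch the system when explicitly asked to.
if [ -n "${REPRO_ATTEMPTS:-}" ]; then
    reproduce "$REPRO_ATTEMPTS"
fi
```

Saved as, say, repro.sh (a hypothetical name), it could be run with `REPRO_ATTEMPTS=20 sh repro.sh`. Without REPRO_ATTEMPTS set, the script only defines the helpers and changes nothing.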
+
+ [Where problems could occur]
+
+ The changes only affect users who enable ad_use_ldaps, and only those
+ who use TLS. Those using GSS-SPNEGO with ad_use_ldaps are not
+ affected, nor are those not using ad_use_ldaps.
+
+ The patch checks for failure of the TLS handshake with the AD server,
+ and adds a retry if the failure was caused by the watchdog killing the
+ call to ldap_install_tls(). This happens very early in sssd service
+ startup, so if a regression were to occur, a system administrator would
+ notice almost immediately and could downgrade the package.
+
+ If a regression were to occur, a workaround is to 1) change from TLS to
+ GSS-SPNEGO, or 2) disable ad_use_ldaps.
+
+ [Other info]
+
+ This is reported upstream in:
+
+ https://github.com/SSSD/sssd/issues/5531
+
+ The commit which fixes the issue is:
+
+ commit da55e3e69707de416b7949d08c165c950090bbb6
+ From: Iker Pedrosa <ipedr...@redhat.com>
+ Date: Wed, 3 Mar 2021 15:34:49 +0100
+ Subject: ldap: retry ldap_install_tls() when watchdog interruption
+ Link: https://github.com/SSSD/sssd/commit/da55e3e69707de416b7949d08c165c950090bbb6
+
+ This landed in sssd 2.5.0, so Bionic, Focal, Hirsute and Impish all
+ require fixing. The commit is a cherry-pick to Focal, Hirsute and
+ Impish, while Bionic requires a backport with minor context adjustments.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1921494

Title:
  ldap_install_tls occasionally fails due to watchdog timeout when using
  ad_use_ldaps with tls

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs