On Fri, Oct 15, 2021 at 3:54 PM Aitor Pazos <[email protected]> wrote:
> Hi all, > Let me introduce the symptoms that triggered this investigation. > First, versions: > - OS: Ubuntu 20.04 > - SSSD: 2.2.3-3ubuntu0.7 > - Platform: x86_64 > > Some weeks after starting using a new region/provider we noticed some > intermittent issue impacting SSH logins (ie: clients got `Connection > closed by UNKNOWN port 65535` on `ssh ...`). When looking into it, I was > able to reproduce it somewhat reliably. It happens after some time of > inactivity for that user, but if I constantly issue commands (ie: `watch > sssctl user-checks <user>` or `watch ssh <host> whoami` from a client) the > issue wont trigger. If I stopped trying out those commands and tried again > ~30 minutes later, it would fail. > > With these symptoms I had these possibilities in mind: some caching issue, > some authentication expiring, connections pools issues... > > I enabled detailed logging and compared logs from a successful request and > a failed one and I found what seems to be the source of the error from sssd > logs perspective, but still can't make much sense of it. > Log extracted from sssd_<domain>.log: > ``` > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [ipa_hbac_rule_info_next] > (0x0400): Sending request for next search base: [cn=hbac,dc=ipa,dc=corp,<dc > <domain>>][2][(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)))] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_print_server] (0x2000): > Searching <ip10.1.10.102:389 > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x0400): calling ldap_search_ext with > [(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)][cn=hbac,dc=ipa,dc=corp,<dc > <domain>>]. > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [objectclass] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [cn] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [ipauniqueid] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [ipaenabledflag] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [accessRuleType] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [memberUser] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [userCategory] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [memberService] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [serviceCategory] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [sourceHost] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [sourceHostCategory] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [externalHost] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [memberHost] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x1000): Requesting attrs: [hostCategory] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] > (0x2000): ldap_search_ext called, msgid = 18 > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_op_add] (0x2000): New > operation 18 timeout 60 > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000): > Trace: sh[0x55d349f96d00], connected[1], ops[0x55d349ff16e0], > ldap[0x55d34a10cba0] > (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000): > Trace: end of ldap_result list > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_timeout] (0x1000): > Issuing timeout for 18 > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_destructor] (0x1000): > Abandoning operation 18 > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [generic_ext_search_handler] > (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_hbac_rule_info_done] > (0x0080): Could not retrieve HBAC rules > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_pam_access_handler_done] > (0x0020): Unable to fetch HBAC rules [110]: Connection timed out > (Wed Oct 13 19:41:52 2021) [be[<domain>]] [dp_req_done] (0x0400): DP > Request [PAM Account #21]: Request handler finished [0]: Success > ``` > > I tried using `ldap_enumeration_search_timeout = 180` which updated the > timeout in (`[sdap_op_add] (0x2000): New operation 18 timeout 60 `) but > same result (`enumerate` and `subdomain_enumerate` config options are left > to their defaults, `false` and `none` respectively). > > I looked at the IPA servers LDAP logs and saw now related requests > regarding `ipaHBACRules` so I assume the request didn't even left the > server. > Hi, Did you try to sniff network traffic (via tcpdump, for example) either on client or[/and] on server? > I tried tweaking some other caching options like `cached_auth_timeout` and > `pam_id_timeout` without any impact. Also tried clearing up the caches with > `sss_cache -E` or `systemctl stop sssd && rm /var/lib/sss/db/* && systemctl > start sssd` and nothing changed. > > This is the error returned by `sssctl user-checks` when it fails: > ``` > $ sudo sssctl user-checks <user> > user: <user> > action: acct > service: system-auth > > SSSD nss user lookup result: > - user name: <user> > - user id: <uid> > - group id: <gid> > - gecos: <name> > - home directory: /home/<user> > - shell: /bin/bash > > InfoPipe operation failed. Check that SSSD is running and the InfoPipe > responder is enabled. Make sure 'ifp' is listed in the 'services' option in > sssd.conf.InfoPipe User lookup with [<user>] failed. > testing pam_acct_mgmt > > pam_acct_mgmt: System error > > PAM Environment: > - no env - > ``` > > Any pointer here would be very useful. > > Thank you! > Aitor > _______________________________________________ > sssd-users mailing list -- [email protected] > To unsubscribe send an email to [email protected] > Fedora Code of Conduct: > https://docs.fedoraproject.org/en-US/project/code-of-conduct/ > List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines > List Archives: > https://lists.fedorahosted.org/archives/list/[email protected] > Do not reply to spam on the list, report it: > https://pagure.io/fedora-infrastructure >
_______________________________________________ sssd-users mailing list -- [email protected] To unsubscribe send an email to [email protected] Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedorahosted.org/archives/list/[email protected] Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
