On Fri, Oct 15, 2021 at 3:54 PM Aitor Pazos <[email protected]> wrote:

> Hi all,
> Let me introduce the symptoms that triggered this investigation.
> First, versions:
> - OS: Ubuntu 20.04
> - SSSD: 2.2.3-3ubuntu0.7
> - Platform: x86_64
>
> Some weeks after starting using a new region/provider we noticed some
> intermittent issue impacting SSH logins (ie: clients got  `Connection
> closed by UNKNOWN port 65535` on `ssh ...`). When looking into it, I was
> able to reproduce it somewhat reliably. It happens after some time of
> inactivity for that user, but if I constantly issue commands (ie: `watch
> sssctl user-checks <user>` or `watch ssh <host> whoami` from a client) the
> issue wont trigger. If I stopped trying out those commands and tried again
> ~30 minutes later, it would fail.
>
> With these symptoms I had these possibilities in mind: some caching issue,
> some authentication expiring, connections pools issues...
>
> I enabled detailed logging and compared logs from a successful request and
> a failed one and I found what seems to be the source of the error from sssd
> logs perspective, but still can't make much sense of it.
> Log extracted from sssd_<domain>.log:
> ```
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [ipa_hbac_rule_info_next]
> (0x0400): Sending request for next search base: [cn=hbac,dc=ipa,dc=corp,<dc
> <domain>>][2][(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)))]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_print_server] (0x2000):
> Searching <ip10.1.10.102:389
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x0400): calling ldap_search_ext with
> [(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)][cn=hbac,dc=ipa,dc=corp,<dc
> <domain>>].
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [objectclass]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [cn]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [ipauniqueid]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [ipaenabledflag]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [accessRuleType]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [memberUser]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [userCategory]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [memberService]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [serviceCategory]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [sourceHost]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [sourceHostCategory]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [externalHost]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [memberHost]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x1000): Requesting attrs: [hostCategory]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step]
> (0x2000): ldap_search_ext called, msgid = 18
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_op_add] (0x2000): New
> operation 18 timeout 60
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000):
> Trace: sh[0x55d349f96d00], connected[1], ops[0x55d349ff16e0],
> ldap[0x55d34a10cba0]
> (Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000):
> Trace: end of ldap_result list
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_timeout] (0x1000):
> Issuing timeout for 18
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_destructor] (0x1000):
> Abandoning operation 18
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [generic_ext_search_handler]
> (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_hbac_rule_info_done]
> (0x0080): Could not retrieve HBAC rules
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_pam_access_handler_done]
> (0x0020): Unable to fetch HBAC rules [110]: Connection timed out
> (Wed Oct 13 19:41:52 2021) [be[<domain>]] [dp_req_done] (0x0400): DP
> Request [PAM Account #21]: Request handler finished [0]: Success
> ```
>
> I tried using `ldap_enumeration_search_timeout = 180` which updated the
> timeout in (`[sdap_op_add] (0x2000): New operation 18 timeout 60 `) but
> same result (`enumerate` and `subdomain_enumerate` config options are left
> to their defaults, `false` and `none` respectively).
>
> I looked at the IPA servers LDAP logs and saw now related requests
> regarding `ipaHBACRules` so I assume the request didn't even left the
> server.
>

Hi,
Did you try to sniff network traffic (via tcpdump, for example) either on
client or[/and] on server?


> I tried tweaking some other caching options like `cached_auth_timeout` and
> `pam_id_timeout` without any impact. Also tried clearing up the caches with
> `sss_cache -E` or `systemctl stop sssd && rm /var/lib/sss/db/* && systemctl
> start sssd` and nothing changed.
>
> This is the error returned by `sssctl user-checks` when it fails:
> ```
> $ sudo sssctl user-checks <user>
> user: <user>
> action: acct
> service: system-auth
>
> SSSD nss user lookup result:
>  - user name: <user>
>  - user id: <uid>
>  - group id: <gid>
>  - gecos: <name>
>  - home directory: /home/<user>
>  - shell: /bin/bash
>
> InfoPipe operation failed. Check that SSSD is running and the InfoPipe
> responder is enabled. Make sure 'ifp' is listed in the 'services' option in
> sssd.conf.InfoPipe User lookup with [<user>] failed.
> testing pam_acct_mgmt
>
> pam_acct_mgmt: System error
>
> PAM Environment:
>  - no env -
>  ```
>
> Any pointer here would be very useful.
>
> Thank you!
>   Aitor
> _______________________________________________
> sssd-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedorahosted.org/archives/list/[email protected]
> Do not reply to spam on the list, report it:
> https://pagure.io/fedora-infrastructure
>
_______________________________________________
sssd-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedorahosted.org/archives/list/[email protected]
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure

Reply via email to