I have had to deal with the symptoms you describe, though never with 730 groups. In my experience, looking up a user in an AD trusted domain is a resource-intensive process on the server. I’d start by looking at your logs to see whether the lookup is failing on the server or on the client; the logs should be able to tell you this. My suspicion is that the timeout is actually occurring on the server.
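As a minimal sketch of getting more detail into those logs, the SSSD debug level can be raised in sssd.conf on whichever machine is being inspected. The domain name example.test below is a placeholder for your own domain section, and 9 is simply a high verbosity value chosen for illustration:

```
# /etc/sssd/sssd.conf (fragment; example.test is a placeholder domain name)
[domain/example.test]
# High verbosity for the backend that talks to the IPA/LDAP server
debug_level = 9

[nss]
# High verbosity for the NSS responder that answers "id" lookups
debug_level = 9
```

After sssd is restarted, the per-domain and responder logs are written under /var/log/sssd/.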
If the timeout is occurring on the server, I would start by increasing one or both of these values:

    ldap_opt_timeout
    ldap_search_timeout

If that doesn’t work, I’d take a look to see if the 389 server is under high load when you are performing this operation. The easiest way I have found to do this is to execute an LDAP query directly against the IPA server while you are performing an id lookup, for example:

    ldapsearch -D "cn=Directory Manager" -w <pw> -s base -b "cn=config" "(objectclass=*)"

If the LDAP server is not responsive, you probably need to increase the number of worker threads for 389ds.

Also, you might want to disable referrals; check out the man pages for this:

    ldap_referrals = false

Also, FWIW, if you crank up debug logging on the sssd client, you should be able to see the number of seconds of timeout assigned to the operation, and confirm that the operation is actually timing out by inspecting the logs themselves. The logs can get a little verbose, but the data is there.

Dan

On Jan 30, 2017, at 4:00 AM, Troels Hansen <t...@casalogic.dk> wrote:

Hi there

I'm trying to debug a strange IPA timeout issue. It's SSSD 1.14, IPA 4.4, RHEL 7.3. 2 IPA servers in AD trust.

Besides being a bit slow on group membership lookups for users with a moderate number of groups, there are some users with a HUGE amount of nested groups.

A server just installed, thereby having a clean cache:

    # time id shja
    id: shja: no such user

    real 0m12.107s
    user 0m0.000s
    sys 0m0.007s

Hmm, let's try again:

    # sss_cache -E && systemctl restart sssd
    # time id shja
    id: shja: no such user

    real 0m58.016s
    user 0m0.001s
    sys 0m0.005s

Hmm..

    # sss_cache -E && systemctl restart sssd
    # time id shja
    ...about 30% of the users Groups are returned....

    real 5m16.840s
    user 0m0.010s
    sys 0m0.019s

Next lookup is pretty fast and returns all groups (about 730).

    # time id shja

    real 0m7.670s
    user 0m0.028s
    sys 0m0.066s

A few questions.
The first few times, id seems to bail out and report "no such user" after what seems to be a random amount of time. When it actually starts fetching groups, it fetches a portion of the groups, and on the last try it fetches all groups. It looks like IPA is starting a thread running in the background, filling the cache, and this continues after the failed lookup?

Shouldn't SSSD be able to use the cache from the SSSD on the IPA server? In this example the IPA server had a full cache of the user and groups, but the time it took to do the lookup indicates it's still traversing the AD?

sssd.conf is pretty default:

    full_name_format = %1$s

set on the SSSD client. On the IPA server this is added (no full_name_format):

    ignore_group_members = True
    ldap_purge_cache_timeout = 0
    ldap_user_principal = nosuchattr
    subdomain_inherit = ldap_user_principal, ignore_group_members, ldap_purge_cache_timeout

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project