I have dealt with the symptoms you describe, though never with 730 groups.  
In my experience, looking up a user in an AD trusted domain is a 
resource-intensive process on the server.  I’d start by looking at your logs 
to see whether the lookup is failing on the server or on the client; the logs 
should be able to tell you this.  My suspicion is that the timeout is actually 
occurring on the server.

If the timeout is occurring on the server, I would start by increasing one or 
both of these values:

ldap_opt_timeout
ldap_search_timeout

If that doesn’t work, I’d take a look to see whether the 389 server is under 
high load when you are performing this operation.  The easiest way I have 
found to do this is to execute an LDAP query directly against the IPA server 
while the id lookup is running, for example:

ldapsearch -D "cn=Directory Manager" -w <pw> -s base -b "cn=config" \
    "(objectclass=*)"

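To put a number on "responsive", you can time the query while the id lookup is running.  This is only a rough sketch: elapsed_seconds is a helper name I made up, and the host name and the DM_PW variable in the commented example are placeholders, not anything from your setup.

```shell
#!/bin/sh
# Hypothetical helper: run any command, discard its output, and print
# how many whole seconds it took (regardless of its exit status).
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example (placeholder host and password variable):
# elapsed_seconds ldapsearch -x -H ldap://ipa1.example.com \
#     -D "cn=Directory Manager" -w "$DM_PW" -s base -b "cn=config" "(objectclass=*)"
```

If the base search on cn=config takes more than a second or two while the lookup is in flight, that points at the directory server being busy rather than at the client.
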
If the LDAP server is not responsive, you probably need to increase the number 
of worker threads for 389ds.  You might also want to disable referrals; check 
the man pages for this option:

ldap_referrals = false
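
For the worker threads mentioned above, the 389ds setting is the nsslapd-threadnumber attribute on cn=config.  A minimal sketch of changing it follows; threadnumber_ldif is a made-up helper name and 32 is an arbitrary example value, not a recommendation for your servers.

```shell
#!/bin/sh
# Hypothetical helper: print an LDIF that replaces nsslapd-threadnumber
# on cn=config with the value given as the first argument.
threadnumber_ldif() {
    printf 'dn: cn=config\n'
    printf 'changetype: modify\n'
    printf 'replace: nsslapd-threadnumber\n'
    printf 'nsslapd-threadnumber: %s\n' "$1"
}

# Example (32 is an arbitrary value; -W prompts for the password):
# threadnumber_ldif 32 | ldapmodify -x -D "cn=Directory Manager" -W
```

Review the generated LDIF before piping it to ldapmodify; depending on the 389ds version, the directory server may need a restart before the new value takes effect.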

Also, FWIW, if you crank up debug logging on the sssd client, you should be 
able to see the number of seconds of timeout assigned to the operation, and 
confirm that the operation is actually timing out by inspecting the logs 
themselves.  The logs can get a little verbose, but the data is there.
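
As a rough sketch of wading through those logs: debug logging is raised with debug_level in the relevant section of sssd.conf, and the logs land under /var/log/sssd.  The find_timeouts helper and the search pattern below are my own assumptions; I am not quoting the exact wording of the SSSD messages, so adjust the pattern to what you actually see.

```shell
#!/bin/sh
# Hypothetical helper: print every line mentioning a timeout from the
# *.log files in the given directory, prefixed with the file name.
# The pattern is a guess, not the literal SSSD message text.
find_timeouts() {
    grep -E -i -H 'timeout|timed out' "$1"/*.log
}

# Example:
# find_timeouts /var/log/sssd
```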

Dan



On Jan 30, 2017, at 4:00 AM, Troels Hansen <t...@casalogic.dk> wrote:

Hi there

I'm trying to debug a strange IPA timeout issue.

It's SSSD 1.14, IPA 4.4, RHEL 7.3.
2 IPA servers in AD trust.

Besides group membership lookups being a bit slow for users with a moderate 
number of groups, there are some users with a HUGE number of nested groups.

A server just installed, thereby having clean cache:

# time id shja
id: shja: no such user
real    0m12.107s
user    0m0.000s
sys     0m0.007s

Hmm, let's try again:

# sss_cache -E && systemctl restart sssd
# time id shja
id: shja: no such user
real    0m58.016s
user    0m0.001s
sys     0m0.005s

Hmm..

# sss_cache -E && systemctl restart sssd
# time id shja

...about 30% of the user's groups are returned....

real    5m16.840s
user    0m0.010s
sys     0m0.019s


The next lookup is pretty fast and returns all groups (about 730).

# time id shja
real    0m7.670s
user    0m0.028s
sys     0m0.066s


A few questions.
The first few times, id seems to bail out and report no such user after what 
seems to be a random amount of time.
Then, when it actually starts fetching groups, it fetches only a portion of 
them, and on the last try it fetches all groups.

It looks like IPA is starting a thread running in the background, filling the 
cache, and this continues after the failed lookup?

Shouldn't SSSD be able to use the cache from the SSSD on the IPA server?
In this example the IPA server had a full cache of the user and groups, but 
the time it took to do the lookup indicates it's still traversing the AD?

sssd.conf is pretty default:
full_name_format = %1$s

set on SSSD client.

On IPA server this is added (no full_name_format):
ignore_group_members = True
ldap_purge_cache_timeout = 0
ldap_user_principal = nosuchattr
subdomain_inherit = ldap_user_principal, ignore_group_members, 
ldap_purge_cache_timeout


--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

