I’ve received incredibly good support from this mailing list previously; I am
hoping that somebody can help me succeed in my ongoing efforts. I have spent a
few days on this at this point and I can’t seem to figure out how to address
this issue. On my DCs I am seeing excessive ldap_search_ext and
sdap_get_generic_ext_recv timeouts created solely by the invocation of the ‘id’
command on sssd clients. This problem seems to present itself only when I
parallelize lookups for an ‘uncached’ user (i.e. I have never performed an
initial lookup). Individual arbitrary one-off lookups for a single uncached
user on a single system almost always work fine. This leads me to believe this
is a performance tuning issue.
We operate in an academic research computing unit (i.e. we have an HPC
cluster), and I need the ability to look up the same user in parallel (using the
id command) across a relatively large number of systems, for example to spawn
jobs that require large amounts of CPU cores and/or memory. Right now I am
doing about 50 parallel lookups for the same user to induce this problem.
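For reference, the kind of parallel test described above can be sketched roughly as follows. This is only a minimal illustration, not the exact tooling in use: it assumes passwordless ssh to each client, and the names remote_id_command and parallel_lookup are hypothetical helpers made up for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def remote_id_command(host, user):
    # Assumption: key-based ssh works to every client; BatchMode avoids prompts.
    return ["ssh", "-o", "BatchMode=yes", host, "id", user]


def parallel_lookup(hosts, user, build_command=remote_id_command, timeout=120):
    """Run one lookup per host at the same time.

    Returns a dict mapping host -> (return code, stderr text). The return
    code is None if the command did not finish within `timeout` seconds.
    """
    def run(host):
        try:
            proc = subprocess.run(
                build_command(host, user),
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            return host, proc.returncode, proc.stderr.strip()
        except subprocess.TimeoutExpired:
            return host, None, "timed out"

    results = {}
    # One worker per host so that all lookups are in flight together.
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        for host, returncode, stderr in pool.map(run, hosts):
            results[host] = (returncode, stderr)
    return results
```

Calling parallel_lookup with a list of 50 client host names and an uncached user name, then counting the entries whose return code is not 0, gives a rough measure of how many lookups failed in one round.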
Here is some background information:
1) I have read Jakub's “Anatomy of an SSSD Lookup” and “Performance
Tuning of SSSD for large IPA-AD deployments”, and have implemented the
recommendations from the performance tuning doc, including moving the sssd
cache to tmpfs.
2) We are on ipa-server 4.4.0-14.el7_3.4 using a trusted AD domain; all of our
consumed users and groups are in the AD trusted domain. We have two domain
controllers; each is a RHEL 7.3 VM with 6 GB of memory. Almost all (if not
all) of our clients are running at least sssd 1.14, and are all RHEL 6/7.
Neither DC is swapping, and both have 2 CPUs.
3) I have tuned SSSD clients on the DCs and all clients to include these
options (the problem persists):
a) ldap_opt_timeout = 60
b) ldap_search_timeout = 60
4) On both DCs, I can clear the SSSD cache, and look up all 2000 or so users in
my environment with 40 concurrent lookups occurring locally on each DC (using
UNIX job control). I can process all 2000 lookups in this manner without any
failures (on either DC), and have ‘pre-populated’ the SSSD cache on both DCs
by doing this.
6) I have made no additional performance tuning changes other than what has
Would anybody be able to advise on any potential tuning that would be required
(presumably on the DCs), to facilitate 50 parallel lookups without experiencing
sdap_get_generic_ext_recv or ldap_search_ext timeouts? Should I be able to
do this sort of thing with relative ease? I was hoping this would be the sort
of thing that would just work, but based on my relatively extensive testing it
doesn’t. Any advice anybody could provide would be greatly appreciated.