On Wed, Oct 30, 2013 at 12:18:44PM +0200, Sami K wrote:
> Hello,
>
> We have been lately having big problems with sssd caching. On our ssh
> servers, (each with ~100-200 users) login may take several minutes as the
> sssd_be -process uses 100% cpu time and sssd_be -process may be in this
> state for days. Clearing the cache and restarting sssd during the day
> usually helps and then everything works for a few days, sometimes only hours.
> It is not clear what triggers this behaviour, maybe some combination
> of lots of users and cache update at the same time.
>
> The culprit seems to have been the addition of a few big groups lately to ldap
> for our access policy, worsening the situation and sssd performance.
>
> On a test server, a simple id command with an empty cache and the same settings
> as in production takes:
> [root@testsk tmp]# time id testusr
> uid=1143(testusr) gid=100(users)
> groups=100(users),3318(roam),3102(nixe),1000(staff1),3785(wl-staff1),3119(system),3402(fileaccess),3377(vpn1),120(grp2),3123(devel),1001(devel3),3378(vpn2),3266(usr),3386(access3)
>
> real 0m28.689s
> user 0m0.006s
> sys 0m0.007s
>
> We have currently several groups with around 17 000 and 3000 users so this
> id query creates over 100k ghost users to cache:
>
> [root@testsk tmp]# ldbsearch -H /var/lib/sss/db/cache_TESTAUTH.ldb |grep ghost |wc -l
> asq: Unable to register control with rootdse!
> 105196
>
> Indeed, with full debug (time of id-command is then over 1 minute) all I
> see in the logs is the ldap backend mostly adding ghost users to cache as it adds
> information from _all_ groups related to that uid. As the backend is not
> responding to monitor pings fast enough, the monitor tries to kill it and
> restart. Same happens also in production servers. I have already extended
> timeout to 60 but it seems not to be enough.
>
> This latter case seems to be relevant especially when we started to receive
> complaints from some people that httpd authentication was not working.
> Apache error log shows:
> [Tue Oct 29 12:21:36 2013] [error] [client xxx.xx.xx.xx] GROUP: testuser
> not in required group(s).
> when in fact user is in the required group but it seems that sssd just
> fails to respond fast enough. This is (PAM, AuthType Basic, Require group
> testgroup) kind of authentication.
>
> This is on RHEL6.4, sssd-1.9.2-82.10.el6_4.x86_64. Configured services
> nss, ldap:
> sanitized config:
> ------------------------
> [sssd]
> config_file_version = 2
> debug_level = 1
> reconnection_retries = 3
> timeout = 60
> services = nss
> domains = TESTAUTH
> [nss]
> filter_groups = root
> filter_users = root
> reconnection_retries = 3
> debug_level = 1
> [domain/TESTAUTH]
> debug_level = 1
> ldap_purge_cache_timeout = 3600
> id_provider = ldap
> auth_provider = ldap
> ldap_uri = ldap://authserv.test
> ldap_search_base = dc=test
> ldap_user_search_base = ou=People,dc=test
> ldap_group_search_base = ou=Group,dc=test
>
> So in the end, any ideas or suggestions how to improve the situation? Of
> course I'm willing to debug/test this more if needed as the current
> situation is almost disastrous.
>
> Cheers,
> - Sami
>

Hi Sami,

I'm sorry you are having problems with SSSD.

In 6.5, we added a new "ignore_group_members" option that makes all groups
appear as empty. Setting this option to "true" should give a large
performance gain at the expense of not seeing the group members. But if
your environment relies on group membership mostly for access control,
that should be fine.

> ps. Quick test on a Fedora 19 and sssd-1.11.1-4.fc19 made the same queries
> in 7 seconds or less so apparently some progress in performance has been
> done. Any idea when would RHEL6 sssd be rebased?

Not in RHEL-6.5 :-) Currently it's not clear if RHEL6 will rebase. (And
details about future RHEL updates are not usually disclosed on public
mailing lists.)

> I tried to compile latest
> git-version on RHEL6 but I couldn't find all required components (for ex.
> configure: error: you must have the cifsidmap header installed to build the
> idmap plugin).

Passing --disable-cifs-idmap-plugin to configure should get rid of this
requirement.
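
To make the ignore_group_members suggestion concrete, here is a minimal sketch of the domain section from the posted config with the option added. It assumes the installed sssd package is the 6.5 build (or later) that has the option; the sssd-1.9.2-82.10.el6_4 package mentioned in the report predates it, so this fragment would not apply there. All other lines are copied from the sanitized config in the report.

```ini
[domain/TESTAUTH]
debug_level = 1
ldap_purge_cache_timeout = 3600
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://authserv.test
ldap_search_base = dc=test
ldap_user_search_base = ou=People,dc=test
ldap_group_search_base = ou=Group,dc=test
# Assumption: requires the sssd build shipped with RHEL 6.5 or newer.
# Groups are returned without their member lists, so "getent group <name>"
# would show the group as if it were empty.
ignore_group_members = true
```

sssd needs a restart to pick up the change. Whether entries already in the cache must be cleared first is not covered in this thread, so treat that as something to check on a test server before rolling it out.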