Hi. I'll take a stab at providing some info, but please don't take this as a definitive answer.
If you take a glance at https://sssd.io/_images/architecture.svg you'll see that SSSD is built around its cache (/var/lib/sss/db/*).

The problem of a "slow `id` of a user that is a member of a bunch of big groups" is a very prominent SSSD problem in large environments, culminating in the IPA-AD trust scenario. It is also quite long-standing: https://jhrozek.wordpress.com/2015/08/19/performance-tuning-sssd-for-large-ipa-ad-trust-deployments/
So far, 'ignore_group_members = true' is by far the best response available.

At a high level, on the client side the problem is two-fold:
(1) slow cache write operations (by "backends", the 'sssd_be' process)
(2) slow cache read operations (by "responders", 'sssd_nss' in your case)

(2) is being addressed to some extent: I currently have patches posted for review - https://github.com/SSSD/sssd/pull/7841 - that show some promise. Depending on the specifics of your setup and workflow, those patches might, or might not, provide you some relief. The typical scenario where pronounced benefits are expected is a busy server with a hot and *huge* cache that performs tons of identity operations. If you think it is worth it and could give those patches a try and then provide feedback, that would be great.

(1) is more tricky. We have profiling results that show that most of the CPU time is consumed in:
 - https://github.com/SSSD/sssd/blob/master/src/ldb_modules/memberof.c
   This is a plugin for a 3rd party library - `libldb` - that on the fly adds 'memberof: group-dn' attributes to user objects being written to the cache.
 - otherwise CPU consumption really depends on the backend being used - IPA, AD, LDAP with or without nested groups, etc. There is no single bottleneck.

Now getting to your ideas, if I understood them correctly.

What you describe is more or less what already happens when 'id_provider = ldap' is used. When one does `getent -s sss group $group` with 'ignore_group_members = false', it will return all group members.
But inspection of /var/lib/sss/db/cache_$domain.ldb will show only the group object being cached, containing all members as "ghost" and "orig_member" attributes.

With IPA it doesn't work this way, if I understand correctly: to properly support IPA views (server-side overrides), user objects need to be resolved so that the group can return overridden members properly.

Honestly, I don't remember right now how exactly it works with 'id_provider = ad'. If you are curious, you can stop SSSD, wipe the cache, start SSSD, resolve a single group (`getent group ...`), stop SSSD and inspect the cache content using the 'ldbsearch' tool.

I'm talking about `getent group` because this - `getgrgid()` - is what takes time when you call `id`. `id` first resolves the user (fast), then the list of groups the user is a member of (fast*, using tokenGroups), and then it needs to convert every GID to a group name using `getgrgid()` - this loop is what typically hammers SSSD.

*) well, tokenGroups returns a list of SIDs, and SSSD still needs to loop over and resolve every SID. This is definitely fast if 'ignore_group_members = true', but I don't remember right now what happens here otherwise; maybe all group members get resolved already at this point (in which case `id` loops over the cache already).

Hope this helps.

On Wed, Feb 26, 2025 at 7:23 PM Bob Green via sssd-devel <sssd-devel@lists.fedorahosted.org> wrote:
>
> prepping deployment of sssd in an environment with ~60,000 accounts,
> ~4500 groups, backend is AD. Some accounts are members of ~200
> groups, whose total members might exceed 35,000 members total. None
> of this is ideal, and frankly most of my issues can be attributed to
> poor historic decisions around managing identity in this decades old
> environment.
>
> With "ignore_group_members = false", if a single account ( who is a
> member of 200 groups, some of which have 35,000 members) runs "id" it
> can take minutes to complete on an uncached sssd client.
> With this configuration option set to true, the operation can complete
> in a few seconds on an uncached sssd client. This is great, however the
> accounts in this environment are fond of running getent group
> <some_group> and returning <some_group>'s member list, which is
> disabled with "ignore_group_members = true".
>
> I was wondering if I missed a configuration item that might allow both
> "quick" id <account_with_many_large_groups_as_member> operations AND
> getent group <some_group>?
>
> Assuming no configuration item to address this, Is it conceivable that
> sssd could consider foregrounding "id" type operations for accounts
> when all that is being requested is a list of group ids and group
> names for a single account, while deferring or backgrounding all of
> the group member enumeration happening on the backend when "id
> <account>" is run? If this is conceivable perhaps a pointer to where
> I might look in the code to see about this?
>
> Perhaps I'm barking up the wrong tree, and it's simpler to write a
> wrapper for getent group that caches the equivalent ldapsearch?
>
> Thank you for your consideration and development of this software.
> --
> _______________________________________________
> sssd-devel mailing list -- sssd-devel@lists.fedorahosted.org
> To unsubscribe send an email to sssd-devel-le...@lists.fedorahosted.org
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedorahosted.org/archives/list/sssd-devel@lists.fedorahosted.org
> Do not reply to spam, report it:
> https://pagure.io/fedora-infrastructure/new_issue
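
P.S. For what it's worth, the lookup sequence that `id` performs (resolve the user, get the list of GIDs, then call `getgrgid()` for every GID) can be approximated with a short Python sketch, which may help to see which groups are the slow ones on a given client. This is only an illustration and not part of SSSD: the function name `time_id_lookup` is made up, and it relies only on the standard `pwd`, `grp` and `os` modules, which go through whatever NSS sources the host is configured with (so the 'sss' source is assumed to be listed in nsswitch.conf). Timings will of course differ between a cold and a warm cache.

```python
import grp
import os
import pwd
import sys
import time


def time_id_lookup(username):
    """Roughly mimic the lookups done by `id` and time each getgrgid() call.

    Returns a list of (seconds, gid, group_name) tuples, slowest first.
    """
    # Step 1: resolve the user (normally fast).
    user = pwd.getpwnam(username)

    # Step 2: get the list of GIDs the user belongs to (initgroups-style call).
    gids = os.getgrouplist(username, user.pw_gid)

    # Step 3: convert every GID to a group name; this loop is the part
    # that is described above as hammering SSSD.
    timings = []
    for gid in gids:
        start = time.perf_counter()
        try:
            name = grp.getgrgid(gid).gr_name
        except KeyError:
            name = None  # GID that cannot be resolved to a group
        timings.append((time.perf_counter() - start, gid, name))

    return sorted(timings, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    # The user name comes from the command line; "root" is just a fallback.
    target = sys.argv[1] if len(sys.argv) > 1 else "root"
    for elapsed, gid, group in time_id_lookup(target)[:10]:
        print(f"{elapsed * 1000:8.1f} ms  gid={gid}  {group}")
```

Running it once right after wiping the cache and once again afterwards, with 'ignore_group_members' set either way, should give a rough picture of how much of the `id` time is spent in the per-group lookups.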