I'm having an issue with sssd failing to look up user or group names from an AD 
provider. The error occurs on both modern Fedora and Centos 7 systems joined to 
AD via realm commands. On Centos 7, the version of SSSD is 1.16.0, and that is 
the version on which I am reporting.

The systems will work perfectly for a long time (up to months) and then 
suddenly start failing. The most noticeable failure is that "ls -l" of files 
will give UID/GID numbers, not names, and also ssh into the system will report 
the error "/usr/bin/id: cannot find name for group ID".

The failure can be temporarily cured with commands such as:

getent passwd username
getent group "domain users"

but after a short period of time the failure resumes. Clearing the cache via 
"sss_cache -E" also causes the problem to immediately manifest.

I ran some tests with logging enabled. NSS debug level set to 6. The test is to 
issue the command:

ls -ld dpotterv

When things are working, I see:

drwx------. 19 dpotterv domain users 29 Jun  1 10:08 dpotterv

When things are failing, I see:

drwx------. 19 900209170 900200513 29 Jun  1 10:08 dpotterv

Here are the entries from the nss log for FAILURE:

(Fri Jun  1 11:17:59 2018) [sssd[nss]] [accept_fd_handler] (0x0400): Client 
connected!
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Received 
client version [1].
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Offered 
version [1].
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [nss_getby_id] (0x0400): Input ID: 
900209170
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_send] (0x0400): CR #21: New 
request 'User by ID'
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_select_domains] (0x0400): CR 
#21: Performing a multi-domain search
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_domains] (0x0400): CR 
#21: Search will check the cache and check the data provider
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_set_domain] (0x0400): CR #21: 
Using domain [local]
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_send] (0x0400): CR 
#21: Looking up UID:900209170@local
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#21: Checking negative cache for [UID:900209170@local]
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#21: [UID:900209170@local] is not present in negative cache
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#21: Looking up [UID:900209170@local] in cache
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_idminmax_check] (0x0200): id 
exceeds min/max boundaries
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#21: ID [UID:900209170@local] was filtered out
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_locate_dom_cache_done] 
(0x0040): cache_req_search_recv returned [1432158300]: ID is outside the 
allowed range
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_process_result] (0x0400): CR 
#21: Finished: Error 1432158300: ID is outside the allowed range
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [client_recv] (0x0200): Client 
disconnected!
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [accept_fd_handler] (0x0400): Client 
connected!
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Received 
client version [1].
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Offered 
version [1].
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [nss_getby_id] (0x0400): Input ID: 
900200513
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_send] (0x0400): CR #22: New 
request 'Group by ID'
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_select_domains] (0x0400): CR 
#22: Performing a multi-domain search
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_domains] (0x0400): CR 
#22: Search will check the cache and check the data provider
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_set_domain] (0x0400): CR #22: 
Using domain [local]
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_send] (0x0400): CR 
#22: Looking up GID:900200513@local
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#22: Checking negative cache for [GID:900200513@local]
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#22: [GID:900200513@local] is not present in negative cache
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#22: Looking up [GID:900200513@local] in cache
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_idminmax_check] (0x0200): id 
exceeds min/max boundaries
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#22: ID [GID:900200513@local] was filtered out
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_locate_dom_cache_done] 
(0x0040): cache_req_search_recv returned [1432158300]: ID is outside the 
allowed range
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [cache_req_process_result] (0x0400): CR 
#22: Finished: Error 1432158300: ID is outside the allowed range
(Fri Jun  1 11:17:59 2018) [sssd[nss]] [client_recv] (0x0200): Client 
disconnected!

I should note that in the sssd config file, I have:

[domain/local]
min_id = 3000
max_id = 199999
id_provider = local
access_provider = permit
remove_homedir = false

[domain/anl]
min_id = 200000
ldap_idmap_range_size = 100000000
cache_credentials = false
id_provider = ad
access_provider = ad
auth_provider = ad
chpass_provider = ad
ldap_schema = ad
ldap_id_mapping = true

(Yes, we have a very large range size. Currently, about 0.75 million SIDs in 
the AD forest, and someone wanted room for growth.)

Looking at the log entries, I see what looks like normal flow of events. A 
multi-domain search is initiated, and the first domain searched is the local 
one.
It correctly determines that the ID is outside the range that is valid for this 
domain, and it returns.

What is not happening is a subsequent search of the ANL domain where the ID is 
valid!

The following is the NSS log when I do "getent passwd dpotterv", which succeeds:

(Fri Jun  1 11:34:14 2018) [sssd[nss]] [accept_fd_handler] (0x0400): Client 
connected!
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Received 
client version [1].
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_cmd_get_version] (0x0200): Offered 
version [1].
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [nss_getby_name] (0x0400): Input name: 
dpotterv
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_send] (0x0400): CR #27: New 
request 'User by name'
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_process_input] (0x0400): CR 
#27: Parsing input name [dpotterv]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): 
name 'dpotterv' matched without domain, user is dpotterv
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_set_name] (0x0400): CR #27: 
Setting name [dpotterv]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_select_domains] (0x0400): CR 
#27: Performing a multi-domain search
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_domains] (0x0400): CR 
#27: Search will check the cache and check the data provider
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_set_domain] (0x0400): CR #27: 
Using domain [local]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_prepare_domain_data] 
(0x0400): CR #27: Preparing input data for domain [local] rules
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_send] (0x0400): CR 
#27: Looking up dpotterv@local
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#27: Checking negative cache for [dpotterv@local]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#27: [dpotterv@local] is not present in negative cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Looking up [dpotterv@local] in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Object [dpotterv@local] was not found in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_dp] (0x0400): CR #27: 
Looking up [dpotterv@local] in data provider
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Looking up [dpotterv@local] in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Object [dpotterv@local] was not found in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache_add_to_domain] 
(0x0400): CR #27: Adding [dpotterv@local] to negative cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_ncache_set_str] (0x0400): Adding 
[NCE/USER/local/dpotterv@local] to negative cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_set_domain] (0x0400): CR #27: 
Using domain [anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_prepare_domain_data] 
(0x0400): CR #27: Preparing input data for domain [anl] rules
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_send] (0x0400): CR 
#27: Looking up dpotterv@anl
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#27: Checking negative cache for [dpotterv@anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache] (0x0400): CR 
#27: [dpotterv@anl] is not present in negative cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Looking up [dpotterv@anl] in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_send] (0x0400): CR 
#27: Object found, but needs to be refreshed.
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_dp] (0x0400): CR #27: 
Looking up [dpotterv@anl] in data provider
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_dp_issue_request] (0x0400): Issuing 
request for [0x5560f40e6b50:1:dpotterv@anl@anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_dp_get_account_msg] (0x0400): 
Creating request for [anl][0x1][BE_REQ_USER][name=dpotterv@anl:-]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_dp_internal_get_send] (0x0400): 
Entering request [0x5560f40e6b50:1:dpotterv@anl@anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_cache] (0x0400): CR 
#27: Looking up [dpotterv@anl] in cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_ncache_filter] 
(0x0400): CR #27: This request type does not support filtering result by 
negative cache
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_search_done] (0x0400): CR 
#27: Returning updated object [dpotterv@anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_create_and_add_result] 
(0x0400): CR #27: Found 1 entries in domain anl
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [sss_dp_req_destructor] (0x0400): 
Deleting request: [0x5560f40e6b50:1:dpotterv@anl@anl]
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [cache_req_done] (0x0400): CR #27: 
Finished: Success
(Fri Jun  1 11:34:14 2018) [sssd[nss]] [client_recv] (0x0200): Client 
disconnected!

This looks like a completely routine multi-domain search by name, first 
searching the local domain, and then searching the ANL domain, where it queries 
and receives an answer from the AD data provider.

So, is there a bug in sssd wherein it is failing to continue multi-domain 
searches by ID when an ID is out of range for local? Or is there something I'm 
doing wrong in the sssd config? I'm happy to do further debugging. What logs 
(and level) might be useful?

Our needs are fairly simple. ID's below 200000 are local, everything else is 
divided into large slices for AD domain(s), and we are joined to one large AD 
domain.

Thanks for any insight!







_______________________________________________
sssd-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/[email protected]/message/4LBPB4R3WJC6RFSWEGPMD7RVYU6H7TAM/

Reply via email to