> On 18 Aug 2016, at 17:44, Jeff White <[email protected]> wrote:
> 
> I am running 80+ CentOS 7 systems all configured identically.  I found one of 
> my groups is not working on several of them.  sssd's log shows:
> 
> Aug 18 08:25:07 cn33 sssd[nss][3804]: More groups have the same GID [7021] in 
> directory server. SSSD will not work correctly.
> 
> sssd is configured to query Active Directory so I ran a query for 
> 'gidNumber=7021' via ldapsearch.  That returned only one result.  7021 is 
> also not in /etc/group.  I have not been able to find any useful information 
> in sssd's other logs (nor do I know what each log is, so I'm just digging 
> blindly):
> 
> # ls  -lh /var/log/sssd/
> total 75M
> -rw------- 1 root root    0 Aug 16 03:42 krb5_child.log
> -rw------- 1 root root 5.4K Aug 18 08:10 krb5_child.log-20160816
> -rw------- 1 root root 3.5K Aug 18 08:26 ldap_child.log
> -rw------- 1 root root  18K Jul 26 07:03 ldap_child.log-20160720.gz
> -rw------- 1 root root 1.5M Aug 18 08:25 ldap_child.log-20160727
> -rw------- 1 root root  19M Aug 18 08:28 sssd_ad.wsu.edu.log
> -rw------- 1 root root 497K Aug  8 03:37 sssd_ad.wsu.edu.log-20160808.gz
> -rw------- 1 root root  53M Aug 14 03:47 sssd_ad.wsu.edu.log-20160814
> -rw------- 1 root root    0 May 10 10:43 sssd.log
> -rw------- 1 root root 593K Aug 18 08:25 sssd_nss.log
> -rw------- 1 root root 3.8K Aug 12 03:14 sssd_nss.log-20160812.gz
> -rw------- 1 root root 722K Aug 14 03:49 sssd_nss.log-20160814
> -rw------- 1 root root    0 May 10 10:43 sssd_pam.log
> 
> I restarted the daemon and cleared the cache with `sss_cache -E`, but I still 
> get the same error.  Oddly, this is not affecting all of my systems.  On 
> some, the group works:
> 

sss_cache does not clear the cache. It only marks the cached entries as 
expired, so that the next lookup refreshes them from the server. The cache may 
also contain credentials, so removing it outright could be dangerous.

When this bug happens again, can you instead search the ldb cache for entries 
with duplicate IDs?

yum install ldb-tools
ldbsearch -H /var/lib/sss/db/cache_ad.wsu.edu.ldb gidNumber=7021

should do the trick.
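
If that search returns more than one record, a small filter along these lines 
could list which cached entries share the GID. This is only a sketch: dup_gids 
is a hypothetical helper name (not part of ldb-tools), and it assumes the 
ldbsearch output contains plain, unwrapped "dn: " and "gidNumber: " lines.

```shell
# Hypothetical helper: reads ldbsearch (LDIF) output on stdin and prints
# every gidNumber that occurs in more than one record, followed by the DNs.
# Assumption: "dn: " and "gidNumber: " lines are plain text, not base64
# ("dn:: ") and not wrapped onto continuation lines.
dup_gids() {
    awk '
        /^dn: /        { dn = substr($0, 5) }            # remember current record
        /^gidNumber: / { gid = $2; n[gid]++; dns[gid] = dns[gid] "\n  " dn }
        END {
            for (g in n)
                if (n[g] > 1)
                    printf "gidNumber %s appears in %d entries:%s\n", g, n[g], dns[g]
        }
    '
}

# Usage (as root, since the cache file is readable by root only):
#   ldbsearch -H /var/lib/sss/db/cache_ad.wsu.edu.ldb gidNumber=7021 dn gidNumber | dup_gids
```

No output means the cache holds at most one entry for each GID that the search 
returned.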

> $ getent group 7021
> its_p_sto_qa_hpc_kamiak-kelley:*:7021:person1,person2,whatever
> 
> The working and failing systems are running the same version of sssd:
> 
> # salt -L cn31,cn33,cn29,cn5,cn34,cn17 cmd.run 'rpm -q sssd' # broken nodes
> cn33:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn29:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn31:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn34:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn17:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn5:
>    sssd-1.13.0-40.el7_2.2.x86_64
> 
> # salt -L cn28,cn16,cn44,cn1,cn42,cn9 cmd.run 'rpm -q sssd' # working nodes
> cn16:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn28:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn42:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn1:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn44:
>    sssd-1.13.0-40.el7_2.2.x86_64
> cn9:
>    sssd-1.13.0-40.el7_2.2.x86_64
> 
> So what now?  How can I determine what sssd is unhappy about?
> 
> -- 
> Jeff White
> HPC Systems Engineer
> Information Technology Services - WSU
> _______________________________________________
> sssd-users mailing list
> [email protected]
> https://lists.fedorahosted.org/admin/lists/[email protected]
