Sumit and others,

Our level 1 server support team has identified 107 servers that dropped out of the domain in Aug. By far, that's their biggest burden with sssd -- the automatic machine account renewal.

Over the long weekend, our team ran a report that identified any pingable candidates that (according to AD) had a passwordLastSet age between 31 and 40 days. These are our interesting candidates; candidates > 40 days are not of interest to us because AD would have locked the account. We identified 13 candidates today.

From our research so far, we have determined 8 categories of such sssd "automatic machine account renewal" failure:

1. Some SE cloned a VM, then renamed the hostname, changed the IP address, and rejoined AD. Old <HOSTNAME>$ entries sit early in the /etc/krb5.keytab file, and adcli update grabs the first entry in /etc/krb5.keytab with $ at the end of it.
2. CPU spiked to 100% for 30 days.
3. Polkit service not running.
4. msDS-KeyVersionNumber in AD set to one more than the KVNO in the local /etc/krb5.keytab file, and passwordLastSet set to 30 days past the last timestamp in the local /etc/krb5.keytab file. IOW, sssd called adcli update after 30 days; adcli update updated AD, but not the local /etc/krb5.keytab file.
5. DNS firewall problems. Specifically, DNS TCP port 53 blocked, so adcli update could not find Kerberos servers (_kerberos._tcp.AMER.COMPANY.COM) or LDAP servers (_ldap._tcp.AMER.COMPANY.COM).
6. SELinux enabled; adcli not allowed to update the /etc/krb5.keytab file (from Sumit).
7. Time sync too far out for adcli update to successfully update the machine account.
8. /var filesystem: Input/Output errors.

Today the most common category by far is #4: it accounted for 9 of the 13 candidates. Category #7 accounted for another 2. So it's category #4 we want to drill down into -- if we can eliminate that, we've strongly decreased the sssd burden. Also, we think we can put proactive monitoring in place for categories #3 and #7.

Our old commercial AD integration product didn't depend on polkit/dbus. So categories #3 and #4 are new for sssd. If we can understand #4 and proactively monitor for #3, we can reduce the sssd burden to that of the former product.

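For anyone who wants to reproduce the triage, here is a rough Python sketch of the two checks described above: the 31-40 day passwordLastSet window and the category #4 signature. It is illustrative only. The function names, the tolerance parameter, and the assumption that the AD values and keytab values have already been collected and parsed elsewhere are all hypothetical, not our actual tooling.

```python
from datetime import datetime, timedelta, timezone


def password_age_days(pwd_last_set: datetime, now: datetime) -> int:
    """Whole days since AD last recorded a machine password change."""
    return (now - pwd_last_set).days


def is_interesting(age_days: int, low: int = 31, high: int = 40) -> bool:
    """True if the age falls inside the 31-40 day window used in the report."""
    return low <= age_days <= high


def looks_like_category_4(
    ad_kvno: int,
    keytab_kvno: int,
    ad_pwd_last_set: datetime,
    keytab_newest: datetime,
    tolerance_days: int = 1,  # hypothetical slack, not taken from the report
) -> bool:
    """Category #4 signature: AD is exactly one KVNO ahead of the local keytab,
    and AD's password timestamp is about 30 days after the newest keytab entry."""
    kvno_ahead_by_one = ad_kvno == keytab_kvno + 1
    gap = ad_pwd_last_set - keytab_newest
    renewed_at_30_days = abs(gap - timedelta(days=30)) <= timedelta(days=tolerance_days)
    return kvno_ahead_by_one and renewed_at_30_days


if __name__ == "__main__":
    # Made-up example values for one host.
    now = datetime(2021, 9, 7, tzinfo=timezone.utc)
    ad_pwd_last_set = datetime(2021, 8, 3, tzinfo=timezone.utc)
    keytab_newest = datetime(2021, 7, 4, tzinfo=timezone.utc)

    age = password_age_days(ad_pwd_last_set, now)
    print("age in days:", age)
    print("interesting:", is_interesting(age))
    print("category 4:", looks_like_category_4(5, 4, ad_pwd_last_set, keytab_newest))
```

The inputs are kept as plain timestamps and integers on purpose, so the same checks work no matter how the AD attributes and keytab entries are gathered.
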
Category #4 appears to occur randomly -- no rhyme or reason. Also, we have not found any repeat offenders, so it's very hard to track down. We plan to turn on sssd debug_level 7 (on that one sssd [domain/xxx] stanza only). Debug level 7 is the minimum level that gets verbose output from adcli update. We know that turning on debug level 9 on all sssd stanzas (nss, pam, ifp, [domain/xxx]) fills the /var/log filesystem to 100% in a few days.

Spike

On Tue, Sep 7, 2021 at 9:53 AM Patrick Goetz <[email protected]> wrote:

>
>
> On 9/6/21 4:49 AM, Sumit Bose wrote:
> > On Thu, Sep 02, 2021 at 10:02:54AM -0500, Patrick Goetz wrote:
> >>
> >> On 9/2/21 12:49 AM, Sumit Bose wrote:
> >>> The reason is that 'kinit -k' constructs the principal by calling
> >>> gethostname() or similar, adding the 'host/' prefix and the realm. But
> >>> by default this principal in AD is only a service principal and cannot
> >>> be used to request a TGT as kinit does. AD only allows user principals
> >>> to request a TGT and this is by default '[email protected]'. If the
> >>> userPrincipalName attribute is set, the principal given there is
> >>> allowed as well.
> >>>
> >>
> >> This raises a couple of questions. Because of AD's flat address space, we
> >> use a host naming convention in AD as a sort of low rent namespacing; so,
> >> for example, for this host the college is cns and the research group cryo,
> >> so the AD hostname is cns-cryo-ross1$
> >>
> >> However,
> >>
> >>    # hostname
> >>    rossmann.biosci.utexas.edu
> >>
> >>
> >> which is easier for the users to remember for ssh purposes. We set
> >>
> >>    ad_hostname = cns-cryo-ross1.austin.utexas.edu
> >>
> >> in /etc/sssd/sssd.conf.
> >>
> >> But I just checked, and kinit does not use ad_hostname, so I have to run
> >> it as
> >>
> >>    kinit -k -R cns-cryo-ross1$
> >>
> >> The question is, then what does use the ad_hostname key/value pair?
> >>
> >> Next, the kinit example provided by Spike was `kinit -k` -- we always run
> >> `kinit -k -R`
> >>
> >> -R renews the TGT, which is what I thought is the thing set to expire in AD
> >> that needs to be periodically renewed. What's the purpose of running
> >> `kinit -k` without the -R?
> >
> > Hi,
> >
> > there are two different things.
> >
> > First, there are the host keys in the keytab which are equivalent to a
> > user password. Those keys are renewed by 'adcli update' if they are
> > older than 30 days, similar to how you would renew your user password if
> > the AD DC tells you to do it.
> >
> > Second, with those keys you can request a Kerberos TGT
> >
> >    kinit -k 'shortname$'
> >
>
> I thought, based on the kinit man page, that the -k flag is just an
> ordinary ticket request and that you need to add the -R flag to request
> a TGT. What you're saying is it also renews the TGT?
>
> OTOH I thought `kinit -k` was updating the computer account password on
> the domain controller, but that doesn't seem to be the case, in which
> case I'm not even sure what the purpose of an ordinary (non-TGT) ticket
> is if you're not requesting automatic login to some specifically
> requested service.
>
> Also, just to make sure I'm clear on this, the "renew until" doesn't
> change because this is based on the computer account password
> expiration, and further that sssd runs `adcli update` for you
> periodically? How often, by the way?
>
> >
> > as you can do with your user password:
> >
> >    kinit user@REALM
> >    Password for user@REALM
> >
> > This TGT has a lifetime and it might have a renewal time as well:
> >
> >    # klist
> >    Ticket cache: KCM:0:69840
> >    Default principal: [email protected]
> >
> >    Valid starting       Expires              Service principal
> >    09/06/2021 09:39:28  09/06/2021 19:39:28  krbtgt/[email protected]
> >            renew until 09/07/2021 09:39:24
> >
> >
> > In the example above the TGT will expire at '09/06/2021 19:39:28' but
> > can be renewed until '09/07/2021 09:39:24'. This means that if you call
> >
> >    kinit -R
> >
> > before '09/06/2021 19:39:28' you will get a fresh TGT without entering
> > your password. The new TGT will have a new lifetime but 'renew until'
> > will stay the same. After '09/07/2021 09:39:24' 'kinit -R' will not work
> > anymore and you have to enter your password again. It does not matter
> > here if the TGT was originally requested with a keytab with 'kinit -k'
> > or with plain 'kinit' and a password.
> >
> > However, since the keytab is present in the file system calling
> >
> >    kinit -k 'shortname$'
> >
> > will always get a fresh TGT without manual intervention.
> > So in case you have a valid keytab this is even more flexible than
> > 'kinit -R'
> >
> > HTH
> >
> > bye,
> > Sumit

_______________________________________________
sssd-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedorahosted.org/archives/list/[email protected]
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
