On Mon, 19 Jun 2017, Rob Crittenden wrote:

Rob Foehl wrote:
On Thu, 15 Jun 2017, Rob Crittenden wrote:

Rob Foehl wrote:
Can I at least get a yes or no on whether external CA certificate
renewal has ever been tested when that certificate is nearing

Yes. I tested this with IPA v3.0. Did it break in between? Possible.

As I pointed out certmonger is unaware of the certificate chain and
focuses only on the cert not-after date and resubmits the CSR to the CA
that issued the certificate originally.

Thanks for the reply.

certmonger not knowing about the chain is understandable, as is the
resubmission of each tracked cert to the existing CA.  Doing this
results in a pile of certs that expire relatively quickly, being tied to
the old CA, but that's also not surprising -- the surprise is that it
only did that once, and has since appeared to ignore them all, even
after the CA was renewed manually and the newly-issued-but-short-lived
certs tied to the old CA expired.

Ok, I'll need to try to reproduce it. It may take me a while to get
around to this so feel free to nag me.

Consider this that, maybe... I just got around to beating my head against this some more myself. I'm still trying to convince myself that use of an external CA is viable, so I'd resurrected the test VM from May/June and this time actually managed to sort it out. More detail below.

I just duplicated last week's result using an earlier snapshot of the
same VM and a renewed CA cert with a 3-day validity.  certmonger ignored
every other cert that it already renewed once with the original CA;
whole system is hosed after the original cert expires.  It's probably
possible to recover by manually replacing every certificate, but I
haven't had time to try that.

By default certmonger checks for certificate expiration 28, 7, 3, 2 and 1
days before the not-after date, so it should have looked at the certs at
least twice, three times depending on timing (and really, it's seconds
before expiration). Did you let the system sit for 3 days before things
died? Was anything logged to syslog? Moving time forward a day at a time
is insufficient to test this without restarting certmonger.
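
To make that schedule concrete, here is a minimal Python sketch of when those checks would fall for a cert with a 3-day lifetime. The day thresholds are the ones quoted above; the function names are made up for illustration and this is not certmonger's actual code, just arithmetic on the not-after date.

```python
from datetime import datetime, timedelta

# Days before expiration at which checks are described as happening by default.
DEFAULT_TTL_DAYS = (28, 7, 3, 2, 1)

def check_times(not_after, ttl_days=DEFAULT_TTL_DAYS):
    """Return every check moment for a cert, earliest first (hypothetical helper)."""
    return sorted(not_after - timedelta(days=d) for d in ttl_days)

def remaining_checks(not_after, now, ttl_days=DEFAULT_TTL_DAYS):
    """Return only the check moments that are still in the future at `now`."""
    return [t for t in check_times(not_after, ttl_days) if t > now]

if __name__ == "__main__":
    # Illustrative dates: a cert issued with a 3-day lifetime.
    issued = datetime(2017, 6, 21, 6, 0, 0)
    expires = issued + timedelta(days=3)
    for t in remaining_checks(expires, issued):
        print(t.isoformat())
```

With these assumptions, a cert examined right at issuance has two checks left (the 2-day and 1-day marks), and a third only if the 3-day mark has not already passed, which matches the "two times, three depending on timing" estimate.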

I let the original VM snapshot run for a month straight, renewing the
IPA CA by hand after the first round of certmonger-initiated renewals
with 14 days til expiration and on the second attempt after expiration.
The first attempt used another 30-day cert, the second used a 3-day and
was allowed to run straight through.  There were no time jumps while the
VM was running, and all snapshots were taken with the VM powered off, so it
always booted with an accurate clock.

certmonger never logged anything after the first renewal cycle on either
attempt.  A 'getcert list' on the long-running VM shows all of the
tracked certificates with an expiration date of 2017-06-24, which
matches the lifetime of the renewed CA cert, but none of the services
attempting to load or use them are happy.
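
For anyone wanting to check the same thing without eyeballing the output, a rough Python sketch follows. It pulls the "expires:" field for each request out of saved 'getcert list' text. The sample below is an abridged, assumed layout: the request IDs and times of day are invented, and only the 2017-06-24 date comes from the system described here.

```python
import re
from datetime import datetime

REQUEST_RE = re.compile(r"^Request ID '([^']+)':")
EXPIRES_RE = re.compile(r"^\s*expires:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC")

def expirations(getcert_output):
    """Map request ID -> expiration (naive datetime, UTC) from 'getcert list' text."""
    result = {}
    current = None
    for line in getcert_output.splitlines():
        m = REQUEST_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = EXPIRES_RE.match(line)
        if m and current is not None:
            result[current] = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return result

def expired(getcert_output, now):
    """Return the sorted request IDs whose certificate has expired at `now`."""
    return sorted(rid for rid, exp in expirations(getcert_output).items() if exp <= now)

# Abridged, illustrative sample; request IDs and times are hypothetical.
SAMPLE = """\
Request ID '20170508063316':
    status: MONITORING
    expires: 2017-06-24 06:25:53 UTC
Request ID '20170508063402':
    status: MONITORING
    expires: 2017-06-24 06:25:53 UTC
"""

if __name__ == "__main__":
    print(expired(SAMPLE, datetime(2017, 6, 25)))
```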

It depends on why they aren't happy. Are they not happy due to expired
certs or something else?

They weren't happy due to the expired CA certs, and in some cases the leaf certificates hadn't been updated in place due to SELinux denials.

I'm still not sure why certmonger thought it'd replaced certificates when it hadn't, and I don't remember which of the last ~30 snapshots left them in this state, or I'd dig deeper :)

But httpd still refuses to start with that NSSDB, and this appears to be

# certutil -L -n Signing-Cert -d /etc/httpd/alias
        Version: 3 (0x2)
        Serial Number: 9 (0x9)
        Signature Algorithm: PKCS #1 SHA-256 With RSA Encryption
        Issuer: "CN=Certificate Authority,O=EXAMPLE.COM"
            Not Before: Mon May 08 06:33:16 2017
            Not After : Wed Jun 07 06:25:53 2017
        Subject: "CN=Object Signing Cert,O=EXAMPLE.COM"
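
As a quick way to confirm that a cert like this one is outside its validity window, here is a small Python sketch that parses the Not Before / Not After lines from 'certutil -L -n ...' text. The helper names are made up, and the date format string assumes an English/C locale, since the day and month names are parsed by name.

```python
import re
from datetime import datetime

# Matches dates such as "Wed Jun 07 06:25:53 2017" (English/C locale assumed).
DATE_FMT = "%a %b %d %H:%M:%S %Y"

def validity(certutil_text):
    """Return (not_before, not_after) parsed from 'certutil -L -n NAME' text."""
    fields = {}
    for label in ("Not Before", "Not After"):
        m = re.search(label + r"\s*:\s*(.+)", certutil_text)
        if m is None:
            raise ValueError("missing field: " + label)
        fields[label] = datetime.strptime(m.group(1).strip(), DATE_FMT)
    return fields["Not Before"], fields["Not After"]

def is_valid_at(certutil_text, when):
    """True if `when` falls inside the certificate's validity period."""
    not_before, not_after = validity(certutil_text)
    return not_before <= when <= not_after

# The two validity lines from the Signing-Cert output above.
SAMPLE = """\
            Not Before: Mon May 08 06:33:16 2017
            Not After : Wed Jun 07 06:25:53 2017
"""

if __name__ == "__main__":
    # Checked against the date of this thread.
    print(is_valid_at(SAMPLE, datetime(2017, 6, 19)))
```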

mod_nss shouldn't be considering the signing cert so I doubt this is

The startup failures may have been related to the WSGI modules trying to connect to other services with expired certs. Either way, that expired CA cert is what gets presented to HTTPS clients, so it's still a problem.

Does certmonger know how to replace the entire certificate chain in the
respective store(s)?

(The third certificate in there, ipaCert / CN=IPA RA, has the same dates
as the Server-Cert above.)

So it was renewed as well.

certmonger doesn't push out new chains so if that changed in between
that would do it. This is another way to test cert validation from the

# certutil -V -u V -d /etc/httpd/alias -n Server-Cert

If you want to see if updating the CA cert(s) makes any difference.

Even in a worst-case scenario, where all the certs expire, it is a
fairly straightforward process to get the services back up by going back
in time, renewing the IPA CA then restarting certmonger to renew the
service certificates.

Is it perfect? No. A search of the users forum should make that
apparent. It has been difficult to reproduce the failures because it's
difficult to simulate by moving time around. Several years ago I left
VMs running for months to try to simulate failures and it always worked
for me.

I haven't tried kicking the clock around yet...  The second attempt
booted from a month-old snapshot and immediately blew itself up;
renewing the CA cert and restarting certmonger (really, the whole VM)
didn't change anything.

If the chain changes then yeah, that'd cause problems.

I think I've stumbled onto what happened here, but I don't know how to reliably reproduce it. See below.

Note too that there is a difference between certmonger and the renewals.
certmonger renews certs but there are helpers that need to fire off to
update information within IPA as well and to distribute updated
certificates to replicas. These scripts have been updated significantly since
I wrote them and are now much more robust in terms of reliability and logging.

Consider uses of "certmonger" above to include these...  Another
wrinkle, discovered early on, was broken SELinux policy that prevented
certmonger from running any of them.  That was (apparently) fixed by a
later selinux-policy-targeted package release, but I haven't tried the
whole process from a bare install since.  The second test with the 3-day
lifetime on the IPA CA renewal should've been okay here.  I can try
again with a fresh install and relatively short IPA CA cert lifetimes,
say 4 days per renewal if that'll be sufficient to provoke this a bit

I'm still worried about the missing "phase 2" when it comes to
distributing a new external CA certificate -- the CA I have expires in 3
years, and it'd be nice to know whether I'm shooting myself in the foot
if I try signing the for-real IPA CA with it now.

The really tricky bit is distributing the updated CA chain around. I've
been away from IPA for a while but I can give you some bread crumbs. I
believe that ipa-cacert-manage can be used to update the stored CA chain
in LDAP and then running ipa-certupdate will pull the chain down, it
just needs to be run on every master and client.

Bingo. The necessity of running ipa-certupdate in this case isn't really covered anywhere in the documentation; the best description I could find is in https://www.freeipa.org/page/V4/CA_certificate_renewal, which starts with "there will be a new utility"...

Here's what it took to coerce everything back into working order:

- 'setenforce 0', followed by a shower attempting to wash away the shame

  Seriously, the lack of idempotent helper scripts is a huge problem here,
  and is the underlying cause of most of this pain. certmonger can wind up
  in a state where it thinks it's replaced certs when it hasn't; various
  services (including Dogtag and the KDC proxy) can wind up unable to
  connect to the directory service; et cetera.

  See https://bugzilla.redhat.com/show_bug.cgi?id=1475528 for the specific
  instance still affecting pki-tomcatd.

- Modify /etc/pki/pki-tomcat/ca/CS.cfg and
  /etc/pki/pki-tomcat/password.conf to use plain LDAP connections, based
  in part on information found in this post:

  This step was necessary to get pki-tomcatd to start at all, after its
  client cert had been partially mangled by the earlier renewal attempt.

- Stop certmonger, IPA, and chronyd or ntpd, as appropriate, and roll the
  clock back to a date when the originally installed certs were valid

- Really stop certmonger, violently, then remove /var/run/ipa/renewal.lock

- Start IPA services via 'ipactl start', wait for everything to come up,
  then start certmonger and wait for it to settle (which takes a while if
  it's decided to attempt renewals with the old CA)

- Run 'ipa-cacert-manage renew --external-ca' and sign the resulting CSR
  with a validity interval that overlaps the original CA cert

- Run 'ipa-cacert-manage renew --external-cert-file=/path/to/ipa-ca.pem
  --external-cert-file=/path/to/ca.pem' to import the resulting CA chain

- Stop certmonger again, clean up as above if necessary

- Run 'ipa-certupdate', possibly after 'kinit admin' to get a ticket

- Step clock forward to a day or two prior to original leaf certificate
  expiration, as imposed by the original CA lifetime and within the
  validity period of the new CA cert

- Start certmonger, wait for it to renew all the leaf certificates, and
  verify the results with 'getcert list', paying attention to the
  expiration times across the board

- Assuming this worked: stop all services again, revert the CS.cfg and
  password.conf changes, and either manually fix the clock and restart
  everything (including the time service) or just reboot
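
The date juggling in the steps above is easy to get wrong, so here is a minimal Python sketch of the two constraints involved: the new CA's validity has to overlap the old one, and the clock has to land a day or two before leaf expiration while still inside the new CA's validity. The function names and the example dates for the new CA are assumptions for illustration; nothing here talks to IPA or certmonger.

```python
from datetime import datetime, timedelta

def overlaps(old_not_before, old_not_after, new_not_before, new_not_after):
    """True if the old and new CA validity intervals share at least one instant."""
    return new_not_before <= old_not_after and old_not_before <= new_not_after

def clock_target(leaf_not_after, new_ca_not_before, new_ca_not_after, margin_days=2):
    """Pick a time `margin_days` before leaf expiry, inside the new CA's validity.

    Returns None when no such time exists, i.e. the new CA cert only becomes
    valid after the chosen point or has already expired by then.
    """
    target = leaf_not_after - timedelta(days=margin_days)
    if new_ca_not_before <= target <= new_ca_not_after:
        return target
    return None

if __name__ == "__main__":
    # Old CA dates match the expired cert shown earlier; new CA dates are made up.
    old_nb, old_na = datetime(2017, 5, 8, 6, 33, 16), datetime(2017, 6, 7, 6, 25, 53)
    new_nb, new_na = datetime(2017, 6, 1), datetime(2017, 7, 1)
    print(overlaps(old_nb, old_na, new_nb, new_na))
    # Leaf certs expire with the old CA, so its not-after is the leaf not-after.
    print(clock_target(old_na, new_nb, new_na))
```

If clock_target returns None, the new CA cert was signed with a window that doesn't reach back far enough, and the signing step would need to be redone with an earlier start date.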

Here's the catch: this worked the first time I did it, with a new CA set to expire 30 days after the last one and only stepping the clock forward enough to land in the middle of that one. I repeated the whole process (less the CS.cfg steps) with another externally signed CA cert for another 30 days, and after that pass, certmonger refused to update anything using the new CA, clinging instead to the one from the first attempt and reusing its expiration date on all renewed certs.

Why this happened isn't entirely clear, but one thing I did notice after that attempt is that the newly replaced CA wasn't the first one listed in (at least) the NSSDBs for httpd and pki-tomcatd, instead coming second in the list when examined with certutil. I generated another CSR and CA with different dates and a different offset from the second attempt, and ran through the whole process again. The result was even more bizarre: all five CAs (the original, the first renewal, and three recent) now appeared in the correct order in the NSSDBs, and certmonger happily renewed the leaf certs, pinned to the new CA expiration date.

I'm not sure what to take away from that, other than that it worked eventually, and I now have a functional IPA instance which I'd thought was a lost cause the last time I looked at it. Happy to share anything anyone wants a look at, including the NSSDBs which now look like this:

# certutil -L -d /etc/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes

OU=example.com CA,O=example.com,C=US                         CT,C,C
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
Server-Cert cert-pki-ca                                      u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
ocspSigningCert cert-pki-ca                                  u,u,u
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
subsystemCert cert-pki-ca                                    u,u,u

(Aside: is there any sane way to clean these up?)
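
In case it helps anyone compare their own databases, here is a rough Python sketch that parses a saved 'certutil -L -d ...' listing and reports nicknames that appear more than once. It only reads text; it doesn't touch the NSSDB or attempt any cleanup. The parsing assumes the nickname and trust flags are separated by at least two spaces, as in the listing above, and the function names are made up.

```python
import re
from collections import Counter

# Nickname, a run of spaces, then three comma-separated trust flag groups.
ENTRY_RE = re.compile(r"^(.*\S)\s{2,}([A-Za-z]*,[A-Za-z]*,[A-Za-z]*)\s*$")

def parse_listing(text):
    """Return [(nickname, trust_attributes), ...] from 'certutil -L -d DIR' text."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:  # the header line has no trust flags, so it is skipped
            entries.append((m.group(1), m.group(2)))
    return entries

def duplicate_nicknames(text):
    """Return {nickname: count} for nicknames listed more than once."""
    counts = Counter(nick for nick, _ in parse_listing(text))
    return {nick: n for nick, n in counts.items() if n > 1}

# The pki-tomcat listing from above.
SAMPLE = """\
Certificate Nickname                                         Trust Attributes

OU=example.com CA,O=example.com,C=US                         CT,C,C
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
Server-Cert cert-pki-ca                                      u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
ocspSigningCert cert-pki-ca                                  u,u,u
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
caSigningCert cert-pki-ca                                    CTu,Cu,Cu
subsystemCert cert-pki-ca                                    u,u,u
"""

if __name__ == "__main__":
    print(duplicate_nicknames(SAMPLE))
```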

I'll keep this image around for a while, although I don't plan on spending too much more time with it. Been enough "fun" already...

FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org
