Ian Pilcher via FreeIPA-users wrote:
> SHORT VERSION:
> 
> I run IPA (4.8) on a low powered CentOS 7 system, and the thundering
> herd of dogtag-ipa-renew-agent-submit processes that certmonger
> spawns at startup appears to be causing issues.
> 
> I'm looking for some way to limit the number of concurrent requests
> that certmonger spawns at startup.
> 
> 
> LONG VERSION:
> 
> I just updated my CentOS 7 IPA server to 4.6.8-5.el7, and I noticed
> that getcert was showing some (but not all) of my certificates as
> CA_UNREACHABLE.  (I noticed this when checking the system after the
> upgrade, but I don't actually know if the two are related.)
> 
>  # getcert list | grep status
>         status: MONITORING
>         status: MONITORING
>         status: CA_UNREACHABLE
>         status: CA_UNREACHABLE
>         status: CA_UNREACHABLE
>         status: MONITORING
>         status: CA_UNREACHABLE
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
>         status: MONITORING
> 
> There's no obvious difference between the "unreachable" certificates and
> the others.  For example:
> 
>  Request ID '20181001154020':
>          status: CA_UNREACHABLE
>          ca-error: Internal error
>          stuck: no
>          key pair storage:
> type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='subsystemCert
> cert-pki-ca',token='NSS Certificate DB',pin set
>          certificate:
> type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='subsystemCert
> cert-pki-ca',token='NSS Certificate DB'
>          CA: dogtag-ipa-ca-renew-agent
>          issuer: CN=Certificate Authority,O=PENURIO.US
>          subject: CN=CA Subsystem,O=PENURIO.US
>          expires: 2021-04-04 15:55:04 UTC
>          key usage:
> digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment
>          eku: id-kp-serverAuth,id-kp-clientAuth
>          pre-save command: /usr/libexec/ipa/certmonger/stop_pkicad
>          post-save command: /usr/libexec/ipa/certmonger/renew_ca_cert
> "subsystemCert cert-pki-ca"
>          track: yes
>          auto-renew: yes
>  Request ID '20181001154023':
>          status: MONITORING
>          stuck: no
>          key pair storage:
> type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='caSigningCert
> cert-pki-ca',token='NSS Certificate DB',pin set
>          certificate:
> type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='caSigningCert
> cert-pki-ca',token='NSS Certificate DB'
>          CA: dogtag-ipa-ca-renew-agent
>          issuer: CN=Certificate Authority,O=PENURIO.US
>          subject: CN=Certificate Authority,O=PENURIO.US
>          expires: 2033-07-22 21:47:43 UTC
>          key usage: digitalSignature,nonRepudiation,keyCertSign,cRLSign
>          pre-save command: /usr/libexec/ipa/certmonger/stop_pkicad
>          post-save command: /usr/libexec/ipa/certmonger/renew_ca_cert
> "caSigningCert cert-pki-ca"
>          track: yes
>          auto-renew: yes
> 
> I do see a bunch of dogtag-ipa-ca-renew-agent-submit errors in the log:
> 
>  Mar 26 10:07:56 asterisk.penurio.us
> dogtag-ipa-ca-renew-agent-submit[2575]: Traceback (most recent call last):
>      File "/usr/libexec/certmonger/dogtag-ipa-ca-renew-agent-submit",
> line 533, in <module>
>        sys.exit(main())
>      File "/usr/libexec/certmonger/dogtag-ipa-ca-renew-agent-submit",
> line 507, in main
>        kinit_keytab(principal, paths.KRB5_KEYTAB, ccache_filename)
>      File "/usr/lib/python2.7/site-packages/ipalib/install/kinit.py",
> line 47, in kinit_keytab
>        cred = gssapi.Credentials(name=name, store=store, usage='initiate')
>      File "/usr/lib64/python2.7/site-packages/gssapi/creds.py", line 64,
> in __new__
>        store=store)
>      File "/usr/lib64/python2.7/site-packages/gssapi/creds.py", line
> 148, in acquire
>        usage)
>      File "ext_cred_store.pyx", line 182, in
> gssapi.raw.ext_cred_store.acquire_cred_from
> (gssapi/raw/ext_cred_store.c:1732)
>    GSSError: Major (851968): Unspecified GSS failure.  Minor code may
> provide more information, Minor (2529639068): Cannot contact any KDC for
> realm 'PENURIO.US'
> 
> The KDC (and everything else) appear to be running:
> 
>  # ipactl status
>  Directory Service: RUNNING
>  krb5kdc Service: RUNNING
>  kadmin Service: RUNNING
>  named Service: RUNNING
>  httpd Service: RUNNING
>  ipa-custodia Service: RUNNING
>  pki-tomcatd Service: RUNNING
>  ipa-otpd Service: RUNNING
>  ipa-dnskeysyncd Service: RUNNING
>  ipa: INFO: The ipactl command was successful
> 
> I've been able to "fix" the certificate requests by running
> ipa-getcert resubmit on all of them.  In doing so, I noticed (via top)
> that it seems to take several minutes for each request to complete,
> during which time CPU utilization is *very* high.  (I honestly can't
> imagine what certmonger, dogtag, etc. are doing that requires so much
> CPU time to renew a certificate.)
> 
> This leads me to believe that the root cause of my issue is the
> "thundering herd" of dogtag-ipa-renew-agent-submit processes that
> certmonger spawns at startup.  It starts at least 30 instances of
> dogtag-ipa-renew-agent-submit (almost twice the number of requests
> shown by getcert list).
> 
> My server is a dual-core Atom N2800, which runs various network services
> in my home network.  It's relatively low-powered, but it isn't a
> Raspberry Pi.
> 
> In researching this, I came upon this (which is about a Raspberry Pi):
> 
> 
> https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahosted.org/thread/KS2QPAGQYNJ6GO3VZNIGI5G6YQUBGHYH/
> 
> 
> One comment in that thread struck me:
> 
>  On startup certmonger examines all the certs to see if, for example, the
>  roots have changed. There are all the processes because there is one per
>  tracked cert I assume. There is serialization in the IPA certmonger
>  config (ipa-server-guard) so they go one at a time.

That serialization only comes into play while specific certs are being
renewed.

> 
> This certainly doesn't appear to be happening on my system.  I see ~30
> dogtag-ipa-renew-agent-submit, all consuming as much CPU as they can
> get (which admittedly isn't very much).
> 
It well could be, as there are multiple dogtag-ipa-ca-renew-agent-submit
CAs configured (three, in fact). At startup the CA helpers contact the
CA to determine its capabilities. This is done in parallel.
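
If you want to see how many helpers are actually running at startup, and
which ones, something like the following rough sketch should do. The
function name count_helpers is just mine for illustration, and it assumes
the helpers live under /usr/libexec/certmonger as in your traceback:

```shell
# Count running certmonger helper processes, grouped by helper path.
# Expects "ps -eo args=" style lines on stdin.
# The [r] in the pattern keeps this grep from matching its own
# command line when it shows up in the ps output.
count_helpers() {
    grep -o '/usr/libexec/certmonge[r]/[^ ]*' | sort | uniq -c
}
```

Run it as "ps -eo args= | count_helpers" shortly after certmonger starts.
"getcert list-cas" will show the CAs certmonger has configured.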

To simulate what it's doing, you can run this against a helper, in this
case the IPA helper:

for name in IDENTIFY FETCH-ROOTS GET-SUPPORTED-TEMPLATES \
    GET-DEFAULT-TEMPLATE GET-NEW-REQUEST-REQUIREMENTS \
    GET-RENEW-REQUEST-REQUIREMENTS; \
do \
    echo "$name"; \
    export CERTMONGER_OPERATION="$name"; \
    /usr/libexec/certmonger/ipa-submit; \
    echo "----"; \
done

Each is a fork. Given your OS version, I'm assuming you're missing an
upstream patch which alleviates most of this.

Each fork was closing all possible file descriptors. In 0.79.9 this was
changed to use /proc/self/fd to determine the open descriptors, so only a
few are closed (see https://bugzilla.redhat.com/show_bug.cgi?id=1656519).

So, particularly if your max fd limit is high, this is going to cause a
lot of churn.
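
To get a feel for the gap between "all possible descriptors" and the
ones that are really open, here is a minimal sketch. It assumes Linux
/proc and looks at your current shell rather than at certmonger itself,
so treat the numbers as illustrative only:

```shell
# Soft limit on open descriptors: roughly the upper bound a
# "close every possible fd" loop has to walk after each fork.
max_fds=$(ulimit -n)

# Descriptors this shell actually has open, per /proc (Linux only).
open_fds=$(ls "/proc/$$/fd" | wc -l)

echo "fd limit: $max_fds  actually open: $open_fds"

# Installed certmonger version, if this is an RPM-based system.
if command -v rpm >/dev/null 2>&1; then
    rpm -q certmonger || true
fi
```

The certmonger daemon's limit may differ from your shell's; the
"Max open files" row in /proc/<pid>/limits for the certmonger process
shows what it is really running with.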

This wouldn't affect the CA_UNREACHABLE status unless the system was
just so overloaded by certmonger that the CA was slow to start up. The
ca-error for the CA_UNREACHABLE status is the string certmonger got back
from the CA, so something was up with the CA at the time. Probably
related to the startup load.

certmonger circles back to certs so it probably would have corrected
itself over time.
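
If you'd rather not wait, the affected requests can be picked out and
resubmitted one at a time instead of all at once. This is only a sketch:
the function names are made up for illustration, it parses the
"getcert list" output format shown in your mail, and it assumes your
getcert supports -w (wait for the request to finish):

```shell
# Print the request IDs whose status is CA_UNREACHABLE.
# Expects "getcert list" output on stdin.
unreachable_ids() {
    awk -F"'" '
        /Request ID/             { id = $2 }
        /status: CA_UNREACHABLE/ { print id }
    '
}

# Resubmit the affected requests one after another.
# Assumption: this getcert supports -w (wait for completion).
resubmit_unreachable() {
    getcert list | unreachable_ids | while read -r id; do
        echo "resubmitting $id"
        getcert resubmit -i "$id" -w </dev/null
    done
}
```

Calling resubmit_unreachable as root then works through the stuck
requests serially, which should keep the load on a small box manageable.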