On 05/23/2011 03:59 PM, Sigbjorn Lie wrote:
On 05/23/2011 11:37 PM, Rich Megginson wrote:
On 05/22/2011 04:16 AM, Sigbjorn Lie wrote:
On 05/17/2011 07:24 PM, Rich Megginson wrote:
On 05/17/2011 06:40 AM, Sigbjorn Lie wrote:
On 05/16/2011 04:56 PM, Rich Megginson wrote:
On 05/16/2011 08:43 AM, Sigbjorn Lie wrote:
On 05/16/2011 03:52 PM, Simo Sorce wrote:
On Sat, 2011-05-14 at 16:46 +0200, Sigbjorn Lie wrote:
I've noticed that if the machine running IPA is very busy at startup,
the IPA services will not be online when the machine is started.

I noticed this is as my test virtualization host has had it's power cord
knocked out a few times. When I restart the host machine, all the
virtual machines is started at the same time, causing (a lot) higher
than normal latency for each virtual machine.

This causes the IPA daemons to start, while during the startup one or several IPA daemons fails due to dependencies of other daemons which is not started yet, and all the IPA daemons is stopped as not all the IPA daemons started successfully. I've noticed that the default behavior of the ipactl command is to shut down all the IPA daemons, if any of the
IPA daemons should fail during startup.

This can be seen in the logs of the individual services, as some is started successfully, just to receive a shutdown signal shortly after. It seem to be the pki-ca which shut down my IPA services this morning.

When rebooting the virtual machine running the IPA daemons during normal
load of the host machine, all the IPA daemons start successfully.
Logging on to the IPA server and manually starting the IPA daemons after
the load of the host machine has decreased also works.

I suggest changing the startup scripts to allow (a lot) longer startup
times for the IPA daemons prior to failing them.
At the moment we just run service<name> start and wait until it is done. If the pki-cad service timeouts and returns an error I think we need to open a bug against the dogtag component as that is the cause.

Can you open a bug in the freeipa trac with logs showing that service is
responsible for the failure ?

I haven't been able to figure out which service that failed IPA yet. A lot of log files scattered around. As you can see from the slapd errors file, the slapd daemon was available for almost 3 minutes before receiving the shutdown signal. I notice now that the PKI daemon failed 8 seconds after slapd had shut down, so I was wrong in blaming the PKI daemon.

See below for a list of log files I've been trough. They all have on thing in common, the daemons starts when the host machine is started, at approx 06:34, then receives a shutdown signal around 06:37. Some time later when the host has calmed down, I'm logging in and manually starting IPA using "ipactl start", and all the daemons start without any problem. And they keep running after my manual intervention.

I wish I could be more specific, but I'm unsure where else to look. Suggestions?


/var/log/krb5kdc.log
/var/log/pki-ca/catalina.out
/var/log/dirsrv/slapd-IX-TEST-COM/errors
/var/log/dirsrv/slapd-PKI-IPA/errors
/var/log/httpd/error_log
/var/log/messages (named log)

slapd errors:

[14/May/2011:06:33:52 +0200] - 389-Directory/1.2.8.rc1 B2011.062.1416 starting up [14/May/2011:06:33:54 +0200] - Detected Disorderly Shutdown last time Directory Server was running, recovering database.
1) Disorderly Shutdown means a) crash b) kill -9 or similar - neither of which should be happening - is this the replica install or the first master install?



First master install.

What is in the slapd errors log before [14/May/2011:06:33:52 +0200] - 389-Directory/1.2.8.rc1 B2011.062.1416 starting up?


Hi,

Rich, there is nothing above that line. Previous entry was from last time the server started.
So it shows the server startup lines, but not server shutdown lines, then server startup again? How much time passed between the server startups?

Those log entries was after the power cord had been pulled by accident, which is why there was no shut down events. :)
Hmm - I wonder if the directory server database recovery took too long, and the startup timed out thinking that it could not be started?









Yesterday I rebooted my host platform, graceful shutdown this time, and the same problem occurred again when the host, and all the virtual machines started. I had a look in my boot.log file, see below for output. As you can see the "Starting pki-ca" return an "OK", but the next line says: "Failed to start CA Service"
"Shutting down".

Looking at the timestamps, it looks like the dirsrv instance is shut down before the pki-ca is given a chance to start, or am I looking at the incorrect log files?

I have included my boot.log, and the PKI-CA dirsrv log, and the pki-ca debug log.
Yeah, weird. Something shuts down the directory server at 18:50:00 but pki-ca is still attempting to use it at 18:50:16




/var/log/boot.log:

Starting Directory Service
Starting dirsrv:
    IX-TEST-COM...                                       [  OK  ]
    PKI-IPA...                                             [  OK  ]
Starting KDC Service
Starting Kerberos 5 KDC:                                   [  OK  ]
Starting KPASSWD Service
Starting ipa_kpasswd:                                      [  OK  ]
Starting DNS Service
Starting named:                                            [  OK  ]
Starting HTTP Service
Starting httpd:                                            [  OK  ]
Starting CA Service
Starting pki-ca:                                           [  OK  ]
Failed to start CA Service
Shutting down
Stopping Kerberos 5 KDC:                                   [  OK  ]
Shutting down ipa_kpasswd:                                 [  OK  ]
Stopping named:                                            [  OK  ]
Stopping httpd:                                            [  OK  ]
Stopping pki-ca:                                           [FAILED]
Shutting down dirsrv:
    IX-TEST-COM...                                       [  OK  ]
    PKI-IPA...                                             [  OK  ]
Aborting ipactl




/var/log/dirsrv/slapd-PKI-IPA/errors:

[21/May/2011:18:47:41 +0200] - 389-Directory/1.2.8.2 B2011.104.2252 starting up [21/May/2011:18:47:48 +0200] - slapd started. Listening on All Interfaces port 7389 for LDAP requests [21/May/2011:18:47:49 +0200] - Listening on All Interfaces port 7390 for LDAPS requests [21/May/2011:18:50:00 +0200] - slapd shutting down - signaling operation threads [21/May/2011:18:50:01 +0200] - slapd shutting down - closing down internal subsystems and plugins
[21/May/2011:18:50:01 +0200] - Waiting for 4 database threads to stop
[21/May/2011:18:50:02 +0200] - All database threads now stopped
[21/May/2011:18:50:02 +0200] - slapd stopped.




/var/lib/pki-ca/logs/debug:

[21/May/2011:18:50:15][main]: ============================================ [21/May/2011:18:50:15][main]: ===== DEBUG SUBSYSTEM INITIALIZED ======= [21/May/2011:18:50:15][main]: ============================================
[21/May/2011:18:50:15][main]: CMSEngine: done init id=debug
[21/May/2011:18:50:15][main]: CMSEngine: initialized debug
[21/May/2011:18:50:15][main]: CMSEngine: initSubsystem id=log
[21/May/2011:18:50:15][main]: CMSEngine: ready to init id=log
[21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_STARTUP [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_SHUTDOWN [21/May/2011:18:50:16][main]: LogFile: log event type selected: ROLE_ASSUME [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CERT_POLICY [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CERT_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_CRL_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_OCSP_PROFILE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_AUTH [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ROLE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ACL [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_SIGNED_AUDIT [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_ENCRYPTION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_TRUSTED_PUBLIC_KEY [21/May/2011:18:50:16][main]: LogFile: log event type selected: CONFIG_DRM [21/May/2011:18:50:16][main]: LogFile: log event type selected: SELFTESTS_EXECUTION [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUDIT_LOG_DELETE [21/May/2011:18:50:16][main]: LogFile: log event type selected: LOG_PATH_CHANGE [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_ARCHIVE_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_ARCHIVE_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_EXPORT_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: PRIVATE_KEY_EXPORT_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_ASYNC [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_AGENT_LOGIN [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_RECOVERY_REQUEST_PROCESSED_ASYNC [21/May/2011:18:50:16][main]: LogFile: log event type selected: KEY_GEN_ASYMMETRIC [21/May/2011:18:50:16][main]: LogFile: log event type selected: NON_PROFILE_CERT_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: PROFILE_CERT_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_STATUS_CHANGE_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_STATUS_CHANGE_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTHZ_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTHZ_FAIL [21/May/2011:18:50:16][main]: LogFile: log event type selected: INTER_BOUNDARY [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTH_FAIL [21/May/2011:18:50:16][main]: LogFile: log event type selected: AUTH_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: CERT_PROFILE_APPROVAL [21/May/2011:18:50:16][main]: LogFile: log event type selected: PROOF_OF_POSSESSION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CRL_RETRIEVAL [21/May/2011:18:50:16][main]: LogFile: log event type selected: CRL_VALIDATION [21/May/2011:18:50:16][main]: LogFile: log event type selected: CMC_SIGNED_REQUEST_SIG_VERIFY [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: SERVER_SIDE_KEYGEN_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_SESSION_KEY_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: DIVERSIFY_KEY_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: ENCRYPT_DATA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_ADD_CA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_ADD_CA_REQUEST_PROCESSED [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: OCSP_REMOVE_CA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST_PROCESSED_SUCCESS [21/May/2011:18:50:16][main]: LogFile: log event type selected: COMPUTE_RANDOM_DATA_REQUEST_PROCESSED_FAILURE [21/May/2011:18:50:16][main]: LogFile: log event type selected: CIMC_CERT_VERIFICATION
[21/May/2011:18:50:16][main]: CMSEngine: done init id=log
[21/May/2011:18:50:16][main]: CMSEngine: initialized log
[21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=os
[21/May/2011:18:50:16][main]: CMSEngine: ready to init id=os
[21/May/2011:18:50:16][main]: CMSEngine: done init id=os
[21/May/2011:18:50:16][main]: CMSEngine: initialized os
[21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=jss
[21/May/2011:18:50:16][main]: CMSEngine: ready to init id=jss
[21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc4_40_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc2_40_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_rc4_128_md5 [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_3des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_fips_des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_fips_3des_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher fortezza [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher fortezza_rc4_128_sha [21/May/2011:18:50:16][main]: JSSSubsystem: initSSL(): setting ssl cipher rsa_null_md5
[21/May/2011:18:50:16][main]: CMSEngine: done init id=jss
[21/May/2011:18:50:16][main]: CMSEngine: initialized jss
[21/May/2011:18:50:16][main]: CMSEngine: initSubsystem id=dbs
[21/May/2011:18:50:16][main]: CMSEngine: ready to init id=dbs
[21/May/2011:18:50:16][main]: LdapBoundConnFactory: init
[21/May/2011:18:50:16][main]: LdapBoundConnFactory:doCloning true
[21/May/2011:18:50:16][main]: LdapAuthInfo: init()
[21/May/2011:18:50:16][main]: LdapAuthInfo: init begins
[21/May/2011:18:50:16][main]: LdapAuthInfo: init: prompt is Internal LDAP Database [21/May/2011:18:50:16][main]: LdapAuthInfo: init: try getting from memory cache [21/May/2011:18:50:16][main]: LdapAuthInfo: init: password not in memory [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: try to get it from password store [21/May/2011:18:50:16][main]: CMSEngine: getPasswordStore(): password store initialized before. [21/May/2011:18:50:16][main]: CMSEngine: getPasswordStore(): password store initialized. [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: about to get from passwored store: Internal LDAP Database [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: password store available [21/May/2011:18:50:16][main]: LdapAuthInfo: getPasswordFromStore: password for Internal LDAP Database not found, trying internaldb [21/May/2011:18:50:16][main]: LdapAuthInfo: password ok: store in memory cache
[21/May/2011:18:50:16][main]: LdapAuthInfo: init ends
[21/May/2011:18:50:16][main]: init: before makeConnection errorIfDown is true
[21/May/2011:18:50:16][main]: makeConnection: errorIfDown true
[21/May/2011:18:50:16][main]: CMS:Caught EBaseException
Internal Database Error encountered: Could not connect to LDAP server host ipa01.ix.test.com port 7389 Error netscape.ldap.LDAPException: failed to conn
ect to server ldap://ipa01.ix.test.com:7389 (91)
    at com.netscape.cmscore.dbs.DBSubsystem.init(DBSubsystem.java:674)
at com.netscape.cmscore.apps.CMSEngine.initSubsystem(CMSEngine.java:837) at com.netscape.cmscore.apps.CMSEngine.initSubsystems(CMSEngine.java:766)
    at com.netscape.cmscore.apps.CMSEngine.init(CMSEngine.java:302)
    at com.netscape.certsrv.apps.CMS.init(CMS.java:153)
    at com.netscape.certsrv.apps.CMS.start(CMS.java:1530)
at com.netscape.cms.servlet.base.CMSStartServlet.init(CMSStartServlet.java:85) at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1173) at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:993) at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4187) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4496) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1041) at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:964) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:516) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
[21/May/2011:18:50:16][main]: CMSEngine.shutdown()




_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users

Reply via email to