On 05/16/2011 03:52 PM, Simo Sorce wrote:
On Sat, 2011-05-14 at 16:46 +0200, Sigbjorn Lie wrote:
I've noticed that if the machine running IPA is very busy at startup,
the IPA services will not be online when the machine is started.

I noticed this because my test virtualization host has had its power cord
knocked out a few times. When I restart the host machine, all the
virtual machines are started at the same time, causing much higher
than normal latency for each virtual machine.

The IPA daemons begin to start, but during startup one or
several of them fail because daemons they depend on have not
started yet, and then all the IPA daemons are stopped because not all of
them started successfully. I've noticed that the default behavior of
the ipactl command is to shut down all the IPA daemons if any of
them fails during startup.

This can be seen in the logs of the individual services, as some are
started successfully, only to receive a shutdown signal shortly after.
It seems to have been pki-ca that shut down my IPA services this morning.

When rebooting the virtual machine running the IPA daemons during normal
load of the host machine, all the IPA daemons start successfully.
Logging on to the IPA server and manually starting the IPA daemons after
the load of the host machine has decreased also works.

I suggest changing the startup scripts to allow much longer startup
times for the IPA daemons before failing them.

At the moment we just run service <name> start and wait until it is
done. If the pki-cad service times out and returns an error, I think we
need to open a bug against the dogtag component, as that is the cause.
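
One way to picture a more patient startup is to poll a dependency until it is reachable, with a generous deadline, before starting whatever depends on it. The following is only a minimal sketch under assumptions: `wait_for_port` is a hypothetical helper, not part of ipactl or of any init script, and the timeout and interval values are arbitrary.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=2.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper for illustration only; not part of ipactl.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening,
            # not that the service behind the port is fully initialized.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("ipasrv01.ix.TEST.com", 7389)` would wait up to five minutes for the directory instance that pki-ca connects to. A listening port is a weaker check than a successful LDAP bind, so treat this as a rough readiness test.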

Can you open a bug in the freeipa trac with logs showing which service is
responsible for the failure?

I haven't been able to figure out which service caused IPA to fail yet. There are a lot of log files scattered around. As you can see from the slapd errors file, the slapd daemon was available for almost 3 minutes before receiving the shutdown signal. I notice now that the PKI daemon failed 8 seconds after slapd had shut down, so I was wrong in blaming the PKI daemon.

See below for a list of the log files I've been through. They all have one thing in common: the daemons start when the host machine is started, at approx. 06:34, then receive a shutdown signal around 06:37. Some time later, when the host has calmed down, I log in and manually start IPA using "ipactl start", and all the daemons start without any problem. They keep running after my manual intervention.

I wish I could be more specific, but I'm unsure where else to look. Suggestions?


/var/log/krb5kdc.log
/var/log/pki-ca/catalina.out
/var/log/dirsrv/slapd-IX-TEST-COM/errors
/var/log/dirsrv/slapd-PKI-IPA/errors
/var/log/httpd/error_log
/var/log/messages (named log)

slapd errors:

[14/May/2011:06:33:52 +0200] - 389-Directory/1.2.8.rc1 B2011.062.1416 starting up
[14/May/2011:06:33:54 +0200] - Detected Disorderly Shutdown last time Directory Server was running, recovering database.
[14/May/2011:06:34:39 +0200] schema-compat-plugin - warning: no entries set up under , ou=SUDOers, dc=ix,dc=TEST,dc=com
[14/May/2011:06:34:39 +0200] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=ix,dc=TEST,dc=com--no CoS Templates found, which should be added before the CoS Definition.
[14/May/2011:06:34:40 +0200] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=ix,dc=TEST,dc=com--no CoS Templates found, which should be added before the CoS Definition.
[14/May/2011:06:34:41 +0200] - slapd started. Listening on All Interfaces port 389 for LDAP requests
[14/May/2011:06:34:41 +0200] - Listening on All Interfaces port 636 for LDAPS requests
[14/May/2011:06:34:42 +0200] - Listening on /var/run/slapd-IX-TEST-COM.socket for LDAPI requests
[14/May/2011:06:37:30 +0200] - slapd shutting down - signaling operation threads
[14/May/2011:06:37:30 +0200] - slapd shutting down - closing down internal subsystems and plugins
[14/May/2011:06:37:31 +0200] - Waiting for 4 database threads to stop
[14/May/2011:06:37:32 +0200] - All database threads now stopped
[14/May/2011:06:37:32 +0200] - slapd stopped.


/var/log/pki-ca/system:
1871.main - [14/May/2011:06:37:40 CEST] [8] [3] In Ldap (bound) connection pool to host ipasrv01.ix.TEST.com port 7389, Cannot connect to LDAP server. Error: netscape.ldap.LDAPException: failed to connect to server ldap://ipasrv01.ix.TEST.com:7389 (91)
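
Since the log files are scattered, merging their timestamped lines into one timeline may help show which daemon stopped first. This is a rough sketch, not an IPA tool: `parse_events` and `merged_timeline` are hypothetical names, only the bracketed `[14/May/2011:06:37:30 ...]` stamp seen in the excerpts above is recognized, and the zone suffix is ignored on the assumption that all logs come from the same host and time zone.

```python
import re
from datetime import datetime

# Matches stamps such as [14/May/2011:06:37:30 +0200] or
# [14/May/2011:06:37:40 CEST]; the zone suffix is deliberately ignored
# (assumption: every log was written on the same host, same time zone).
STAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2})[^\]]*\]")


def parse_events(source, lines):
    """Yield (timestamp, source, line) for each line carrying a stamp."""
    for line in lines:
        match = STAMP.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
            yield when, source, line.strip()


def merged_timeline(logs):
    """Merge events from several logs (name -> lines), sorted by time."""
    events = []
    for source, lines in logs.items():
        events.extend(parse_events(source, lines))
    return sorted(events, key=lambda event: event[0])


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this message.
    logs = {
        "pki-ca": ["1871.main - [14/May/2011:06:37:40 CEST] [8] [3] In Ldap "
                   "(bound) connection pool to host ipasrv01.ix.TEST.com "
                   "port 7389, Cannot connect to LDAP server."],
        "slapd": ["[14/May/2011:06:37:30 +0200] - slapd shutting down - "
                  "signaling operation threads",
                  "[14/May/2011:06:37:32 +0200] - slapd stopped."],
    }
    for when, source, line in merged_timeline(logs):
        print(when.isoformat(), source, line)
```

Log files that use a different timestamp layout would need their own pattern; lines without a recognized stamp are skipped here.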

_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users
