Ian Levesque wrote:
On our production IPA servers, we have been running in a multi-master state
successfully for several weeks. Yesterday, while attempting to modify some
permissions and roles using the web UI, we had an odd problem where the web UI
became unresponsive. In an attempt to resolve the issue, I issued an `ipactl
restart`, and when that didn't fix the web UI, I rebooted the VM. When IPA
services came back up, the replica would try to sync and the primary would
crash. I noticed that if IPA on the replica was off, the primary server was
fine. So, after fighting with this for a few hours, I decided to remove the
replica and start the replication process again.
Replica reinstall didn't go so well:
[root@sbgrid-directory ~]# ipa-replica-manage disconnect
[root@sbgrid-directory ~]# ipa-replica-manage del
(this failed; unfortunately, I didn't record the error)
[root@sbgrid-directory ~]# ipa-replica-manage del -f
[root@sbgrid-directory-replica ~]# ipa-server-install --uninstall
[root@sbgrid-directory-replica ~]# ipa-replica-install
Starting replication, please wait until this has completed.
[sbgrid-directory.in.hwlab] reports: Update failed! Status: [-2 -
creation of replica failed: Failed to start replication
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.
When I try to start the primary (sbgrid-directory) server, I see these errors:
ns-slapd: GSSAPI Error: Unspecified GSS failure. Minor code may
provide more information (Cannot contact any KDC for requested realm)
NSMMReplicationPlugin - repl_set_mtn_referrals: could not set referrals
for replica dc=sbgrid,dc=org: 20
set_krb5_creds - Could not get initial credentials for principal
[ldap/sbgrid-directory.in.hw...@sbgrid.org] in keytab
[WRFILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
slapd_ldap_sasl_interactive_bind - Error: could not perform interactive
bind for id  mech [GSSAPI]: error -2 (Local error) (SASL(-1): generic
failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more
information (Credentials cache file '/tmp/krb5cc_496' not found))
slapi_ldap_bind - Error: could not perform interactive bind for id 
mech [GSSAPI]: error -2 (Local error)
Yikes, what a mess -- thanks for any help.
Strange. Is your 389-ds instance running? If so, can you run this query:
ldapsearch -x -b 'cn=services,cn=accounts,dc=sbgrid,dc=org'
I have the feeling that the principals for your IPA server have gone away.
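As a rough sketch of how the output of that query could be checked for a specific service principal: the helper name `has_principal` and the example principal below are hypothetical, and the sketch assumes the service entries expose their principal in the `krbprincipalname` attribute and that the line is not wrapped or base64-encoded in the LDIF output.

```shell
# Hypothetical helper: reads LDIF on stdin and succeeds only if some line is
# exactly "krbprincipalname: <principal>" (compared case-insensitively).
has_principal() {
    grep -qixF -- "krbprincipalname: $1"
}

# Possible usage against the query above (the principal is a placeholder,
# substitute the server's real ldap/ principal):
#   ldapsearch -x -b 'cn=services,cn=accounts,dc=sbgrid,dc=org' \
#     | has_principal 'ldap/ipa.example.com@EXAMPLE.COM' \
#     && echo "principal present" || echo "principal missing"
```

If the helper reports the principal as missing, that would be consistent with the suspicion that the server's service entries were removed, though it does not prove it.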
Note that when removing a replica it is often necessary to restart its
replication partners because sometimes there are old tickets cached.
I've never seen a case where principals were actually removed, though.
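Restarting the replication partners after removing a replica could be scripted along these lines. This is only a sketch: the function name `restart_partners`, the `RUN` dry-run hook, the use of `ssh` as root, and the example hostnames are all assumptions, not something IPA provides. `ipactl restart` is the same command used earlier in the thread.

```shell
# Hypothetical helper: runs "ipactl restart" on each host given as an argument.
# RUN is an assumed hook: with RUN=echo the commands are printed, not executed.
restart_partners() {
    for host in "$@"; do
        # Stop at the first host where the restart (or the ssh call) fails.
        ${RUN:-} ssh "root@$host" ipactl restart || return 1
    done
}

# Dry run first to see what would be executed (hostnames are placeholders):
#   RUN=echo restart_partners ipa1.example.com ipa2.example.com
```

Doing a dry run before the real call makes it easy to confirm that only the intended partners are touched.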
What version of IPA are you running, on what distro?
Freeipa-users mailing list