On Tue, Oct 06, 2015 at 09:35:08AM -0400, Rob Crittenden wrote:
> Andrew E. Bruno wrote:
> > On Mon, Oct 05, 2015 at 02:48:48PM -0400, Rob Crittenden wrote:
> >> Andrew E. Bruno wrote:
> >>> On Mon, Oct 05, 2015 at 12:40:42PM +0200, Martin Kosek wrote:
> >>>> On 10/02/2015 06:00 PM, Andrew E. Bruno wrote:
> >>>>> On Fri, Oct 02, 2015 at 09:56:47AM -0400, Andrew E. Bruno wrote:
> >>>>>> What's the best way to re-initialize a replica?
> >>>>>>
> >>>>>> Suppose one of your replicas goes south... is there a command to tell
> >>>>>> that replica to re-initialize from the first master (instead of
> >>>>>> removing/re-adding the replica from the topology)?
> >>>>>
> >>>>> Found the command I was looking for:
> >>>>> ipa-replica-manage re-initialize --from xxx
> >>>>>
> >>>>> However, one of our replicas is down and I can't seem to re-initialize
> >>>>> it. Starting IPA fails (via systemctl restart ipa):
> >>>>>
> >>>>> ipactl status
> >>>>> Directory Service: RUNNING
> >>>>> krb5kdc Service: STOPPED
> >>>>> kadmin Service: STOPPED
> >>>>> named Service: STOPPED
> >>>>> ipa_memcached Service: STOPPED
> >>>>> httpd Service: STOPPED
> >>>>> pki-tomcatd Service: STOPPED
> >>>>> ipa-otpd Service: STOPPED
> >>>>> ipa: INFO: The ipactl command was successful
> >>>>>
> >>>>> Errors from the dirsrv logs show:
> >>>>>
> >>>>> : GSSAPI Error: Unspecified GSS failure. Minor code may provide more
> >>>>> information (No Kerberos credentials available)) errno 0 (Success)
> >>>>> [02/Oct/2015:11:45:05 -0400] slapi_ldap_bind - Error: could not perform
> >>>>> interactive bind for id [] authentication mechanism [GSSAPI]: error -2
> >>>>> (Local error)
> >>>>> [02/Oct/2015:11:50:05 -0400] set_krb5_creds - Could not get initial
> >>>>> credentials for principal [ldap/server@realm] in keytab
> >>>>> [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
> >>>>> requested realm)
> >>>>> [02/Oct/2015:11:50:05 -0400] slapd_ldap_sasl_interactive_bind - Error:
> >>>>> could not perform interactive bind for id [] mech [GSSAPI]: LDAP error
> >>>>> -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified
> >>>>> GSS failure. Minor code may provide more information (No Kerberos
> >>>>> credentials available)) errno 0 (Success)
> >>>>> [02/Oct/2015:11:50:05 -0400] slapi_ldap_bind - Error: could not perform
> >>>>> interactive bind for id [] authentication mechanism [GSSAPI]: error -2
> >>>>> (Local error)
> >>>>>
> >>>>> Attempting to re-initialize fails:
> >>>>>
> >>>>> ipa-replica-manage re-initialize --from master
> >>>>> Connection timed out.
> >>>>>
> >>>>> I verified time is in sync and DNS forward/reverse resolution is
> >>>>> working.
> >>>>>
> >>>>> Any pointers on what else to try?
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> --Andrew
> >>>>
> >>>> Given that your Kerberos server instance is down, I would start
> >>>> investigating the Kerberos logs to see why.
> >>>
> >>> So it looks like the dirsrv service comes up, but with GSS errors about
> >>> Kerberos credentials. However, the rest of the services, including the
> >>> krb5kdc, fail to come up. Errors from the KDC logs suggest DNS:
> >>
> >> DS complaining about GSS is somewhat normal during startup as it is a
> >> bit noisy. The other errors suggest there is no data in the backend. An
> >> ldapsearch would confirm that.
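A check along those lines might look like the following; this is only a
sketch, and dc=example,dc=com stands in for the actual IPA suffix:

ldapsearch -x -D "cn=Directory Manager" -W \
    -b "dc=example,dc=com" -s base "(objectClass=*)" numSubordinates

If the suffix entry is missing, or numSubordinates is 0 or absent, the
backend is effectively empty and a plain re-initialize has nothing to
work with.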
> >>
> >>>
> >>> LOOKING_UP_CLIENT: DNS/replica@REALM Server error
> >>>
> >>> FreeIPA is configured to serve DNS and this replica resolves its own
> >>> DNS in /etc/resolv.conf (127.0.0.1).
> >>>
> >>> I tried pointing /etc/resolv.conf to another (good) replica and even
> >>> tried adjusting /etc/krb5.conf to point at another KDC to try to get a
> >>> ticket; however, it still tries to connect to the local KDC (which fails
> >>> to start).
> >>>
> >>> I'm inclined to re-install this replica and start fresh. I'm curious if
> >>> we can re-kickstart this host from a fresh OS/FreeIPA install and run
> >>> the ipa-replica-manage re-initialize --from master command. The replica
> >>> will have the same name... is this possible? Would we need to back up
> >>> the /var/lib/ipa/replica-info-XXX.gpg file?
> >>
> >> It needs to have its own principal in order to re-initialize. It sounds
> >> like it has nothing, which is why replication is failing.
> >>
> >> I'd recommend generating a new replica file. There is no value in
> >> re-using the old one and it could be harmful if the certificates are
> >> expired.
> >>
> >> You'll need to delete all replication agreements this master had, and
> >> you'll need to use the --force option since it won't be accessible. When
> >> you re-install the master it will get all the current data as part of
> >> the setup, so there is no need to re-initialize after that.
> >
> > I force-removed the replica but am still seeing the RUVs show up.
> >
> > # ipa-replica-manage -v --force del srv-m14-30.cbls.ccr.buffalo.edu
> >
> > From the logs:
> >
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Initiating CleanAllRUV Task...
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Retrieving maxcsn...
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Found maxcsn (5600051d001000050000)
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Cleaning rid (5)...
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Waiting to process all the updates from the deleted replica...
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Waiting for all the replicas to be online...
> > [06/Oct/2015:07:43:47 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Waiting for all the replicas to receive all the deleted replica updates...
> > [06/Oct/2015:07:43:48 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Sending cleanAllRUV task to all the replicas...
> > [06/Oct/2015:07:43:48 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Cleaning local ruv's...
> > [06/Oct/2015:07:43:48 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Waiting for all the replicas to be cleaned...
> > [06/Oct/2015:07:43:48 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Replica is not cleaned yet
> > (agmt="cn=meTosrv-m14-31-02.cbls.ccr.buffalo.edu" (srv-m14-31-02:389))
> > [06/Oct/2015:07:43:48 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Replicas have not been cleaned yet, retrying in 10 seconds
> > [06/Oct/2015:07:43:59 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Waiting for all the replicas to finish cleaning...
> > [06/Oct/2015:07:43:59 -0400] NSMMReplicationPlugin - CleanAllRUV Task:
> > Successfully cleaned rid(5).
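For reference, the full remove-and-reinstall sequence Rob describes might
look roughly like this; the hostnames are taken from the thread, and the
exact ipa-replica-install options (e.g. --setup-ca, --setup-dns) depend on
how the replica was originally deployed:

# on a healthy master: drop the dead replica and its agreements
ipa-replica-manage del srv-m14-30.cbls.ccr.buffalo.edu --force

# generate a fresh replica file; do not reuse the old one
ipa-replica-prepare srv-m14-30.cbls.ccr.buffalo.edu

# copy /var/lib/ipa/replica-info-srv-m14-30.cbls.ccr.buffalo.edu.gpg to the
# reinstalled host, then on that host:
ipa-replica-install replica-info-srv-m14-30.cbls.ccr.buffalo.edu.gpg

The new install pulls a full copy of the data during setup, so no separate
re-initialize should be needed afterwards.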
> >
> > The replica is not showing up when running ipa-replica-manage list:
> >
> > # ipa-replica-manage list
> > srv-m14-32.cbls.ccr.buffalo.edu: master
> > srv-m14-31-02.cbls.ccr.buffalo.edu: master
> >
> > However, I'm still seeing the RUVs in ldapsearch:
> >
> > ldapsearch -Y GSSAPI -b "cn=mapping tree,cn=config" \
> >     objectClass=nsDS5ReplicationAgreement -LL
> >
> > nsds50ruv: {replica 5 ldap://srv-m14-30.cbls.ccr.buffalo.edu:389}
> >  55afec6b000000050000 55b2aa68000200050000
> >
> > ...
> >
> > nsds50ruv: {replica 91 ldap://srv-m14-30.cbls.ccr.buffalo.edu:389}
> >  55afecb00000005b0000 55b13e740000005b0000
> >
> > Should I clean these manually? Or can I run: ipa-replica-manage clean-ruv 5
> >
> > Thanks again for all the help.
> >
> > --Andrew
>
> Note that the list of masters comes from entries in IPA, not from
> replication agreements.
>
> ipa-replica-manage list-ruv will show the RUV data in a simpler way.
>
> Yeah, I'd use clean-ruv to clean them up.
>
> rob
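That check-and-clean pass might look like the following; replica IDs 5
and 91 are the ones shown for the deleted host in the ldapsearch output
above, but the IDs to clean are whatever list-ruv still reports for it:

# show the replica IDs this master still knows about
ipa-replica-manage list-ruv

# clean the stale IDs left behind by srv-m14-30
ipa-replica-manage clean-ruv 5
ipa-replica-manage clean-ruv 91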
I get an error trying to clean-ruv:

# ipa-replica-manage clean-ruv 5
Replica ID 5 not found

Can these safely be ignored? Or will we hit problems when adding the
replica back in?

Thanks again.
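If the leftover nsds50ruv values do need to be cleared by hand, one option
is to start the 389-ds CLEANALLRUV task directly against the suffix. This
is only a sketch: dc=example,dc=com stands in for the actual base DN, and
the same task would be repeated with replica-id: 91 for the other stale
entry:

ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=clean 5,cn=cleanallruv,cn=tasks,cn=config
changetype: add
objectclass: extensibleObject
cn: clean 5
replica-base-dn: dc=example,dc=com
replica-id: 5
replica-force-cleaning: no
EOF

Checking ipa-replica-manage list-ruv on each remaining master first would
show whether there is actually anything left to clean.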