On Thu, 2012-09-06 at 12:42 -0400, Mark Reynolds wrote:
>
> On 09/06/2012 12:27 PM, Martin Kosek wrote:
> > On 09/06/2012 06:13 PM, Rich Megginson wrote:
> >> On 09/06/2012 10:09 AM, Martin Kosek wrote:
> >>> On 09/06/2012 06:09 PM, Martin Kosek wrote:
> >>>> On 09/06/2012 06:05 PM, Martin Kosek wrote:
> >>>>> On 09/06/2012 05:55 PM, Rob Crittenden wrote:
> >>>>>> Rob Crittenden wrote:
> >>>>>>> Rob Crittenden wrote:
> >>>>>>>> Martin Kosek wrote:
> >>>>>>>>> On 09/05/2012 08:06 PM, Rob Crittenden wrote:
> >>>>>>>>>> Rob Crittenden wrote:
> >>>>>>>>>>> Martin Kosek wrote:
> >>>>>>>>>>>> On 07/05/2012 08:39 PM, Rob Crittenden wrote:
> >>>>>>>>>>>>> Martin Kosek wrote:
> >>>>>>>>>>>>>> On 07/03/2012 04:41 PM, Rob Crittenden wrote:
> >>>>>>>>>>>>>>> Deleting a replica can leave a replication vector (RUV) on
> >>>>>>>>>>>>>>> the other servers. This can confuse things if the replica
> >>>>>>>>>>>>>>> is re-added, and it also causes the server to calculate
> >>>>>>>>>>>>>>> changes against a server that may no longer exist.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 389-ds-base provides a new task that self-propagates to all
> >>>>>>>>>>>>>>> available replicas to clean this RUV data.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This patch will create this task at deletion time to
> >>>>>>>>>>>>>>> hopefully clean things up.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> It isn't perfect. If any replica is down or unavailable at
> >>>>>>>>>>>>>>> the time the cleanruv task fires, and then comes back up,
> >>>>>>>>>>>>>>> the old RUV data may be re-propagated around.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> To make things easier in this case I've added two new
> >>>>>>>>>>>>>>> commands to ipa-replica-manage.
> >>>>>>>>>>>>>>> The first lists the replication ids of all the servers we
> >>>>>>>>>>>>>>> have a RUV for. Using this you can call clean_ruv with the
> >>>>>>>>>>>>>>> replication id of a server that no longer exists to try the
> >>>>>>>>>>>>>>> cleanallruv step again.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This is quite dangerous though. If you run cleanruv against
> >>>>>>>>>>>>>>> a replica id that does exist it can cause a loss of data. I
> >>>>>>>>>>>>>>> believe I've put in enough scary warnings about this.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> rob
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Good work there, this should make cleaning RUVs much easier
> >>>>>>>>>>>>>> than with the previous version.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is what I found during review:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1) The list_ruv and clean_ruv command help is quite lost in
> >>>>>>>>>>>>>> the man page. I think it would help if we, for example, had
> >>>>>>>>>>>>>> all the info for commands indented. As it is, a user could
> >>>>>>>>>>>>>> easily overlook the new commands in the man page.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2) I would rename the new commands to clean-ruv and list-ruv
> >>>>>>>>>>>>>> to make them consistent with the rest of the commands
> >>>>>>>>>>>>>> (re-initialize, force-sync).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 3) It would be nice to be able to run the clean_ruv command
> >>>>>>>>>>>>>> in an unattended way (for better testing), i.e. respect the
> >>>>>>>>>>>>>> --force option as we already do for ipa-replica-manage del.
> >>>>>>>>>>>>>> This fix would aid test automation in the future.
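The --force handling requested in item 3) could be shaped roughly like this. This is only a sketch: `confirm_clean` is an illustrative name, and `user_input` stands in for `ipautil.user_input`; the real option wiring lives in ipa-replica-manage.

```python
# Sketch only: a clean_ruv confirmation that honors --force for unattended
# runs, as requested in item 3). "user_input" is a stand-in for
# ipautil.user_input; confirm_clean is a hypothetical helper name.

def confirm_clean(force, user_input):
    """Return True if the RUV cleaning should proceed.

    With force=True the interactive prompt is skipped entirely, which is
    what test automation needs; otherwise the user's answer (defaulting
    to "no") decides.
    """
    if force:
        return True
    return bool(user_input("Continue to clean?", False))

# Unattended run: no prompt function is ever called
assert confirm_clean(True, user_input=None)
```

The point of the design is that the prompt callable is never invoked on the forced path, so a scripted run cannot block on stdin.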
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 4) (minor) The new question (and the del one too) does not
> >>>>>>>>>>>>>> react too well to CTRL+D:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> # ipa-replica-manage clean_ruv 3 --force
> >>>>>>>>>>>>>> Clean the Replication Update Vector for
> >>>>>>>>>>>>>> vm-055.idm.lab.bos.redhat.com:389
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cleaning the wrong replica ID will cause that server to no
> >>>>>>>>>>>>>> longer replicate so it may miss updates while the process
> >>>>>>>>>>>>>> is running. It would need to be re-initialized to maintain
> >>>>>>>>>>>>>> consistency. Be very careful.
> >>>>>>>>>>>>>> Continue to clean? [no]: unexpected error:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 5) Help for the clean_ruv command without its required
> >>>>>>>>>>>>>> parameter is quite confusing, as it reports that the command
> >>>>>>>>>>>>>> is wrong and not the parameter:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> # ipa-replica-manage clean_ruv
> >>>>>>>>>>>>>> Usage: ipa-replica-manage [options]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ipa-replica-manage: error: must provide a command [clean_ruv |
> >>>>>>>>>>>>>> force-sync | disconnect | connect | del | re-initialize |
> >>>>>>>>>>>>>> list | list_ruv]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It seems you just forgot to specify the error message in the
> >>>>>>>>>>>>>> command definition.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 6) When the remote replica is down, the clean_ruv command
> >>>>>>>>>>>>>> fails with an unexpected error:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [root@vm-086 ~]# ipa-replica-manage clean_ruv 5
> >>>>>>>>>>>>>> Clean the Replication Update Vector for
> >>>>>>>>>>>>>> vm-055.idm.lab.bos.redhat.com:389
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cleaning the wrong replica ID will cause that server to no
> >>>>>>>>>>>>>> longer replicate so it may miss updates while the process
> >>>>>>>>>>>>>> is running.
> >>>>>>>>>>>>>> It would need to be re-initialized to maintain
> >>>>>>>>>>>>>> consistency. Be very careful.
> >>>>>>>>>>>>>> Continue to clean? [no]: y
> >>>>>>>>>>>>>> unexpected error: {'desc': 'Operations error'}
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors:
> >>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin -
> >>>>>>>>>>>>>> cleanAllRUV_task: failed to connect to repl agreement
> >>>>>>>>>>>>>> connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,
> >>>>>>>>>>>>>> cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping
> >>>>>>>>>>>>>> tree,cn=config), error 105
> >>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin -
> >>>>>>>>>>>>>> cleanAllRUV_task: replica (cn=meTovm-055.idm.lab.
> >>>>>>>>>>>>>> bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping
> >>>>>>>>>>>>>> tree, cn=config) has not been cleaned. You will need to
> >>>>>>>>>>>>>> rerun the CLEANALLRUV task on this replica.
> >>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin -
> >>>>>>>>>>>>>> cleanAllRUV_task: Task failed (1)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In this case I think we should inform the user that the
> >>>>>>>>>>>>>> command failed, possibly because of disconnected replicas,
> >>>>>>>>>>>>>> and that they could enable the replicas and try again.
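The friendlier failure asked for in 6) could be shaped like the sketch below. Heavy hedging applies: `OperationsError` and `run_cleanallruv` are hypothetical stand-ins for the real python-ldap exception and the replication helper, and the message text is only an example.

```python
# Sketch of the error reporting suggested in 6): translate the opaque
# "Operations error" raised while creating the CLEANALLRUV task into a
# hint about disconnected replicas. OperationsError and run_cleanallruv
# are hypothetical stand-ins, not the actual ipaserver API.

class OperationsError(Exception):
    """Stand-in for the LDAP 'Operations error' exception."""

def clean_ruv_with_hint(run_cleanallruv):
    """Run the cleanup, turning an operations error into useful advice."""
    try:
        run_cleanallruv()
    except OperationsError:
        return ("CLEANALLRUV task failed, possibly because some replica "
                "is down or disconnected. Re-enable the replicas and try "
                "again.")
    return "Background task created to clean replication data"
```

The user-visible behavior is the same as today on success; only the failure path changes from a raw traceback to actionable advice.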
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 7) (minor) "pass" is now redundant in replication.py:
> >>>>>>>>>>>>>> +        except ldap.INSUFFICIENT_ACCESS:
> >>>>>>>>>>>>>> +            # We can't make the server we're removing
> >>>>>>>>>>>>>> +            # read-only but this isn't a show-stopper
> >>>>>>>>>>>>>> +            root_logger.debug("No permission to switch
> >>>>>>>>>>>>>> replica to read-only, continuing anyway")
> >>>>>>>>>>>>>> +            pass
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> I think this addresses everything.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> rob
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks, almost there! I just found one more issue which needs
> >>>>>>>>>>>> to be fixed before we push:
> >>>>>>>>>>>>
> >>>>>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
> >>>>>>>>>>>> Directory Manager password:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Unable to connect to replica vm-055.idm.lab.bos.redhat.com,
> >>>>>>>>>>>> forcing removal
> >>>>>>>>>>>> Failed to get data from 'vm-055.idm.lab.bos.redhat.com':
> >>>>>>>>>>>> {'desc': "Can't contact LDAP server"}
> >>>>>>>>>>>> Forcing removal on 'vm-086.idm.lab.bos.redhat.com'
> >>>>>>>>>>>>
> >>>>>>>>>>>> There were issues removing a connection: %d format: a number
> >>>>>>>>>>>> is required, not str
> >>>>>>>>>>>>
> >>>>>>>>>>>> Failed to get data from 'vm-055.idm.lab.bos.redhat.com':
> >>>>>>>>>>>> {'desc': "Can't contact LDAP server"}
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is a traceback I retrieved:
> >>>>>>>>>>>> Traceback (most recent call last):
> >>>>>>>>>>>>   File "/sbin/ipa-replica-manage", line 425, in del_master
> >>>>>>>>>>>>     del_link(realm, r, hostname, options.dirman_passwd,
> >>>>>>>>>>>> force=True)
> >>>>>>>>>>>>   File "/sbin/ipa-replica-manage", line 271, in del_link
> >>>>>>>>>>>>     repl1.cleanallruv(replica_id)
> >>>>>>>>>>>>   File
> >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py",
> >>>>>>>>>>>> line 1094, in cleanallruv
> >>>>>>>>>>>>     root_logger.debug("Creating CLEANALLRUV task for replica
> >>>>>>>>>>>> id %d" % replicaId)
> >>>>>>>>>>>>
> >>>>>>>>>>>> The problem here is that you don't convert replica_id to int
> >>>>>>>>>>>> in this part:
> >>>>>>>>>>>> +    replica_id = None
> >>>>>>>>>>>> +    if repl2:
> >>>>>>>>>>>> +        replica_id = repl2._get_replica_id(repl2.conn, None)
> >>>>>>>>>>>> +    else:
> >>>>>>>>>>>> +        servers = get_ruv(realm, replica1, dirman_passwd)
> >>>>>>>>>>>> +        for (netloc, rid) in servers:
> >>>>>>>>>>>> +            if netloc.startswith(replica2):
> >>>>>>>>>>>> +                replica_id = rid
> >>>>>>>>>>>> +                break
> >>>>>>>>>>>>
> >>>>>>>>>>>> Martin
> >>>>>>>>>>>>
> >>>>>>>>>>> Updated patch using new mechanism in 389-ds-base. This should
> >>>>>>>>>>> more thoroughly clean out RUV data when a replica is being
> >>>>>>>>>>> deleted, and provide for a way to delete RUV data afterwards
> >>>>>>>>>>> too if necessary.
> >>>>>>>>>>>
> >>>>>>>>>>> rob
> >>>>>>>>>> Rebased patch
> >>>>>>>>>>
> >>>>>>>>>> rob
> >>>>>>>>>>
> >>>>>>>>> 0) As I wrote in a review for your patch 1041, the changelog
> >>>>>>>>> entry slipped elsewhere.
> >>>>>>>>>
> >>>>>>>>> 1) The following KeyboardInterrupt except clause looks
> >>>>>>>>> suspicious. I know why you have it there, but since it is
> >>>>>>>>> generally a bad thing to do, some comment on why it is needed
> >>>>>>>>> would be useful.
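The %d failure Martin traces above comes down to a missing `int()` cast on the rid taken from the RUV listing. A minimal sketch of the fix follows; the `(netloc, rid)` tuples mirror the `get_ruv()` snippet he quotes, while `find_replica_id` and the sample data are illustration scaffolding, not the actual patch.

```python
# Sketch of the replica_id fix: rids coming back from the RUV listing are
# strings, so they must be cast to int before cleanallruv formats them
# with "%d". find_replica_id is a hypothetical helper mirroring the loop
# in the quoted snippet.

def find_replica_id(servers, replica2):
    """Return the replica id (as an int) for the host matching replica2."""
    for netloc, rid in servers:
        if netloc.startswith(replica2):
            return int(rid)  # the missing conversion behind the traceback
    return None

# Illustrative data: the rid arrives as the string "5"
servers = [("vm-055.idm.lab.bos.redhat.com:389", "5")]
rid = find_replica_id(servers, "vm-055.idm.lab.bos.redhat.com")

# With an int rid, the debug line no longer raises
# "TypeError: %d format: a number is required, not str"
message = "Creating CLEANALLRUV task for replica id %d" % rid
```

Without the cast, `"%d" % "5"` raises exactly the TypeError surfaced in the `ipa-replica-manage del` output above.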
> >>>>>>>>>
> >>>>>>>>> @@ -256,6 +263,17 @@ def del_link(realm, replica1, replica2,
> >>>>>>>>> dirman_passwd, force=False):
> >>>>>>>>>          repl1.delete_agreement(replica2)
> >>>>>>>>>          repl1.delete_referral(replica2)
> >>>>>>>>>
> >>>>>>>>> +    if type1 == replication.IPA_REPLICA:
> >>>>>>>>> +        if repl2:
> >>>>>>>>> +            ruv = repl2._get_replica_id(repl2.conn, None)
> >>>>>>>>> +        else:
> >>>>>>>>> +            ruv = get_ruv_by_host(realm, replica1, replica2,
> >>>>>>>>> dirman_passwd)
> >>>>>>>>> +
> >>>>>>>>> +        try:
> >>>>>>>>> +            repl1.cleanallruv(ruv)
> >>>>>>>>> +        except KeyboardInterrupt:
> >>>>>>>>> +            pass
> >>>>>>>>> +
> >>>>>>>>>
> >>>>>>>>> Maybe you just wanted to do some cleanup and then "raise" again?
> >>>>>>>> No, it is there because it is safe to break out of it. The task
> >>>>>>>> will continue to run. I added some verbiage.
> >>>>>>>>
> >>>>>>>>> 2) This is related to 1): when some remote replica is down,
> >>>>>>>>> "ipa-replica-manage del" may wait indefinitely, right?
> >>>>>>>>>
> >>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com
> >>>>>>>>> Deleting a master is irreversible.
> >>>>>>>>> To reconnect to the remote master you will need to prepare a new
> >>>>>>>>> replica file and re-install.
> >>>>>>>>> Continue to delete?
> >>>>>>>>> [no]: y
> >>>>>>>>> ipa: INFO: Setting agreement
> >>>>>>>>> cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping
> >>>>>>>>> tree,cn=config schedule to 2358-2359 0 to force synch
> >>>>>>>>> ipa: INFO: Deleting schedule 2358-2359 0 from agreement
> >>>>>>>>> cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping
> >>>>>>>>> tree,cn=config
> >>>>>>>>> ipa: INFO: Replication Update in progress: FALSE: status: 0
> >>>>>>>>> Replica acquired successfully: Incremental update succeeded:
> >>>>>>>>> start: 0: end: 0
> >>>>>>>>> Background task created to clean replication data
> >>>>>>>>>
> >>>>>>>>> ... after about a minute I hit CTRL+C
> >>>>>>>>>
> >>>>>>>>> ^CDeleted replication agreement from
> >>>>>>>>> 'vm-086.idm.lab.bos.redhat.com' to 'vm-055.idm.lab.bos.redhat.com'
> >>>>>>>>> Failed to cleanup vm-055.idm.lab.bos.redhat.com DNS entries: NS
> >>>>>>>>> record does not contain 'vm-055.idm.lab.bos.redhat.com.'
> >>>>>>>>> You may need to manually remove them from the tree
> >>>>>>>>>
> >>>>>>>>> I think it would be better to inform the user that some remote
> >>>>>>>>> replica is down, or at least that we are waiting for the task to
> >>>>>>>>> complete. Something like this:
> >>>>>>>>>
> >>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com
> >>>>>>>>> ...
> >>>>>>>>> Background task created to clean replication data
> >>>>>>>>> Replication data clean up may take a very long time if some
> >>>>>>>>> replica is unreachable
> >>>>>>>>> Hit CTRL+C to interrupt the wait
> >>>>>>>>> ^C Clean up wait interrupted
> >>>>>>>>> ....
> >>>>>>>>> [continue with del]
> >>>>>>>> Yup, did this in #1.
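The behavior agreed on above, an interruptible wait where Ctrl+C is harmless because the CLEANALLRUV task keeps running inside the directory server, can be sketched as follows. `wait_for_clean` and `wait_for_task` are hypothetical stand-ins for the real code; the printed messages follow Martin's suggested wording.

```python
# Sketch of the interruptible wait: the CLEANALLRUV task runs server-side,
# so interrupting the client-side wait only stops the waiting, not the
# cleanup. wait_for_task is a hypothetical stand-in for the blocking
# task-monitoring call.

def wait_for_clean(wait_for_task):
    """Wait for the cleanup task; Ctrl+C aborts the wait, not the task."""
    print("Background task created to clean replication data")
    print("This may take a very long time if some replica is unreachable")
    print("Hit CTRL+C to interrupt the wait")
    try:
        wait_for_task()
    except KeyboardInterrupt:
        # Safe to swallow: the server-side task continues to completion,
        # which is why no re-raise is needed here.
        print("Clean up wait interrupted")
    return True  # the del operation continues either way
```

This also documents in code why the bare `except KeyboardInterrupt` from the diff is deliberate rather than a cleanup-and-reraise pattern.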
> >>>>>>>>
> >>>>>>>>> 3) (minor) When there is a cleanruv task running and you run
> >>>>>>>>> "ipa-replica-manage del", there is an unexpected error message
> >>>>>>>>> about a duplicate task object in LDAP:
> >>>>>>>>>
> >>>>>>>>> # ipa-replica-manage del vm-072.idm.lab.bos.redhat.com --force
> >>>>>>>>> Unable to connect to replica vm-072.idm.lab.bos.redhat.com,
> >>>>>>>>> forcing removal
> >>>>>>>>> FAIL
> >>>>>>>>> Failed to get data from 'vm-072.idm.lab.bos.redhat.com':
> >>>>>>>>> {'desc': "Can't contact LDAP server"}
> >>>>>>>>> Forcing removal on 'vm-086.idm.lab.bos.redhat.com'
> >>>>>>>>>
> >>>>>>>>> There were issues removing a connection: This entry already
> >>>>>>>>> exists <<<<<<<<<
> >>>>>>>>>
> >>>>>>>>> Failed to get data from 'vm-072.idm.lab.bos.redhat.com':
> >>>>>>>>> {'desc': "Can't contact LDAP server"}
> >>>>>>>>> Failed to cleanup vm-072.idm.lab.bos.redhat.com DNS entries: NS
> >>>>>>>>> record does not contain 'vm-072.idm.lab.bos.redhat.com.'
> >>>>>>>>> You may need to manually remove them from the tree
> >>>>>>>>>
> >>>>>>>>> I think it should be enough to just catch "entry already exists"
> >>>>>>>>> in the cleanallruv function, and in such a case print a relevant
> >>>>>>>>> error message and bail out. Thus,
> >>>>>>>>> self.conn.checkTask(dn, dowait=True) would not be called either.
> >>>>>>>> Good catch, fixed.
> >>>>>>>>
> >>>>>>>>> 4) (minor) In the make_readonly function, there is a redundant
> >>>>>>>>> "pass" statement:
> >>>>>>>>>
> >>>>>>>>> +    def make_readonly(self):
> >>>>>>>>> +        """
> >>>>>>>>> +        Make the current replication agreement read-only.
> >>>>>>>>> +        """
> >>>>>>>>> +        dn = DN(('cn', 'userRoot'), ('cn', 'ldbm database'),
> >>>>>>>>> +                ('cn', 'plugins'), ('cn', 'config'))
> >>>>>>>>> +
> >>>>>>>>> +        mod = [(ldap.MOD_REPLACE, 'nsslapd-readonly', 'on')]
> >>>>>>>>> +        try:
> >>>>>>>>> +            self.conn.modify_s(dn, mod)
> >>>>>>>>> +        except ldap.INSUFFICIENT_ACCESS:
> >>>>>>>>> +            # We can't make the server we're removing read-only
> >>>>>>>>> +            # but this isn't a show-stopper
> >>>>>>>>> +            root_logger.debug("No permission to switch replica
> >>>>>>>>> to read-only, continuing anyway")
> >>>>>>>>> +            pass <<<<<<<<<<<<<<<
> >>>>>>>> Yeah, this is one of my common mistakes. I put in a pass
> >>>>>>>> initially, then add logging in front of it and forget to delete
> >>>>>>>> the pass. It's gone now.
> >>>>>>>>
> >>>>>>>>> 5) In clean_ruv, I think allowing a --force option to bypass the
> >>>>>>>>> user_input would be helpful (at least for test automation):
> >>>>>>>>>
> >>>>>>>>> +    if not ipautil.user_input("Continue to clean?", False):
> >>>>>>>>> +        sys.exit("Aborted")
> >>>>>>>> Yup, added.
> >>>>>>>>
> >>>>>>>> rob
> >>>>>>> Slightly revised patch. I still had a window open with one unsaved
> >>>>>>> change.
> >>>>>>>
> >>>>>>> rob
> >>>>>>>
> >>>>>> Apparently there were two unsaved changes, one of which was lost.
> >>>>>> This adds in the 'entry already exists' fix.
> >>>>>>
> >>>>>> rob
> >>>>>>
> >>>>> Just one last thing (otherwise the patch is OK) - I don't think this
> >>>>> is what we want :-)
> >>>>>
> >>>>> # ipa-replica-manage clean-ruv 8
> >>>>> Clean the Replication Update Vector for
> >>>>> vm-055.idm.lab.bos.redhat.com:389
> >>>>>
> >>>>> Cleaning the wrong replica ID will cause that server to no
> >>>>> longer replicate so it may miss updates while the process
> >>>>> is running. It would need to be re-initialized to maintain
> >>>>> consistency. Be very careful.
> >>>>> Continue to clean?
> >>>>> [no]: y <<<<<<
> >>>>> Aborted
> >>>>>
> >>>>> Nor this exception (you are checking for the wrong exception):
> >>>>>
> >>>>> # ipa-replica-manage clean-ruv 8
> >>>>> Clean the Replication Update Vector for
> >>>>> vm-055.idm.lab.bos.redhat.com:389
> >>>>>
> >>>>> Cleaning the wrong replica ID will cause that server to no
> >>>>> longer replicate so it may miss updates while the process
> >>>>> is running. It would need to be re-initialized to maintain
> >>>>> consistency. Be very careful.
> >>>>> Continue to clean? [no]:
> >>>>> unexpected error: This entry already exists
> >>>>>
> >>>>> This is the exception:
> >>>>>
> >>>>> Traceback (most recent call last):
> >>>>>   File "/sbin/ipa-replica-manage", line 651, in <module>
> >>>>>     main()
> >>>>>   File "/sbin/ipa-replica-manage", line 648, in main
> >>>>>     clean_ruv(realm, args[1], options)
> >>>>>   File "/sbin/ipa-replica-manage", line 373, in clean_ruv
> >>>>>     thisrepl.cleanallruv(ruv)
> >>>>>   File
> >>>>> "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py",
> >>>>> line 1136, in cleanallruv
> >>>>>     self.conn.addEntry(e)
> >>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py",
> >>>>> line 503, in addEntry
> >>>>>     self.__handle_errors(e, arg_desc=arg_desc)
> >>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py",
> >>>>> line 321, in __handle_errors
> >>>>>     raise errors.DuplicateEntry()
> >>>>> ipalib.errors.DuplicateEntry: This entry already exists
> >>>>>
> >>>>> Martin
> >>>>>
> >>>> On another matter, I just noticed that CLEANRUV is not proceeding if
> >>>> I have a winsync replica defined (and it is even up):
> >>>>
> >>>> # ipa-replica-manage list
> >>>> dc.ad.test: winsync <<<<<<<
> >>>> vm-072.idm.lab.bos.redhat.com: master
> >>>> vm-086.idm.lab.bos.redhat.com: master
> >>>>
> >>>> [06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Waiting for all the replicas to receive all the deleted
> >>>> replica updates...
> >>>> [06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Failed to contact agmt (agmt="cn=meTodc.ad.test" (dc:389))
> >>>> error (10), will retry later.
> >>>> [06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Not all replicas caught up, retrying in 10 seconds
> >>>> [06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Failed to contact agmt (agmt="cn=meTodc.ad.test" (dc:389))
> >>>> error (10), will retry later.
> >>>> [06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Not all replicas caught up, retrying in 20 seconds
> >>>> [06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Failed to contact agmt (agmt="cn=meTodc.ad.test" (dc:389))
> >>>> error (10), will retry later.
> >>>> [06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV
> >>>> Task: Not all replicas caught up, retrying in 40 seconds
> >>>>
> >>>> I don't think this is OK. Adding Rich to CC to follow up on this one.
> >>>>
> >>>> Martin
> >>>>
> >>> And now the actual CC.
> >>> Martin
> >>>
> >> Yeah - looks like CLEANALLRUV needs to ignore windows sync agreements.
> > Have you created a 389 ticket for that, or would you like me to do that?
> If you could, that would be great.
https://fedorahosted.org/389/ticket/450

Martin

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel