On 09/14/2012 09:17 PM, Rob Crittenden wrote:
> Martin Kosek wrote:
>> On 09/06/2012 11:17 PM, Rob Crittenden wrote:
>>> Martin Kosek wrote:
>>>> On 09/06/2012 05:55 PM, Rob Crittenden wrote:
>>>>> Rob Crittenden wrote:
>>>>>> Rob Crittenden wrote:
>>>>>>> Martin Kosek wrote:
>>>>>>>> On 09/05/2012 08:06 PM, Rob Crittenden wrote:
>>>>>>>>> Rob Crittenden wrote:
>>>>>>>>>> Martin Kosek wrote:
>>>>>>>>>>> On 07/05/2012 08:39 PM, Rob Crittenden wrote:
>>>>>>>>>>>> Martin Kosek wrote:
>>>>>>>>>>>>> On 07/03/2012 04:41 PM, Rob Crittenden wrote:
>>>>>>>>>>>>>> Deleting a replica can leave a Replication Update Vector (RUV) on the other servers. This can confuse things if the replica is re-added, and it also causes the server to calculate changes against a server that may no longer exist.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 389-ds-base provides a new task that propagates itself to all available replicas to clean this RUV data.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This patch creates that task at deletion time to hopefully clean things up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It isn't perfect. If any replica is down or unavailable at the time the cleanruv task fires, and then comes back up, the old RUV data may be re-propagated around.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To make things easier in this case I've added two new commands to ipa-replica-manage. The first lists the replication ids of all the servers we have a RUV for. Using this you can call clean_ruv with the replication id of a server that no longer exists to try the cleanallruv step again.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is quite dangerous though.
>>>>>>>>>>>>>> If you run cleanruv against a replica id that does exist it can cause a loss of data. I believe I've put in enough scary warnings about this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> rob
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good work there, this should make cleaning RUVs much easier than with the previous version.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is what I found during review:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) The list_ruv and clean_ruv command help is quite lost in the man page. I think it would help if we, for example, had all the info for commands indented. As it is, a user could easily overlook the new commands in the man page.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) I would rename the new commands to clean-ruv and list-ruv to make them consistent with the rest of the commands (re-initialize, force-sync).
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) It would be nice to be able to run the clean_ruv command in an unattended way (for better testing), i.e. respect the --force option as we already do for ipa-replica-manage del. This fix would aid test automation in the future.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 4) (minor) The new question (and the del one too) does not react too well to CTRL+D:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # ipa-replica-manage clean_ruv 3 --force
>>>>>>>>>>>>> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
>>>>>>>>>>>>> Continue to clean?
>>>>>>>>>>>>> [no]: unexpected error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 5) Help for the clean_ruv command without its required parameter is quite confusing, as it reports that the command is wrong rather than the parameter:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # ipa-replica-manage clean_ruv
>>>>>>>>>>>>> Usage: ipa-replica-manage [options]
>>>>>>>>>>>>>
>>>>>>>>>>>>> ipa-replica-manage: error: must provide a command [clean_ruv | force-sync | disconnect | connect | del | re-initialize | list | list_ruv]
>>>>>>>>>>>>>
>>>>>>>>>>>>> It seems you just forgot to specify the error message in the command definition.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 6) When the remote replica is down, the clean_ruv command fails with an unexpected error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> [root@vm-086 ~]# ipa-replica-manage clean_ruv 5
>>>>>>>>>>>>> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
>>>>>>>>>>>>> Continue to clean?
>>>>>>>>>>>>> [no]: y
>>>>>>>>>>>>> unexpected error: {'desc': 'Operations error'}
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/dirsrv/slapd-IDM-LAB-BOS-REDHAT-COM/errors:
>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: failed to connect to repl agreement connection (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config), error 105
>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: replica (cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\3Didm\2Cdc\3Dlab\2Cdc\3Dbos\2Cdc\3Dredhat\2Cdc\3Dcom,cn=mapping tree,cn=config) has not been cleaned. You will need to rerun the CLEANALLRUV task on this replica.
>>>>>>>>>>>>> [04/Jul/2012:06:28:16 -0400] NSMMReplicationPlugin - cleanAllRUV_task: Task failed (1)
>>>>>>>>>>>>>
>>>>>>>>>>>>> In this case I think we should inform the user that the command failed, possibly because of disconnected replicas, and that they could enable the replicas and try again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 7) (minor) "pass" is now redundant in replication.py:
>>>>>>>>>>>>> +        except ldap.INSUFFICIENT_ACCESS:
>>>>>>>>>>>>> +            # We can't make the server we're removing read-only but
>>>>>>>>>>>>> +            # this isn't a show-stopper
>>>>>>>>>>>>> +            root_logger.debug("No permission to switch replica to read-only, continuing anyway")
>>>>>>>>>>>>> +            pass
>>>>>>>>>>>>
>>>>>>>>>>>> I think this addresses everything.
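Point 5 above amounts to validating each command's positional arguments before falling back to the generic "must provide a command" message. A minimal, stand-alone sketch of that idea — the command table and message wording here are illustrative, not the actual ipa-replica-manage code:

```python
# Sketch: per-command argument validation, so that a missing REPLICA_ID
# produces a specific error instead of the generic "must provide a command".
# REQUIRED_ARGS is illustrative, not the real ipa-replica-manage table.
REQUIRED_ARGS = {"clean_ruv": 1, "del": 1, "re-initialize": 0, "list": 0}

def validate_args(args):
    """Return an error string, or None if the command line is valid."""
    if not args:
        return "must provide a command [%s]" % " | ".join(sorted(REQUIRED_ARGS))
    cmd = args[0]
    if cmd not in REQUIRED_ARGS:
        return "unknown command %r" % cmd
    if len(args) - 1 < REQUIRED_ARGS[cmd]:
        # The command itself is fine; only its parameter is missing.
        return "must provide the replication ID to %s" % cmd
    return None
```

With this shape, `validate_args(["clean_ruv"])` complains about the missing replication ID rather than rejecting the command outright.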
>>>>>>>>>>>>
>>>>>>>>>>>> rob
>>>>>>>>>>>
>>>>>>>>>>> Thanks, almost there! I just found one more issue which needs to be fixed before we push:
>>>>>>>>>>>
>>>>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force
>>>>>>>>>>> Directory Manager password:
>>>>>>>>>>>
>>>>>>>>>>> Unable to connect to replica vm-055.idm.lab.bos.redhat.com, forcing removal
>>>>>>>>>>> Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"}
>>>>>>>>>>> Forcing removal on 'vm-086.idm.lab.bos.redhat.com'
>>>>>>>>>>>
>>>>>>>>>>> There were issues removing a connection: %d format: a number is required, not str
>>>>>>>>>>>
>>>>>>>>>>> Failed to get data from 'vm-055.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"}
>>>>>>>>>>>
>>>>>>>>>>> This is a traceback I retrieved:
>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>   File "/sbin/ipa-replica-manage", line 425, in del_master
>>>>>>>>>>>     del_link(realm, r, hostname, options.dirman_passwd, force=True)
>>>>>>>>>>>   File "/sbin/ipa-replica-manage", line 271, in del_link
>>>>>>>>>>>     repl1.cleanallruv(replica_id)
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 1094, in cleanallruv
>>>>>>>>>>>     root_logger.debug("Creating CLEANALLRUV task for replica id %d" % replicaId)
>>>>>>>>>>>
>>>>>>>>>>> The problem here is that you don't convert replica_id to int in this part:
>>>>>>>>>>> +    replica_id = None
>>>>>>>>>>> +    if repl2:
>>>>>>>>>>> +        replica_id = repl2._get_replica_id(repl2.conn, None)
>>>>>>>>>>> +    else:
>>>>>>>>>>> +        servers = get_ruv(realm, replica1, dirman_passwd)
>>>>>>>>>>> +        for (netloc, rid) in servers:
>>>>>>>>>>> +            if netloc.startswith(replica2):
>>>>>>>>>>> +                replica_id = rid
>>>>>>>>>>> +                break
>>>>>>>>>>>
>>>>>>>>>>> Martin
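The root cause Martin describes is a classic type mismatch: the replica id parsed out of the RUV comes back as a string, while the log format expects an integer, so `"%d" % rid` raises the very TypeError seen above. A minimal reproduction and fix in isolation (the values here are illustrative):

```python
# The RUV lookup yields the replica id as a string, so "%d" raises
# "TypeError: %d format: a number is required, not str".
rid_from_ruv = "5"  # illustrative value, as parsed from a RUV entry

try:
    "Creating CLEANALLRUV task for replica id %d" % rid_from_ruv
    failed = False
except TypeError:
    failed = True

# The fix is to normalize to int as soon as the value is parsed:
replica_id = int(rid_from_ruv)
msg = "Creating CLEANALLRUV task for replica id %d" % replica_id
```

Coercing at the parse site (rather than at every format call) keeps the rest of the code honest about the value's type.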
>>>>>>>>>>
>>>>>>>>>> Updated patch using the new mechanism in 389-ds-base. This should more thoroughly clean out RUV data when a replica is being deleted, and also provide a way to delete RUV data afterwards if necessary.
>>>>>>>>>>
>>>>>>>>>> rob
>>>>>>>>>
>>>>>>>>> Rebased patch
>>>>>>>>>
>>>>>>>>> rob
>>>>>>>>>
>>>>>>>>
>>>>>>>> 0) As I wrote in a review for your patch 1041, the changelog entry slipped elsewhere.
>>>>>>>>
>>>>>>>> 1) The following KeyboardInterrupt except clause looks suspicious. I know why you have it there, but since it is generally a bad thing to do, some comment on why it is needed would be useful.
>>>>>>>>
>>>>>>>> @@ -256,6 +263,17 @@ def del_link(realm, replica1, replica2, dirman_passwd, force=False):
>>>>>>>>          repl1.delete_agreement(replica2)
>>>>>>>>          repl1.delete_referral(replica2)
>>>>>>>>
>>>>>>>> +    if type1 == replication.IPA_REPLICA:
>>>>>>>> +        if repl2:
>>>>>>>> +            ruv = repl2._get_replica_id(repl2.conn, None)
>>>>>>>> +        else:
>>>>>>>> +            ruv = get_ruv_by_host(realm, replica1, replica2, dirman_passwd)
>>>>>>>> +
>>>>>>>> +        try:
>>>>>>>> +            repl1.cleanallruv(ruv)
>>>>>>>> +        except KeyboardInterrupt:
>>>>>>>> +            pass
>>>>>>>>
>>>>>>>> Maybe you just wanted to do some cleanup and then "raise" again?
>>>>>>>
>>>>>>> No, it is there because it is safe to break out of it. The task will continue to run. I added some verbiage.
>>>>>>>
>>>>>>>>
>>>>>>>> 2) This is related to 1): when some remote replica is down, "ipa-replica-manage del" may wait indefinitely, right?
>>>>>>>>
>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com
>>>>>>>> Deleting a master is irreversible.
>>>>>>>> To reconnect to the remote master you will need to prepare a new replica file and re-install.
>>>>>>>> Continue to delete?
>>>>>>>> [no]: y
>>>>>>>> ipa: INFO: Setting agreement cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
>>>>>>>> ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config
>>>>>>>> ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
>>>>>>>> Background task created to clean replication data
>>>>>>>>
>>>>>>>> ... after about a minute I hit CTRL+C
>>>>>>>>
>>>>>>>> ^CDeleted replication agreement from 'vm-086.idm.lab.bos.redhat.com' to 'vm-055.idm.lab.bos.redhat.com'
>>>>>>>> Failed to cleanup vm-055.idm.lab.bos.redhat.com DNS entries: NS record does not contain 'vm-055.idm.lab.bos.redhat.com.'
>>>>>>>> You may need to manually remove them from the tree
>>>>>>>>
>>>>>>>> I think it would be better to inform the user that some remote replica is down, or at least that we are waiting for the task to complete. Something like this:
>>>>>>>>
>>>>>>>> # ipa-replica-manage del vm-055.idm.lab.bos.redhat.com
>>>>>>>> ...
>>>>>>>> Background task created to clean replication data
>>>>>>>> Replication data clean up may take a very long time if some replica is unreachable
>>>>>>>> Hit CTRL+C to interrupt the wait
>>>>>>>> ^C Clean up wait interrupted
>>>>>>>> ....
>>>>>>>> [continue with del]
>>>>>>>
>>>>>>> Yup, did this in #1.
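The interrupt-friendly wait agreed on above can be sketched as a small wrapper: the CLEANALLRUV task runs server-side, so swallowing KeyboardInterrupt only abandons the client-side wait, never the task itself. Function and parameter names here are illustrative (in IPA the wait is the `checkTask` call), not the actual implementation:

```python
def wait_for_clean_task(check_task):
    """Wait for a server-side cleanup task to finish.

    Ctrl+C only stops the wait, not the task, so it is safe to catch
    KeyboardInterrupt here and tell the user the task keeps running.
    check_task is any callable that blocks until the task completes.
    """
    print("Background task created to clean replication data. "
          "This may take a while.")
    print("This may be safely interrupted with Ctrl+C")
    try:
        check_task()
    except KeyboardInterrupt:
        return ("Wait for task interrupted. "
                "It will continue to run in the background")
    return "Replication data cleaned"
```

The key design point is that the except clause carries a user-visible message, so the silent-looking `except KeyboardInterrupt: pass` from the diff above becomes self-explanatory.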
>>>>>>>
>>>>>>>>
>>>>>>>> 3) (minor) When there is a cleanruv task running and you run "ipa-replica-manage del", there is an unexpected error message about a duplicate task object in LDAP:
>>>>>>>>
>>>>>>>> # ipa-replica-manage del vm-072.idm.lab.bos.redhat.com --force
>>>>>>>> Unable to connect to replica vm-072.idm.lab.bos.redhat.com, forcing removal
>>>>>>>> FAIL
>>>>>>>> Failed to get data from 'vm-072.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"}
>>>>>>>> Forcing removal on 'vm-086.idm.lab.bos.redhat.com'
>>>>>>>>
>>>>>>>> There were issues removing a connection: This entry already exists <<<<<<<<<
>>>>>>>>
>>>>>>>> Failed to get data from 'vm-072.idm.lab.bos.redhat.com': {'desc': "Can't contact LDAP server"}
>>>>>>>> Failed to cleanup vm-072.idm.lab.bos.redhat.com DNS entries: NS record does not contain 'vm-072.idm.lab.bos.redhat.com.'
>>>>>>>> You may need to manually remove them from the tree
>>>>>>>>
>>>>>>>> I think it should be enough to just catch "entry already exists" in the cleanallruv function, and in such a case print a relevant error message and bail out. Thus, self.conn.checkTask(dn, dowait=True) would not be called either.
>>>>>>>
>>>>>>> Good catch, fixed.
>>>>>>>
>>>>>>>>
>>>>>>>> 4) (minor) In the make_readonly function, there is a redundant "pass" statement:
>>>>>>>>
>>>>>>>> +    def make_readonly(self):
>>>>>>>> +        """
>>>>>>>> +        Make the current replication agreement read-only.
>>>>>>>> +        """
>>>>>>>> +        dn = DN(('cn', 'userRoot'), ('cn', 'ldbm database'),
>>>>>>>> +                ('cn', 'plugins'), ('cn', 'config'))
>>>>>>>> +
>>>>>>>> +        mod = [(ldap.MOD_REPLACE, 'nsslapd-readonly', 'on')]
>>>>>>>> +        try:
>>>>>>>> +            self.conn.modify_s(dn, mod)
>>>>>>>> +        except ldap.INSUFFICIENT_ACCESS:
>>>>>>>> +            # We can't make the server we're removing read-only but
>>>>>>>> +            # this isn't a show-stopper
>>>>>>>> +            root_logger.debug("No permission to switch replica to read-only, continuing anyway")
>>>>>>>> +            pass <<<<<<<<<<<<<<<
>>>>>>>
>>>>>>> Yeah, this is one of my common mistakes. I put in a pass initially, then add logging in front of it and forget to delete the pass. It's gone now.
>>>>>>>
>>>>>>>>
>>>>>>>> 5) In clean_ruv, I think allowing a --force option to bypass the user_input would be helpful (at least for test automation):
>>>>>>>>
>>>>>>>> +    if not ipautil.user_input("Continue to clean?", False):
>>>>>>>> +        sys.exit("Aborted")
>>>>>>>
>>>>>>> Yup, added.
>>>>>>>
>>>>>>> rob
>>>>>>
>>>>>> Slightly revised patch. I still had a window open with one unsaved change.
>>>>>>
>>>>>> rob
>>>>>>
>>>>>
>>>>> Apparently there were two unsaved changes, one of which was lost. This adds in the 'entry already exists' fix.
>>>>>
>>>>> rob
>>>>>
>>>>
>>>> Just one last thing (otherwise the patch is OK) - I don't think this is what we want :-)
>>>>
>>>> # ipa-replica-manage clean-ruv 8
>>>> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>>>>
>>>> Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
>>>> Continue to clean?
>>>> [no]: y <<<<<<
>>>> Aborted
>>>>
>>>> Nor this exception (you are checking for the wrong exception):
>>>>
>>>> # ipa-replica-manage clean-ruv 8
>>>> Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
>>>>
>>>> Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
>>>> Continue to clean? [no]:
>>>> unexpected error: This entry already exists
>>>>
>>>> This is the exception:
>>>>
>>>> Traceback (most recent call last):
>>>>   File "/sbin/ipa-replica-manage", line 651, in <module>
>>>>     main()
>>>>   File "/sbin/ipa-replica-manage", line 648, in main
>>>>     clean_ruv(realm, args[1], options)
>>>>   File "/sbin/ipa-replica-manage", line 373, in clean_ruv
>>>>     thisrepl.cleanallruv(ruv)
>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 1136, in cleanallruv
>>>>     self.conn.addEntry(e)
>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 503, in addEntry
>>>>     self.__handle_errors(e, arg_desc=arg_desc)
>>>>   File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 321, in __handle_errors
>>>>     raise errors.DuplicateEntry()
>>>> ipalib.errors.DuplicateEntry: This entry already exists
>>>>
>>>> Martin
>>>>
>>>
>>> Fixed that and a couple of other problems. When doing a disconnect we should not also call clean-ruv.
>>
>> Ah, good self-catch.
>>
>>>
>>> I also got tired of seeing crappy error messages so I added a little convert utility.
>>>
>>> rob
>>
>> 1) There is CLEANALLRUV stuff included in 1050-3 and not here. There are also some findings for this new code.
>>
>> 2) We may want to bump Requires to a higher version of 389-ds-base (389-ds-base-1.2.11.14-1) - it contains a fix for the CLEANALLRUV+winsync bug I found earlier.
>>
>> 3) I just discovered another suspicious behavior.
>> When we are deleting a master that also has links to other master(s), we delete those too. But we also automatically run CLEANALLRUV in those cases, so we may end up with multiple tasks being started on different masters - this does not look right.
>>
>> I think we may rather want to first delete all the links and then run the CLEANALLRUV task just once. This is what I get with the current code:
>>
>> # ipa-replica-manage del vm-072.idm.lab.bos.redhat.com
>> Directory Manager password:
>>
>> Deleting a master is irreversible.
>> To reconnect to the remote master you will need to prepare a new replica file and re-install.
>> Continue to delete? [no]: yes
>> ipa: INFO: Setting agreement cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
>> ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTovm-055.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config
>> ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
>> Background task created to clean replication data. This may take a while.
>> This may be safely interrupted with Ctrl+C
>>
>> ^CWait for task interrupted.
>> It will continue to run in the background
>>
>> Deleted replication agreement from 'vm-055.idm.lab.bos.redhat.com' to 'vm-072.idm.lab.bos.redhat.com'
>> ipa: INFO: Setting agreement cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
>> ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTovm-086.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config
>> ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
>> Background task created to clean replication data. This may take a while.
>> This may be safely interrupted with Ctrl+C
>>
>> ^CWait for task interrupted. It will continue to run in the background
>>
>> Deleted replication agreement from 'vm-086.idm.lab.bos.redhat.com' to 'vm-072.idm.lab.bos.redhat.com'
>> Failed to cleanup vm-072.idm.lab.bos.redhat.com DNS entries: NS record does not contain 'vm-072.idm.lab.bos.redhat.com.'
>> You may need to manually remove them from the tree
>>
>> Martin
>>
>
> All issues addressed and I pulled in abort-clean-ruv from 1050. I added a list-clean-ruv command as well.
>
> rob
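The "entry already exists" fix discussed above comes down to catching the wrapped IPA error rather than the raw python-ldap one: per the traceback, ipaldap's error handler converts ldap.ALREADY_EXISTS into ipalib.errors.DuplicateEntry before it reaches cleanallruv. A stand-alone sketch of the intended behavior — DuplicateEntry below is a local stand-in for ipalib.errors.DuplicateEntry, and the function names are illustrative:

```python
class DuplicateEntry(Exception):
    """Stand-in for ipalib.errors.DuplicateEntry, which ipaldap raises
    in place of the raw ldap.ALREADY_EXISTS error."""

def create_cleanallruv_task(add_entry):
    """Try to add the CLEANALLRUV task entry via the add_entry callable.

    If a task for this replica id is already queued, report that and
    bail out instead of waiting on (or re-raising over) a duplicate.
    """
    try:
        add_entry()
    except DuplicateEntry:
        return "CLEANALLRUV task for this replica id already exists, skipping"
    return "Background task created to clean replication data"
```

Catching the duplicate here also means the subsequent `checkTask(dn, dowait=True)` wait is never entered for a task this invocation did not create.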
1) Patch 1031-9 needs to get squashed with 1031-8.

2) The patch needs a rebase (conflict in freeipa.spec.in).

3) The new list-clean-ruv man entry is not right:

list-clean-ruv [REPLICATION_ID] - List all running CLEANALLRUV and abort CLEANALLRUV tasks.

REPLICATION_ID is not its argument. Btw. the new list-clean-ruv command proved very useful for me.

4) I just found out we need to do a better job with the make_readonly() step. I got into trouble when disconnecting one link to a remote replica, as it was marked read-only and I was then unable to manage the disconnected replica properly (vm-072 is the replica made read-only):

[root@vm-055 ~]# ipa-replica-manage disconnect vm-072.idm.lab.bos.redhat.com

[root@vm-072 ~]# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com
Deleting a master is irreversible.
To reconnect to the remote master you will need to prepare a new replica file and re-install.
Continue to delete? [no]: yes
Deleting replication agreements between vm-055.idm.lab.bos.redhat.com and vm-072.idm.lab.bos.redhat.com
ipa: INFO: Setting agreement cn=meTovm-072.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meTovm-072.idm.lab.bos.redhat.com,cn=replica,cn=dc\=idm\,dc\=lab\,dc\=bos\,dc\=redhat\,dc\=com,cn=mapping tree,cn=config
ipa: INFO: Replication Update in progress: FALSE: status: 0 Replica acquired successfully: Incremental update succeeded: start: 0: end: 0
Deleted replication agreement from 'vm-072.idm.lab.bos.redhat.com' to 'vm-055.idm.lab.bos.redhat.com'
Unable to remove replication agreement for vm-055.idm.lab.bos.redhat.com from vm-072.idm.lab.bos.redhat.com.
Background task created to clean replication data. This may take a while.
This may be safely interrupted with Ctrl+C
^CWait for task interrupted.
It will continue to run in the background
Failed to cleanup vm-055.idm.lab.bos.redhat.com entries: Server is unwilling to perform: database is read-only
arguments: dn=krbprincipalname=ldap/vm-055.idm.lab.bos.redhat....@idm.lab.bos.redhat.com,cn=services,cn=accounts,dc=idm,dc=lab,dc=bos,dc=redhat,dc=com
You may need to manually remove them from the tree
ipa: INFO: Unhandled LDAPError: {'info': 'database is read-only', 'desc': 'Server is unwilling to perform'}
Failed to cleanup vm-055.idm.lab.bos.redhat.com DNS entries: Server is unwilling to perform: database is read-only
You may need to manually remove them from the tree

--cleanup did not work for me either:

[root@vm-072 ~]# ipa-replica-manage del vm-055.idm.lab.bos.redhat.com --force --cleanup
Cleaning a master is irreversible. This should not normally be required, so use cautiously.
Continue to clean master? [no]: yes
unexpected error: Server is unwilling to perform: database is read-only
arguments: dn=krbprincipalname=ldap/vm-055.idm.lab.bos.redhat....@idm.lab.bos.redhat.com,cn=services,cn=accounts,dc=idm,dc=lab,dc=bos,dc=redhat,dc=com

Martin

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel