Re: [Freeipa-devel] [PATCH 0019] handle cleanRUV in the topology plugin
On 10/23/2015 06:39 AM, Ludwig Krispenz wrote:
On 10/23/2015 11:24 AM, thierry bordaz wrote:
On 10/23/2015 11:00 AM, thierry bordaz wrote:
On 10/12/2015 01:17 PM, Ludwig Krispenz wrote:
On 10/12/2015 12:44 PM, Martin Basti wrote:
On 23.07.2015 10:46, Ludwig Krispenz wrote:

The attached patch moves the cleaning of the RUV into the topology plugin. I encountered a problem when removing a replica which disconnects the topology, but it was fixed with my WIP for #5072. I want to keep these issues separate, so please review and test the patch and let me know about issues found.

Ludwig

Is this patch still valid and pending review?

It should still be valid and waiting for review; I wanted to rebase after the topology/promotion patches were checked in and then resend.

Hello Ludwig,

The patch looks good. I have a few minor remarks:

* Are the hostnames in the RUV always FQDNs? To retrieve the RUV element of a given host you use 'strstr'. If you have hosts vm-11 and vm-112, I wonder if it could pick up the wrong RUV element.
* In ipa_topo_util_cleanruv_element you need a pblock_done/free (or destroy).
* If it fails to add the clean-ruv task, you should log a message so that the admin knows what to do.

thanks
thierry

Hi Ludwig, I will address the points raised, thank you.

Additional question: cleanruv is done with 'replica-force-cleaning: yes'. Currently ipa-replica-manage does not implement this flag. Why do you use it in the topology plugin?

There are two potential problems with the cleanallruv task:
1] the rid could come back if not all servers were in sync, but with cleaning the changelog as part of cleanallruv I think the risk is low now
2] the cleanallruv task gets stuck waiting for the task to complete on other servers, even if they cannot be reached

The fix/RFE to cleanallruv that allows the force option to skip the online replica checks (https://fedorahosted.org/389/ticket/48218) has not been pushed yet. Currently it's set for 1.3.5. FYI,
Mark

I want to avoid 2], therefore I chose this setting.

My concern is that if we delete a host before all the updates from that host have been received, could we receive a late update that will recreate the RUV element?

thanks
thierry
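A rough sketch of the strstr concern raised above, with Python's substring test standing in for the plugin's C strstr. The RUV element strings below follow the usual "{replica <rid> ldap://<host>:<port>}" shape, with made-up hostnames; this is illustrative only, not the plugin's actual data:

    # Sketch of the substring-matching pitfall; "in" stands in for strstr.
    ruv_elements = [
        "{replica 4 ldap://vm-11.example.com:389}",
        "{replica 5 ldap://vm-112.example.com:389}",
    ]

    # Plain substring search finds both elements for host "vm-11",
    # because vm-11 is a prefix of vm-112:
    print([e for e in ruv_elements if "vm-11" in e])

    # Searching for the delimited "ldap://<fqdn>:" token is unambiguous:
    print([e for e in ruv_elements if "ldap://vm-11.example.com:" in e])

The second search only works if the hostnames in the RUV really are FQDNs followed by a port separator, which is exactly the question thierry is asking.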
Re: [Freeipa-devel] [PATCH 0291] Limit max age of replication changelog
On 07/20/2015 11:24 AM, Rob Crittenden wrote:

Martin Basti wrote:
https://fedorahosted.org/freeipa/ticket/5086
Patch attached.

Is this going to be a shock on upgrades for people who until now may be relying on the fact that there is no limit?

Just throwing my 2 cents in. The replication changelog is not something that can typically be used externally, unlike the retro changelog. It's really a black box to the outside world.

The risk of setting a changelog max age depends on how long any replica has been down. So if the max age is set to 7 days, and a replica has been down for more than 7 days, then when it comes online it will not be able to catch up with the other active replicas and it will need to be reinitialized.

Mark

Should there be a way for an admin to manage this, via the config module perhaps? IMHO this is a significant change and red flags need to be raised so users are aware of it.

rob
Re: [Freeipa-devel] [PATCH 0291] Limit max age of replication changelog
On 07/20/2015 12:50 PM, Martin Basti wrote:
On 20/07/15 17:48, Petr Vobornik wrote:
On 07/20/2015 05:24 PM, Rob Crittenden wrote:

Martin Basti wrote:
https://fedorahosted.org/freeipa/ticket/5086
Patch attached.

Is this going to be a shock on upgrades for people who until now may be relying on the fact that there is no limit?

Not making any point, but I have to note: Ludwig raised the question on the users list but there was no feedback from users.
https://www.redhat.com/archives/freeipa-users/2015-July/msg00022.html

Should there be a way for an admin to manage this, via the config module perhaps? IMHO this is a significant change and red flags need to be raised so users are aware of it.

rob

IIUC the purge delay is 7 days, so if the changelog max age is 7 or more days, it will not break replication. The issue is if somebody uses the changelog for a different purpose, right?

Well, the replication changelog cannot be used for anything else but the multimaster replication plugin. If a customer increased the replication purge delay you could potentially run into issues, but again this only comes into play when a replica is down for a very long time. I'm not sure if IPA even provides the option to adjust the replication purge delay, but that doesn't mean a customer cannot adjust these settings on their own.

Mark
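For reference, the two knobs being weighed against each other here can be set directly in the directory configuration. A minimal ldapmodify sketch, assuming a placeholder suffix of dc=example,dc=com; the attribute names are the standard 389-ds-base ones, but verify them against your version:

    # Cap the replication changelog age at 7 days (sketch only):
    dn: cn=changelog5,cn=config
    changetype: modify
    replace: nsslapd-changelogmaxage
    nsslapd-changelogmaxage: 7d

    # The purge delay lives on the replica entry; the value is in
    # seconds, and "dc=example,dc=com" is a placeholder suffix:
    dn: cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
    changetype: modify
    replace: nsDS5ReplicaPurgeDelay
    nsDS5ReplicaPurgeDelay: 604800

The safety condition discussed above is simply that the changelog max age should not be shorter than the purge delay (both default to 7 days in this discussion), otherwise a replica that is down longer than the max age must be reinitialized.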
Re: [Freeipa-devel] Update: Re: Fedora 20 Release
On 12/17/2013 11:35 AM, Rich Megginson wrote:
On 12/16/2013 08:07 AM, Petr Spacek wrote:

Hello list,

we have to decide what we will do with the 389-ds-base package in Fedora 20. Currently, we know about the following problems:

Schema problems:
https://fedorahosted.org/389/ticket/47631 (regression) Fixed.

Referential Integrity:
https://fedorahosted.org/389/ticket/47621 (new functionality)
https://fedorahosted.org/389/ticket/47624 (regression) Fixed.

Replication:
https://fedorahosted.org/389/ticket/47632 (?) Cannot reproduce. Closed as WORKSFORME.

Stability:
https://bugzilla.redhat.com/show_bug.cgi?id=1041732 Fixed.
https://fedorahosted.org/389/ticket/47629 (we are not sure if the syncrepl really plays some role or not) We are still trying to determine the cause, and whether this is related to the use of syncrepl. If it turns out to be related to syncrepl, I would like to release 1.3.2.9 in F20 and just disable the use of syncrepl in 389 clients. Is everyone ok with this?

Rich

I found a crash in 1.3.2 and 1.3.1. This should go into 1.3.2.9 (or a 1.3.2.10).

One option is to fix 1.3.2.x as quickly as possible. Another option is to build 1.3.1.x for F20 with Epoch == 1 and release it as quickly as possible.

The problem with a downgrade to 1.3.1.x is that it requires a manual change in the dse.ldif file. You have to disable the 'content synchronization' (syncrepl) and 'whoami' plugins, which are not in the 1.3.1.x packages but were added and enabled by the 1.3.2.x packages. In our tests, the downgraded DS server starts and works after the manual dse.ldif correction (but be careful - we didn't test replication).

Here is the main problem: 389-ds-base 1.3.2.8 is baked into the Fedora 20 ISO images and there is no way to replace it there. It means that somebody can do an F19-F20 upgrade from an ISO, and *then* an upgrade from the repos will break their DS configuration (because of the new plugins...). Simo thinks that this is why a 'downgrade package' with 1.3.1.x inevitably needs an automated script which will purge the two missing plugins from dse.ldif.

Nathan, is it manageable before Christmas, one way or the other? Do you think the downgrade is safe from a data format perspective? (I mean DB format upgrades etc.)

-- 
Mark Reynolds
389 Development Team
Red Hat, Inc
mreyno...@redhat.com
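For anyone attempting the manual dse.ldif correction described above, a hedged sketch: with the server stopped, delete the two plugin entries (the entire blocks, not just these lines) from the instance's dse.ldif. The DNs below are the standard 389 plugin entry locations, but verify them against your own file before editing:

    # Sketch only -- with the server stopped, remove the whole entries
    # starting at these DNs from /etc/dirsrv/slapd-<instance>/dse.ldif:
    dn: cn=Content Synchronization,cn=plugins,cn=config
    dn: cn=whoami,cn=plugins,cn=config

Removing the entries outright (rather than setting them disabled) matches the "purge" approach mentioned above, since the 1.3.1.x packages do not ship the plugin modules at all.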
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
not contain 'vm-072.idm.lab.bos.redhat.com.' You may need to manually remove them from the tree.

I think it should be enough to just catch the 'entry already exists' error in the cleanallruv function, and in such a case print a relevant error message and bail out. Thus, self.conn.checkTask(dn, dowait=True) would not be called either.

Good catch, fixed.

4) (minor): In the make_readonly function, there is a redundant pass statement:

+def make_readonly(self):
+    """
+    Make the current replication agreement read-only.
+    """
+    dn = DN(('cn', 'userRoot'), ('cn', 'ldbm database'),
+            ('cn', 'plugins'), ('cn', 'config'))
+
+    mod = [(ldap.MOD_REPLACE, 'nsslapd-readonly', 'on')]
+    try:
+        self.conn.modify_s(dn, mod)
+    except ldap.INSUFFICIENT_ACCESS:
+        # We can't make the server we're removing read-only but
+        # this isn't a show-stopper
+        root_logger.debug("No permission to switch replica to read-only, continuing anyway")
+        pass

Yeah, this is one of my common mistakes. I put in a pass initially, then add logging in front of it and forget to delete the pass. It's gone now.

5) In clean_ruv, I think allowing a --force option to bypass the user_input would be helpful (at least for test automation):

+if not ipautil.user_input("Continue to clean?", False):
+    sys.exit("Aborted")

Yup, added.

rob

Slightly revised patch. I still had a window open with one unsaved change.

rob

Apparently there were two unsaved changes, one of which was lost. This adds in the 'entry already exists' fix.

rob

Just one last thing (otherwise the patch is OK) - I don't think this is what we want :-)

# ipa-replica-manage clean-ruv 8
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
Continue to clean? [no]: y
Aborted

Nor this exception (you are checking for the wrong exception):

# ipa-replica-manage clean-ruv 8
Clean the Replication Update Vector for vm-055.idm.lab.bos.redhat.com:389
Cleaning the wrong replica ID will cause that server to no longer replicate so it may miss updates while the process is running. It would need to be re-initialized to maintain consistency. Be very careful.
Continue to clean? [no]: unexpected error: This entry already exists

This is the exception:

Traceback (most recent call last):
  File "/sbin/ipa-replica-manage", line 651, in <module>
    main()
  File "/sbin/ipa-replica-manage", line 648, in main
    clean_ruv(realm, args[1], options)
  File "/sbin/ipa-replica-manage", line 373, in clean_ruv
    thisrepl.cleanallruv(ruv)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 1136, in cleanallruv
    self.conn.addEntry(e)
  File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 503, in addEntry
    self.__handle_errors(e, arg_desc=arg_desc)
  File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 321, in __handle_errors
    raise errors.DuplicateEntry()
ipalib.errors.DuplicateEntry: This entry already exists

Martin

On another matter, I just noticed that CLEANRUV is not proceeding if I have a winsync replica defined (and it is even up):

# ipa-replica-manage list
dc.ad.test: winsync
vm-072.idm.lab.bos.redhat.com: master
vm-086.idm.lab.bos.redhat.com: master

[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to receive all the deleted replica updates...
[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 10 seconds
[06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 20 seconds
[06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 40 seconds

I don't think this is OK. Adding Rich to CC to follow up on this one.

Martin

And now the actual CC.

Martin

Yeah - looks like CLEANALLRUV needs to ignore windows sync agreements.

Yeah, sorry, not that familiar with winsync (didn't know it used the same repl agmts). I will have to do another fix...

-- 
Mark Reynolds
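A minimal sketch of the exception fix Martin is asking for above: catch the exception the traceback shows is actually raised (ipalib.errors.DuplicateEntry) around the task add. The names thisrepl and ruv follow the traceback, and clean_ruv_safely is a hypothetical wrapper; treat this as illustrative, not the final patch:

    # Illustrative only: thisrepl is the ReplicationManager from
    # clean_ruv() and ruv is the replica ID, per the traceback above.
    import sys
    from ipalib import errors

    def clean_ruv_safely(thisrepl, ruv):
        try:
            thisrepl.cleanallruv(ruv)   # adds the CLEANALLRUV task entry
        except errors.DuplicateEntry:
            # The task entry already exists, i.e. a clean run for this
            # replica ID is already in progress; report it and bail out
            # instead of dying with "unexpected error".
            sys.exit("A CLEANALLRUV task for this replica ID already exists")

This also avoids ever reaching self.conn.checkTask(dn, dowait=True) for a task that was never added, as requested earlier in the review.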
Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task
to winsync agreement?

It might, but I will have to check for winsync agreements and ignore them. So it should not be an issue moving forward.

Martin

-- 
Mark Reynolds
Senior Software Engineer
Red Hat, Inc
mreyno...@redhat.com
[Freeipa-devel] please review ticket #337 - improve CLEANRUV functionality
https://fedorahosted.org/389/ticket/337
https://fedorahosted.org/389/attachment/ticket/337/0001-Ticket-337-RFE-Improve-CLEANRUV-functionality.patch

Previously, the steps to remove a replica and its RUV were problematic. I created two new tasks to take care of the entire replication environment.

[1] The new task CLEANALLRUV <rid> - run it once on any master:

* This marks the rid as invalid. It is used to reject updates to the changelog and the database RUV.
* It then sends a CLEANRUV extended operation to each agreement.
* Then it cleans its own RUV.
* The CLEANRUV extended op then triggers that replica to send the same CLEANRUV extop to its replicas, then it cleans the rid from its own RUV. Basically, this operation cascades through the entire replication environment.

[2] The RELEASERUV <rid> task - run it once on any master:

* Once the RUVs have been cleaned on all the replicas, you need to release the rid so that it can be reused. This operation also cascades through the entire replication environment. It also triggers changelog trimming.

For all of this to work correctly, there is a list of steps that needs to be followed. The procedure is attached to the ticket:

https://fedorahosted.org/389/attachment/ticket/337/cleanruv-proceedure

Thanks,
Mark
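For context, the form of this that the other threads above exercise is driven through the cn=tasks interface: the replication.py code in the earlier traceback kicks off a clean by adding a task entry, and 'replica-force-cleaning' is the flag debated in the topology-plugin thread. A hedged sketch of such a task entry, with a placeholder suffix and rid; check the attribute names against your 389-ds-base version:

    # Sketch only: start CLEANALLRUV for rid 8 by adding a task entry.
    # "dc=example,dc=com" and the rid value are placeholders.
    dn: cn=clean 8,cn=cleanallruv,cn=tasks,cn=config
    changetype: add
    objectclass: extensibleObject
    cn: clean 8
    replica-base-dn: dc=example,dc=com
    replica-id: 8
    replica-force-cleaning: no

Setting replica-force-cleaning to yes skips the wait for unreachable replicas, which is exactly the trade-off (stuck task vs. a rid that could come back) discussed in the first thread above.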