Re: [Freeipa-devel] [PATCH 0019] handle cleanRUV in the topology plugin

2015-10-26 Thread Mark Reynolds



On 10/23/2015 06:39 AM, Ludwig Krispenz wrote:


On 10/23/2015 11:24 AM, thierry bordaz wrote:

On 10/23/2015 11:00 AM, thierry bordaz wrote:

On 10/12/2015 01:17 PM, Ludwig Krispenz wrote:


On 10/12/2015 12:44 PM, Martin Basti wrote:



On 23.07.2015 10:46, Ludwig Krispenz wrote:
The attached patch moves the cleaning of the RUV into the 
topology plugin.


I encountered a problem when removing a replica, which 
disconnects the topology, but it was fixed with my WIP for #5072.


I want to keep these issues separate, so please review and test 
the patch and let me know about issues found


Ludwig




Is this patch still valid and pending review?
It should still be valid, waiting for review; I wanted to rebase
after the topology/promotion patches have been checked in and resend.





Hello Ludwig,

The patch looks good. I have a few minor remarks:

  * Are the hostnames in the RUV always FQDNs? To retrieve the RUV
element of a given host you use 'strstr'.
If you have hosts vm-11 and vm-112, I wonder if it could pick up
the wrong RUV element (see the sketch after this list).
  * In ipa_topo_util_cleanruv_element you need a pblock_done/free
(or destroy).
  * If it fails to add the clean-ruv task, you should log a message
so that the admin knows what to do.
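
(For illustration only: a minimal Python sketch of matching the RUV element
on the exact host rather than on a substring; the element format and the
example hostnames beyond vm-11/vm-112 are assumptions.)

import re

# Hypothetical nsds50ruv values; the format assumed here is the usual
# "{replica <rid> ldap://<host>:<port>} ..." prefix.
ruv_elements = [
    "{replica 4 ldap://vm-11.example.com:389} 55aa... 55bb...",
    "{replica 5 ldap://vm-112.example.com:389} 55cc... 55dd...",
]

def find_ruv_element(elements, host):
    # A plain substring test (what strstr does) would match "vm-11"
    # inside "vm-112"; parsing the host out of the element and
    # comparing it exactly avoids picking up the wrong element.
    pattern = re.compile(r"\{replica\s+\d+\s+ldap://([^:}]+)")
    for element in elements:
        match = pattern.search(element)
        if match and match.group(1) == host:
            return element
    return None

print(find_ruv_element(ruv_elements, "vm-11.example.com"))
# prints only the vm-11 element, never the vm-112 one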

thanks
thierry




Hi Ludwig,


I will address the points raised, thank you.
Additional question: cleanruv is done with 'replica-force-cleaning:
yes'. Currently ipa-replica-manage does not implement this flag.

Why do you use it in the topology plugin?

There are two potential problems with the cleanallruv task:
1] the rid could come back if not all servers were in sync, but with
cleaning the changelog as part of cleanallruv I think the risk is low now
2] the cleanallruv task gets stuck waiting for the task to complete on
other servers even if they cannot be reached
The fix/RFE to cleanallruv that allows the force option to skip the
online replica checks (https://fedorahosted.org/389/ticket/48218) has not
been pushed yet.  Currently it's set for 1.3.5.
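
(For reference, the force flag under discussion is set on the cleanAllRUV
task entry itself. A hedged python-ldap sketch follows; the task container
and attribute names — replica-base-dn, replica-id, replica-force-cleaning —
reflect my understanding of 389-ds, and the connection details and suffix
are made up.)

import ldap
import ldap.modlist

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=Directory Manager", "password")

task_dn = "cn=clean 8,cn=cleanallruv,cn=tasks,cn=config"
attrs = {
    "objectClass": [b"top", b"extensibleObject"],
    "cn": [b"clean 8"],
    "replica-base-dn": [b"dc=example,dc=com"],
    "replica-id": [b"8"],
    # Skip the online-replica checks so the task cannot hang waiting
    # for servers that are unreachable (problem 2] above).
    "replica-force-cleaning": [b"yes"],
}
conn.add_s(task_dn, ldap.modlist.addModlist(attrs))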


FYI,
Mark


I want to avoid 2], therefore I chose this setting.
My concern is that if we delete a host before all the updates from
that host have been received, could we receive a late update that will
recreate the RUV element?


thanks
thierry







Re: [Freeipa-devel] [PATCH 0291] Limit max age of replication changelog

2015-07-20 Thread Mark Reynolds



On 07/20/2015 11:24 AM, Rob Crittenden wrote:

Martin Basti wrote:

https://fedorahosted.org/freeipa/ticket/5086

Patch attached.


Is this going to be a shock on upgrades for people who until now may
be relying on the fact that there is no limit?

Just throwing my 2 cents in.  The replication changelog is not something
that can typically be used externally, unlike the retro changelog.  It's
really a black box to the outside world.  The risk of setting a changelog
max age depends on how long any replica has been down for.  So if the
max age is set to 7 days, and a replica has been down for more than 7
days, then when it comes back online it will not be able to catch up with
the other active replicas and will need to be reinitialized.
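
(For concreteness, a hedged python-ldap sketch of setting such a limit;
cn=changelog5,cn=config and nsslapd-changelogmaxage reflect my understanding
of 389-ds, the connection details are made up, and the actual patch may set
this differently.)

import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=Directory Manager", "password")

# Cap the replication changelog at 7 days of changes.
conn.modify_s(
    "cn=changelog5,cn=config",
    [(ldap.MOD_REPLACE, "nsslapd-changelogmaxage", b"7d")],
)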


Mark


Should there be a way for an admin to manage this, via the config 
module perhaps?


IMHO this is a significant change and red flags need to be raised so 
users are aware of it.


rob





Re: [Freeipa-devel] [PATCH 0291] Limit max age of replication changelog

2015-07-20 Thread Mark Reynolds



On 07/20/2015 12:50 PM, Martin Basti wrote:

On 20/07/15 17:48, Petr Vobornik wrote:

On 07/20/2015 05:24 PM, Rob Crittenden wrote:

Martin Basti wrote:

https://fedorahosted.org/freeipa/ticket/5086

Patch attached.


Is this going to be a shock on upgrades for people who until now may be
relying on the fact that there is no limit?


Not making any point, but I have to note: Ludwig raised a question on
the users list but there was no feedback from users.


https://www.redhat.com/archives/freeipa-users/2015-July/msg00022.html



Should there be a way for an admin to manage this, via the config
module perhaps?

IMHO this is a significant change and red flags need to be raised so
users are aware of it.

rob






IIUC there is a purge delay of 7 days, so if the changelog max age is 7 or
more days, it will not break replication.

The issue is if somebody uses the changelog for a different purpose, right?

Well, the replication changelog cannot be used for anything else but the
multimaster replication plugin.  If a customer increased the replication
purge delay you could potentially run into issues, but again this only
comes into play when a replica is down for a very long time.  I'm not
sure if IPA even provides an option to adjust the replication purge
delay, but that doesn't mean a customer cannot adjust these settings on
their own.
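
(To make the interplay concrete, a hedged Python sketch that reads both
settings and warns when the changelog max age is shorter than the purge
delay; the DNs, connection details, and attribute names — nsDS5ReplicaPurgeDelay,
nsslapd-changelogmaxage — are assumptions based on my understanding of 389-ds.)

import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=Directory Manager", "password")

replica_dn = "cn=replica,cn=dc\\3Dexample\\2Cdc\\3Dcom,cn=mapping tree,cn=config"
changelog_dn = "cn=changelog5,cn=config"

def get_value(dn, attr, default):
    entry = conn.search_s(dn, ldap.SCOPE_BASE, attrlist=[attr])[0][1]
    for name, values in entry.items():
        if name.lower() == attr.lower() and values:
            return values[0].decode()
    return default

# The purge delay is stored in seconds (default 604800 = 7 days);
# the changelog max age uses a duration string such as "7d".
purge_delay = int(get_value(replica_dn, "nsDS5ReplicaPurgeDelay", "604800"))
max_age = get_value(changelog_dn, "nsslapd-changelogmaxage", "7d")
max_age_secs = int(max_age[:-1]) * 86400 if max_age.endswith("d") else int(max_age)

if max_age_secs < purge_delay:
    print("Warning: changelog max age is shorter than the purge delay; "
          "a replica that is down longer than the max age will need "
          "to be reinitialized.")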


Mark



Re: [Freeipa-devel] Update: Re: Fedora 20 Release

2013-12-17 Thread Mark Reynolds


On 12/17/2013 11:35 AM, Rich Megginson wrote:

On 12/16/2013 08:07 AM, Petr Spacek wrote:

Hello list,

we have to decide what we will do with the 389-ds-base package in Fedora 20.

Currently, we know about the following problems:

Schema problems:
   https://fedorahosted.org/389/ticket/47631 (regression)


Fixed.



Referential Integrity:
   https://fedorahosted.org/389/ticket/47621 (new functionality)
   https://fedorahosted.org/389/ticket/47624 (regression)

Fixed.


Replication:
   https://fedorahosted.org/389/ticket/47632 (?)


Cannot reproduce.  Closed as WORKSFORME.



Stability:
   https://bugzilla.redhat.com/show_bug.cgi?id=1041732

Fixed.
https://fedorahosted.org/389/ticket/47629 (we are not sure whether
syncrepl really plays a role or not)


We are still trying to determine the cause, and if this is related to 
the use of syncrepl.  If it turns out to be related to syncrepl, I 
would like to release 1.3.2.9 in F20, and just disable the use of 
syncrepl in 389 clients.


Is everyone ok with this?

Rich, I found a crash in 1.3.2 and 1.3.1.  This should go into 1.3.2.9 (or
a 1.3.2.10).


One option is to fix 1.3.2.x as quickly as possible.

Another option is to build 1.3.1.x for F20 with Epoch == 1 and 
release it as quickly as possible.


The problem with downgrading to 1.3.1.x is that it requires a manual
change in the dse.ldif file. You have to disable the 'content
synchronization' (syncrepl) and 'whoami' plugins, which are not in the
1.3.1.x packages but were added and enabled by the 1.3.2.x packages.


In our tests, the downgraded DS server starts and works after the manual
dse.ldif correction (but be careful - we didn't test replication).


Here is the main problem:
389-ds-base 1.3.2.8 is baked into the Fedora 20 ISO images and there is no
way to replace it there. This means that somebody can do an F19-F20
upgrade from ISO, and *then* an upgrade from the repos will break their DS
configuration (because of the new plugins...).


Simo thinks that this is the reason why a 'downgrade package' with
1.3.1.x inevitably needs an automated script which will purge the two
missing plugins from dse.ldif (a rough sketch of such a script is below).
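
(A rough sketch of what such a purge script could look like, assuming the
two entries are cn=Content Synchronization,cn=plugins,cn=config and
cn=whoami,cn=plugins,cn=config; the DNs and paths are assumptions, the
server must be stopped first, and LDIF line folding is ignored for brevity.)

DROP_DNS = {
    "cn=content synchronization,cn=plugins,cn=config",
    "cn=whoami,cn=plugins,cn=config",
}

def purge_plugins(src_path, dst_path):
    # dse.ldif entries are separated by blank lines and start with a
    # "dn:" line; copy everything except the entries listed above.
    with open(src_path) as src:
        blocks = src.read().split("\n\n")
    kept = []
    for block in blocks:
        first = block.lstrip().splitlines()[0] if block.strip() else ""
        if first.lower().startswith("dn:"):
            dn = first[3:].strip().lower()
            if dn in DROP_DNS:
                continue  # drop this plugin entry entirely
        kept.append(block)
    with open(dst_path, "w") as dst:
        dst.write("\n\n".join(kept))

# Example (hypothetical paths):
#   purge_plugins("/etc/dirsrv/slapd-EXAMPLE/dse.ldif", "dse.ldif.new")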


Nathan, is it manageable before Christmas, one way or the other? Do you
think that the downgrade is safe from a data format perspective? (I
mean DB format upgrades, etc.)






--
Mark Reynolds
389 Development Team
Red Hat, Inc
mreyno...@redhat.com



Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task

2012-09-06 Thread Mark Reynolds
 not
contain 'vm-072.idm.lab.bos.redhat.com.'
You may need to manually remove them from the tree


I think it should be enough to just catch 'entry already exists' in the
cleanallruv function, and in such case print a relevant error message and
bail out.
Thus, self.conn.checkTask(dn, dowait=True) would not be called
either.

Good catch, fixed.



4) (minor): In the make_readonly function, there is a redundant pass
statement:

+    def make_readonly(self):
+        """
+        Make the current replication agreement read-only.
+        """
+        dn = DN(('cn', 'userRoot'), ('cn', 'ldbm database'),
+                ('cn', 'plugins'), ('cn', 'config'))
+
+        mod = [(ldap.MOD_REPLACE, 'nsslapd-readonly', 'on')]
+        try:
+            self.conn.modify_s(dn, mod)
+        except ldap.INSUFFICIENT_ACCESS:
+            # We can't make the server we're removing read-only but
+            # this isn't a show-stopper
+            root_logger.debug("No permission to switch replica to "
+                              "read-only, continuing anyway")
+            pass

Yeah, this is one of my common mistakes. I put in a pass initially, then
add logging in front of it and forget to delete the pass. It's gone now.




5) In clean_ruv, I think allowing a --force option to bypass the
user_input would be helpful (at least for test automation):

+        if not ipautil.user_input("Continue to clean?", False):
+            sys.exit("Aborted")

Yup, added.

rob
Slightly revised patch. I still had a window open with one 
unsaved change.


rob

Apparently there were two unsaved changes, one of which was lost.
This adds in the 'entry already exists' fix.

rob

Just one last thing (otherwise the patch is OK) - I don't think
this is what we want :-)

# ipa-replica-manage clean-ruv 8
Clean the Replication Update Vector for 
vm-055.idm.lab.bos.redhat.com:389


Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]: y
Aborted
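
(For reference, the behavior Martin expects might look something like the
sketch below; confirm_clean and the --force handling are hypothetical
illustrations, not ipautil's actual API.)

import sys

def confirm_clean(force=False):
    # Hypothetical helper: with --force the prompt is skipped entirely,
    # and 'y' as well as 'yes' counts as confirmation, so answering
    # "y" no longer aborts the clean.
    if force:
        return True
    answer = input("Continue to clean? [no]: ").strip().lower()
    return answer in ("y", "yes")

if not confirm_clean(force="--force" in sys.argv):
    sys.exit("Aborted")
print("Cleaning the RUV...")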


Nor this exception (you are checking for the wrong exception):

# ipa-replica-manage clean-ruv 8
Clean the Replication Update Vector for 
vm-055.idm.lab.bos.redhat.com:389


Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Continue to clean? [no]:
unexpected error: This entry already exists

This is the exception:

Traceback (most recent call last):
  File "/sbin/ipa-replica-manage", line 651, in <module>
    main()
  File "/sbin/ipa-replica-manage", line 648, in main
    clean_ruv(realm, args[1], options)
  File "/sbin/ipa-replica-manage", line 373, in clean_ruv
    thisrepl.cleanallruv(ruv)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 1136, in cleanallruv
    self.conn.addEntry(e)
  File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 503, in addEntry
    self.__handle_errors(e, arg_desc=arg_desc)
  File "/usr/lib/python2.7/site-packages/ipaserver/ipaldap.py", line 321, in __handle_errors
    raise errors.DuplicateEntry()
ipalib.errors.DuplicateEntry: This entry already exists
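
(The traceback shows ipalib.errors.DuplicateEntry escaping from the addEntry
call, so the fix Martin asks for amounts to catching that specific exception
there. A hypothetical wrapper illustrating the idea; the function name and
message wording are made up.)

from ipalib import errors

def add_cleanallruv_task(conn, task_entry):
    # Catch the DuplicateEntry error that ipaldap raises when the task
    # entry already exists, print a relevant message, and bail out so
    # that checkTask(dn, dowait=True) is never reached.
    try:
        conn.addEntry(task_entry)
    except errors.DuplicateEntry:
        print("A CLEANALLRUV task for this replica ID is already running")
        return False
    return True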

Martin



On another matter, I just noticed that CLEANRUV is not proceeding if
I have a winsync replica defined (and it is even up):

# ipa-replica-manage list
dc.ad.test: winsync
vm-072.idm.lab.bos.redhat.com: master
vm-086.idm.lab.bos.redhat.com: master

[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Waiting for all the replicas to receive all the deleted replica updates...
[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:10 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 10 seconds
[06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:20 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 20 seconds
[06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Failed to contact agmt (agmt=cn=meTodc.ad.test (dc:389)) error (10), will retry later.
[06/Sep/2012:11:59:40 -0400] NSMMReplicationPlugin - CleanAllRUV Task: Not all replicas caught up, retrying in 40 seconds

I don't think this is OK. Adding Rich to CC to follow on this one.

Martin


And now the actual CC.
Martin


Yeah - looks like CLEANALLRUV needs to ignore Windows sync agreements.
Yeah, sorry, not that familiar with winsync (didn't know it used the same
repl agmts).  I will have to do another fix...
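
(The real fix belongs in the server plugin, but for illustration: winsync
agreements can be told apart from ordinary replication agreements by their
objectclass. A hedged python-ldap sketch; the objectclass names reflect my
understanding of 389-ds and the connection details are made up.)

import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=Directory Manager", "password")

agreements = conn.search_s(
    "cn=config", ldap.SCOPE_SUBTREE,
    "(|(objectclass=nsds5ReplicationAgreement)"
    "(objectclass=nsDSWindowsReplicationAgreement))",
    ["objectClass"],
)
for dn, attrs in agreements:
    ocs = {oc.decode().lower()
           for oc in attrs.get("objectClass", attrs.get("objectclass", []))}
    if "nsdswindowsreplicationagreement" in ocs:
        # Winsync agreement: CLEANALLRUV should skip it rather than
        # wait for the AD side to "catch up".
        continue
    print("would wait on:", dn)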




--
Mark Reynolds

Re: [Freeipa-devel] [PATCH] 1031 run cleanallruv task

2012-09-06 Thread Mark Reynolds
to winsync agreement?
It might, but I will have to check for winsync agreements and ignore 
them.  So it should not be an issue moving forward.


Martin



--
Mark Reynolds
Senior Software Engineer
Red Hat, Inc
mreyno...@redhat.com



[Freeipa-devel] please review ticket #337 - improve CLEANRUV functionality

2012-04-23 Thread Mark Reynolds

https://fedorahosted.org/389/ticket/337

https://fedorahosted.org/389/attachment/ticket/337/0001-Ticket-337-RFE-Improve-CLEANRUV-functionality.patch

Previously the steps to remove a replica and its RUV were problematic. I
created two new tasks to take care of the entire replication environment.


[1] The new task CLEANALLRUV<rid> - run it once on any master

 * This marks the rid as invalid; it is used to reject updates to the
   changelog and to the database RUV.
 * It then sends a CLEANRUV extended operation to each agreement.
 * Then it cleans its own RUV.

 * The CLEANRUV extended op then triggers that replica to send the same
   CLEANRUV extop to its replicas, then it cleans its own RID.
   Basically this operation cascades through the entire replication
   environment.

[2] The RELEASERUV<rid> task - run it once on any master

 * Once the RUVs have been cleaned on all the replicas, you need to
   release the rid so that it can be reused. This operation also
   cascades through the entire replication environment. This also
   triggers changelog trimming.

For all of this to work correctly, there is a list of steps that needs 
to be followed. This procedure is attached to the ticket.


https://fedorahosted.org/389/attachment/ticket/337/cleanruv-proceedure
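
(For illustration, a hypothetical python-ldap sketch of kicking off the two
tasks, assuming they are triggered like the older CLEANRUV task by writing an
nsds5task value to the replica entry; the attached procedure is authoritative,
and the DN, rid, and connection details below are made up.)

import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=Directory Manager", "password")

replica_dn = "cn=replica,cn=dc\\3Dexample\\2Cdc\\3Dcom,cn=mapping tree,cn=config"
rid = 8

# [1] Clean the rid everywhere; this cascades through the topology.
conn.modify_s(replica_dn,
              [(ldap.MOD_REPLACE, "nsds5task", b"CLEANALLRUV%d" % rid)])

# [2] Once every replica's RUV is clean, release the rid for reuse.
conn.modify_s(replica_dn,
              [(ldap.MOD_REPLACE, "nsds5task", b"RELEASERUV%d" % rid)])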

Thanks,
Mark