Re: [Freeipa-users] stubborn old replicas

Ludwig Krispenz Wed, 02 Sep 2015 05:59:47 -0700

Hi Janelle,
On 09/01/2015 06:17 PM, Janelle wrote:

On 8/28/15 8:17 AM, Vaclav Adamec wrote:

You could try this (the RH-recommended way). It works better for me than cleanallruv.pl, as that sometimes leads to an LDAP freeze.

unable to decode: {replica 30} 5548fa200000001e0000 5548fa200000001e0000
unable to decode: {replica 26} 5548a9a80000001a0000 5548a9a80000001a0000


for all of them, one by one:

ldapmodify -x -D "cn=directory manager" -w XXXXXXX
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV30
<enter> + <Ctrl + D>
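Since the modify above has to be repeated for each replica ID, a small helper can generate the LDIF records. This is only a sketch: the function names are made up for illustration, and it assumes the DN escaping (`\3D` for `=`, `\2C` for `,`) and the `CLEANRUV<id>` task value are exactly as shown in the message above.

```python
def escape_suffix(suffix: str) -> str:
    """Escape '=' and ',' the way the mapping-tree DN in the message does."""
    return suffix.replace("=", "\\3D").replace(",", "\\2C")


def cleanruv_ldif(suffix: str, replica_id: int) -> str:
    """Build one LDIF modify record that sets the CLEANRUV task for replica_id."""
    dn = f"cn=replica,cn={escape_suffix(suffix)},cn=mapping tree,cn=config"
    return "\n".join([
        f"dn: {dn}",
        "changetype: modify",
        "replace: nsds5task",
        f"nsds5task: CLEANRUV{replica_id}",
        "",  # trailing newline terminates the record
    ])


if __name__ == "__main__":
    # Replica IDs taken from the "unable to decode" lines quoted above.
    for rid in (30, 26):
        print(cleanruv_ldif("dc=example,dc=com", rid))
```

The printed records can be reviewed first and then fed to ldapmodify on standard input, one server at a time.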

On Fri, Aug 28, 2015 at 4:55 PM, Guillermo Fuentes <guillermo.fuen...@modernizingmedicine.com> wrote:


    Hi Janelle,

    Using the cleanallruv.pl tool was the only way I was able to get
    rid of the "unable to decode: {replica x}" entries.

    This is how I used it, cleaning a replica ID at a time:
    # For replica id: 40
    cleanallruv.pl -v -D "cn=directory manager" -w - -b 'dc=example,dc=com' -r 40

    Note that the "-w -" will make the tool prompt you for the
    directory manager password.

    Hope this helps,
    Guillermo


    On Thu, Aug 27, 2015 at 10:27 AM, Janelle
    <janellenicol...@gmail.com> wrote:

        On 8/27/15 1:05 AM, thierry bordaz wrote:

        On 08/27/2015 09:41 AM, Ludwig Krispenz wrote:


        On 08/27/2015 09:08 AM, Martin Kosek wrote:

        On 08/26/2015 05:31 PM, Simo Sorce wrote:

        On Wed, 2015-08-26 at 06:36 -0700, Janelle wrote:

        Hello all,

        My biggest problem is losing replicas and then trying to delete the
        entries and rebuild them. Here is a perfect example: I simply can't
        get rid of these (see below). I have tried (of course after the
        ORIGINAL "ipa-replica-manage del hostname --force --clean"):

        ipa-replica-manage clean-ruv 25

        ldapmodify... with:
            dn: cn=clean 25, cn=cleanallruv, cn=tasks, cn=config
            objectclass: extensibleObject
            replica-base-dn: dc=example,dc=com
            replica-id: 25
            cn: clean 25

        And yet nothing works. Any suggestions? This is perhaps the most
        frustrating part about maintaining IPA.

        ~J

        unable to decode: {replica 12} 5588dc2e0000000c0000 559f3de60004000c0000
        unable to decode: {replica 14} 5587aa8d0000000e0000 5587aa8d0003000e0000
        unable to decode: {replica 16} 5588f58f000000100000 55bb7b08000500100000
        unable to decode: {replica 25} 55a4887b000000190000 55a49242000400190000
        unable to decode: {replica 29} 55d199a50001001d0000 55d199a50001001d0000
        unable to decode: {replica 3} 5587c5c3000000030000 55b8a049000100030000
        unable to decode: {replica 5} 55cc82ab041d00050000 55cc82ab041d00050000

        Have you tried restarting DS before trying to clean the RUV?

        I ran into a similar problem in a test install recently, and I got
        better results that way. The bug is known to the DS people and they
        are working to get out patches that fix the root issue.

        Simo.

        CCing DS folks. Wasn't there a recent DS fix that was supposed to
        improve the RUV situation?

        Looking at 389 DS Trac, I see some interesting RUV fixes in 1.3.4.x
        releases:

        https://fedorahosted.org/389/query?summary=~RUV&status=closed&order=milestone&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone


        I see that 389-ds-base-1.3.4.3 is already in Fedora 22+. Does the RUV
        issue happen there?

        It should not, and I think Thierry verified the fix.
        The problem we resolved, and which we think is the core of the
        corrupted RUV, was that the cleanallruv task only purged the RUV but
        did not purge the changelog. If cleanallruv was run and the server
        had a disorderly shutdown (a crash, or an abort when shutdown was
        hanging), then at restart the changelog RUV was rebuilt from the data
        in the changelog, and if it contained a CSN from cleaned RIDs this
        was added to the RUV (but the reference to the server was lost, and
        so the URL part is missing from this RUV).
        The fix now removes all references to the cleaned RID from the
        changelog, and the problem should not reoccur with RIDs cleaned with
        the fix. Of course, the changelog can still contain references to
        RIDs cleaned before the fix, and if no changelog trimming is
        configured this is what will happen. So, even after the fix, old RUVs
        could pop up and have to be (finally) cleaned.

        The other source is that these corrupted RUVs can be "imported" from
        another server by exchanging RUVs in the replication protocol.
        Cleanallruv tries to address this and propagates the cleanallruv
        task to all servers it thinks are connected. If there are
        replication agreements to servers which no longer exist, or to
        servers which cannot be reached, this will delay the RUV cleaning.


        Hello,

        I verified the fix in 1.3.4.2 (F22) / 389-ds-base-1.3.4.0-6.el7
        (RHEL7), so from those versions on CLEANALLRUV no longer creates
        corrupted RUV elements.
        According to the timestamp in the RUV (for example,
        csn2date.py 5587aa8d0003000e0000 --> 22/06/2015:06:26:21),
        these are old RUV elements. I think Ludwig is right: these
        corrupted RUV elements come from old cleanallruv runs before the
        fix was applied.
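For readers without csn2date.py at hand, the decoding can be sketched in a few lines. This is not the csn2date.py script itself; it assumes the usual 20-hex-digit CSN layout (8 digits of Unix timestamp, 4 of sequence number, 4 of replica ID, 4 of sub-sequence number) and interprets the timestamp as UTC, which reproduces the date quoted above. The names `Csn` and `parse_csn` are illustrative only.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Csn(NamedTuple):
    timestamp: datetime  # when the change was made (UTC assumed)
    seqnum: int          # sequence number within that second
    replica_id: int      # RID of the replica that originated the change
    subseqnum: int       # sub-sequence number


def parse_csn(csn: str) -> Csn:
    """Split a 20-hex-digit CSN into its four fields (layout is an assumption)."""
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}: {csn!r}")
    seconds = int(csn[0:8], 16)
    return Csn(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )


if __name__ == "__main__":
    csn = parse_csn("5587aa8d0003000e0000")
    print(csn.timestamp.strftime("%d/%m/%Y:%H:%M:%S"), "replica", csn.replica_id)
```

For the CSN in the example, the replica ID field decodes to 14, matching the `{replica 14}` label on that RUV element.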

        The problem is that even a fixed server can get those corrupted RUV
        elements from other servers.
        All servers in the topology should be updated with that fix, so that
        at least they stop creating corrupted RUV elements.
        Now, to get rid of the existing ones, I can only imagine the brute
        force option of recreating the replica and reinitializing... I hope
        another option is possible.

        thanks
        thierry

For a few minutes - almost an hour, actually - I thought there was hope. I
found the cleanallruv.pl script, and that not only seemed to work, but it
wiped the "unable to decode" entries from all the servers even when run on
just one. Sadly, within an hour, they all came back. :-(


unable to decode  {replica 12} 559f3de60004000c0000 559f3de60004000c0000
unable to decode  {replica 14} 5587aa8d0003000e0000 5587aa8d0003000e0000
unable to decode  {replica 16} 55bb7b08000500100000 55bb7b08000500100000
unable to decode  {replica 25} 55a49242000400190000 55a49242000400190000
unable to decode  {replica 29} 55d199a50001001d0000 55d199a50001001d0000
unable to decode  {replica 31} 55e4bc680005001f0000 55e4bc680005001f0000
unable to decode  {replica 3} 55b8a049000100030000 55b8a049000100030000
unable to decode  {replica 5} 55cc82ab041d00050000 55cc82ab041d00050000
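When the list is this long, it can help to pull the replica IDs out of the listing mechanically before cleaning them one at a time. The following is a rough sketch that assumes the line format shown above (with or without the colon after "decode"); the function name is hypothetical.

```python
import re
from typing import List

# Matches e.g. "unable to decode  {replica 12} 559f3de60004000c0000 559f3de60004000c0000"
RUV_LINE = re.compile(
    r"unable to decode:?\s+\{replica (\d+)\}\s*([0-9a-f]{20})\s+([0-9a-f]{20})"
)


def stale_replica_ids(listing: str) -> List[int]:
    """Return the sorted, de-duplicated replica IDs found in the listing."""
    return sorted({int(m.group(1)) for m in RUV_LINE.finditer(listing)})


if __name__ == "__main__":
    sample = """\
unable to decode  {replica 12} 559f3de60004000c0000 559f3de60004000c0000
unable to decode  {replica 3} 55b8a049000100030000 55b8a049000100030000
"""
    for rid in stale_replica_ids(sample):
        print(rid)
```

Each ID can then be passed to whichever cleaning method is in use, for example `ipa-replica-manage clean-ruv <id>` or `cleanallruv.pl ... -r <id>` as mentioned earlier in the thread.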

I cried... Followed by heavy drinking.

Does drinking help? It could be a great workaround.

Now, more seriously, I think you need a build that includes the mentioned
improvement for cleanallruv; we are currently checking if and where it is
available for 7.1. But this fix will only help in future cleanallruv runs, so
you will probably need to go through a few iterations of cleaning. Since the
core problem of the corrupted RUVs is that they can be recreated from the
changelog, I think configuring changelog trimming is something that should be
done.
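As a rough illustration of what configuring changelog trimming could look like, the sketch below builds an LDIF modify record. It assumes the `nsslapd-changelogmaxage` attribute on `cn=changelog5,cn=config`, as used by 389-ds 1.3.x; neither the attribute nor a suitable value is given in this thread, so check the documentation for your version. The "7d" value and the function name are purely illustrative.

```python
import re


def changelog_maxage_ldif(max_age: str) -> str:
    """Build an LDIF record setting the changelog max age (assumed attribute)."""
    # Expect an integer followed by a unit: s, m, h, d or w.
    if not re.fullmatch(r"\d+[smhdw]", max_age):
        raise ValueError(f"unexpected age format: {max_age!r}")
    return "\n".join([
        "dn: cn=changelog5,cn=config",   # assumption: 389-ds 1.3.x changelog entry
        "changetype: modify",
        "replace: nsslapd-changelogmaxage",
        f"nsslapd-changelogmaxage: {max_age}",
        "",
    ])


if __name__ == "__main__":
    print(changelog_maxage_ldif("7d"))  # "7d" is an example value, not a recommendation
```

The max age should be chosen with the topology in mind: changes trimmed from the changelog can no longer be replayed to a replica that has been offline for longer than that.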


~Janelle

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project
