On 05/21/2015 09:59 AM, Janelle wrote:
On 5/21/15 6:46 AM, Ludwig Krispenz wrote:
On 05/21/2015 03:28 PM, Janelle wrote:
I think I found the problem.
There was a lone replica running in another DC. It was installed as
a replica some time ago with all the others. Think of this -- the
original config had 5 servers, one of them was this server. Then the
other 4 servers were RE-BUILT from scratch, so all the replication
agreements were changed AND - this is the important part - the 5th
server was never added back in. BUT - the 5th server was left
running and was never told that it was no longer a member. It still
thought it had a replication agreement with original "server 1", but
server 1 knew otherwise.
Now, although the first 4 servers were rebuilt, the same domain,
realm, AND passwords were used.
I am guessing that somehow, this 5th server keeps trying to
interject its info into the ring of 4 servers, kind of forcing its
way in. Somehow, the fact that the original credentials still work
(but the certs are all different) is leaving the first 4 servers with
a "can't decode" issue.
There should be some security checks so this can't happen. It should
also be easy to reproduce.
Now I have to go re-initialize all the servers from a good server,
so everyone is happy again. The "problem" server has been shut down
completely. (and yes, there were actually 3 of them in my scenario -
I just used 1 to simplify my example - but that explains the 3 CSNs
that just kept "appearing")
What concerns me most about this - were the servers outside of the
"good ring" somehow able to inject data into replication which might
have been causing bad data??? This is bad if it is true.
It depends a bit on what you mean by rebuilt from scratch.
A replication session needs to meet three conditions to be able to
send data:
- the supplier side needs to be able to authenticate, and the
authenticated user has to be in the list of bind DNs of the replica
- the data generation of the supplier and consumer sides needs to be
the same (they all have to have the same common origin)
- the supplier needs to have the changes (CSNs) to be able to
position in its changelog to send updates
Now, if you have 5 servers, forget about one of them, do not change
the credentials on the others, and do not reinitialize the database by
an LDIF import to generate a new database generation, the fifth
server will still be able to connect and eventually send updates.
How should the other servers know that this one is no longer a "good"
one?
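To make the three conditions concrete, here is a minimal sketch in Python. It is only a toy model, not how the server implements the checks: the class names, the field names, the generation labels ("gen-A", "gen-B"), the CSN labels and the bind DN are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Supplier:
    bind_dn: str                 # DN the supplier binds with
    password_ok: bool            # did the bind credentials check out?
    generation: str              # data generation of its database
    changelog: list = field(default_factory=list)  # CSNs it can still replay


@dataclass
class Consumer:
    allowed_bind_dns: set        # bind DNs configured on the replica
    generation: str              # data generation of its database
    last_seen_csn: str = ""      # where the supplier must position itself


def session_blockers(supplier, consumer):
    """Return the reasons a session cannot send data (empty list = it can)."""
    blockers = []
    if not supplier.password_ok or supplier.bind_dn not in consumer.allowed_bind_dns:
        blockers.append("bind rejected")
    if supplier.generation != consumer.generation:
        blockers.append("data generation mismatch")
    if consumer.last_seen_csn not in supplier.changelog:
        blockers.append("cannot position in changelog")
    return blockers


# Hypothetical forgotten fifth server: its old credentials are still valid.
stale = Supplier("cn=replication manager,cn=config", True, "gen-A", ["csn-1", "csn-2"])

# Other servers kept credentials and generation: nothing stops the stale server.
not_reinitialized = Consumer({"cn=replication manager,cn=config"}, "gen-A", "csn-1")
# Other servers were reinitialized from an LDIF import: the new generation blocks it.
reinitialized = Consumer({"cn=replication manager,cn=config"}, "gen-B", "csn-1")

print(session_blockers(stale, not_reinitialized))
print(session_blockers(stale, reinitialized))
```

In this model the stale server is only rejected once one of the three inputs changes, which is why keeping the same credentials and the same database generation leaves the door open.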
~Janelle
The only problem left now is that, no matter what, this last entry
will NOT go away, and now I have 2 "stuck" cleanruvs that will not
"abort" either.
unable to decode {replica 24} 554d53d3000000180000 554d54a4000200180000
CLEANALLRUV tasks
RID 24 None
No abort CLEANALLRUV tasks running
=====================================
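The two hex strings after {replica 24} look like CSNs. As a rough sketch, assuming the usual 389 Directory Server CSN layout of 8 hex digits of timestamp, then 4 each for sequence number, replica ID and sub-sequence number, they can be split like this (the function name parse_csn is made up here):

```python
from datetime import datetime, timezone


def parse_csn(csn):
    """Split a 20-hex-digit CSN into its fields (layout is an assumption)."""
    if len(csn) != 20:
        raise ValueError("expected 20 hex digits, got %d" % len(csn))
    timestamp = int(csn[0:8], 16)
    return {
        "timestamp": timestamp,
        "time_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "seqnum": int(csn[8:12], 16),
        "replica_id": int(csn[12:16], 16),
        "subseqnum": int(csn[16:20], 16),
    }


for csn in ("554d53d3000000180000", "554d54a4000200180000"):
    fields = parse_csn(csn)
    print(csn, fields["replica_id"], fields["seqnum"], fields["time_utc"].isoformat())
```

Under that assumption, the replica ID field of both values is 0x0018, i.e. 24, which matches the RID the cleanallruv task is stuck on.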
ldapmodify -D "cn=directory manager" -W -a
dn: cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
cn: abort 24
replica-id: 24
replica-certify-all: no
adding new entry " cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config"
ldap_add: No such object (32)
There should not be a white space at the beginning: " cn=abort 24,
cn=abort cleanallruv, cn=tasks, cn=config"
When I run the abort task I don't have that extra white space, and the
task is successfully added:
[root@localhost ~]# ldapmodify -D cn=dm -w password -a
dn: cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
cn: abort 24
replica-id: 24
replica-certify-all: no
adding new entry "cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config"
The extra white space is the probable cause of the error 32 (no such
object) you were seeing. You can verify this by looking at the access
log (/var/log/dirsrv/slapd-INSTANCE/access)
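If you want to catch this kind of thing before digging through logs, a small check on the DN that ldapmodify echoes back can help. This is just a sketch in Python: the helper names are made up, and it only assumes the echoed line has the form adding new entry "..." as in the output above.

```python
import re

# Matches the DN that ldapmodify echoes between double quotes.
ECHO_RE = re.compile(r'adding new entry\s+"(?P<dn>[^"]*)"')


def echoed_dn(output):
    """Return the DN from an 'adding new entry' line, or None if absent."""
    match = ECHO_RE.search(output)
    return match.group("dn") if match else None


def has_stray_whitespace(dn):
    """True if the DN starts or ends with whitespace."""
    return dn != dn.strip()


failing = 'adding new entry " cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config"'
working = 'adding new entry "cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config"'

for output in (failing, working):
    dn = echoed_dn(output)
    print(repr(dn), "stray whitespace:", has_stray_whitespace(dn))
```

It only flags the symptom; the access log is still the place to confirm which DN the server actually received.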
As I said before, you could also check the errors log to see why the
cleanAllRUV task is not completing.
Regards,
Mark
--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project