Re: [Freeipa-users] replication again :-(

Janelle Tue, 19 May 2015 06:28:48 -0700


On 5/19/15 12:17 AM, Ludwig Krispenz wrote:

On 05/19/2015 08:58 AM, thierry bordaz wrote:
On 05/19/2015 07:47 AM, Martin Kosek wrote:
On 05/19/2015 03:23 AM, Janelle wrote:
Once again, replication/sync has been lost. I really wish the product was more
stable, it has so much potential and yet.
Servers running for 6 days, no issues. No new accounts or changes (maybe a few
users changing passwords) and again, 5 out of 16 servers are no longer in sync.
I can test it easily by adding an account and then waiting a few minutes, then
run "ipa user-show --all username" on all the servers, and only a few of them
have the account.  I have now waited 15 minutes, still no luck.
Oh well.. I guess I will go look at alternatives. I had such high hopes for
this tool. Thanks so much everyone for all your help in trying to get things
stable, but for whatever reason, there is a random loss of sync among the
servers and obviously this is not acceptable.
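
The check described above (add an account, wait, then run "ipa user-show" on every server) could be scripted roughly as follows. This is only a sketch: the host names, the use of ssh, and the helper names are assumptions for illustration, and it presumes a valid Kerberos ticket on each remote side.

```python
import subprocess
from typing import Callable, Iterable, List


def has_user_via_ssh(server: str, username: str) -> bool:
    """Hypothetical check: run `ipa user-show` on `server` over ssh.

    Assumes ssh access to the server and a valid Kerberos ticket there.
    A non-zero exit status is treated as "account not present".
    """
    result = subprocess.run(
        ["ssh", server, "ipa", "user-show", "--all", username],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def servers_missing_user(
    servers: Iterable[str],
    username: str,
    check: Callable[[str, str], bool] = has_user_via_ssh,
) -> List[str]:
    """Return the servers on which the test account is not visible."""
    return [server for server in servers if not check(server, username)]
```

Calling servers_missing_user() with the list of replicas and the test account name returns the replicas that have not received the update, which also narrows down where in the topology replication stops.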
Hello Janelle,
I am very sorry to hear about your troubles. Would you still be OK with
helping us (mostly Ludwig and Thierry) investigate what the root cause of the
instability of the replication agreements is? This is obviously something that
should not be happening at this rate as in your deployment, so I would really
like to be able to identify and fix this issue in the 389 DS.
Hello Janelle,

I can only join my voice to Martin to say how I am sorry to read this.
Would you turn on replication logging level (8192) on the master/consumer and
provide us the logs (access/error) and config (dse.ldif)? The master is the
instance where you can see the update and that is linked (replica agreement)
to a replica (aka consumer) where the update is not received.
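
For reference, the replication logging level is set through the nsslapd-errorlog-level attribute on cn=config in 389 DS. A minimal sketch that builds the LDIF for that change (the function name is made up for illustration, and how the LDIF is applied depends on the deployment):

```python
def errorlog_level_ldif(level: int = 8192) -> str:
    """Build an LDIF modify that sets the 389 DS error log level.

    8192 is the replication debugging level mentioned above.
    """
    return "\n".join(
        [
            "dn: cn=config",
            "changetype: modify",
            "replace: nsslapd-errorlog-level",
            f"nsslapd-errorlog-level: {level}",
            "",
        ]
    )


if __name__ == "__main__":
    # Print the LDIF so it can be piped into ldapmodify.
    print(errorlog_level_ldif())
```

The output would typically be fed to ldapmodify bound as the Directory Manager on both the master and the consumer. Setting the level back to its previous value afterwards is advisable, since replication logging is verbose.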
what puzzles me most is that replication is working for quite some time and
then breaks, so we need to find out about the dynamics which lead to that
state. You reported errors about invalid credentials or about a bind dn entry
not found; these credentials don't get changed by ds and entries are not
deleted by ds, so what triggers these changes?
also, for the suggestion by Thierry to debug, we need to determine where
replication breaks: if you add an account and it is propagated to some servers
and not to others, where does it stop? This depends on your replication
topology. You said in another post that you have a ring topology; does it mean
all 16 servers are connected in a ring only, and if two links break the
topology is disconnected?
thanks
thierry
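
The point about a ring becoming disconnected once two links break can be checked with a small connectivity sketch. It models a single ring of 16 servers only, which is an assumption; any extra agreements in the real deployment are not represented.

```python
from collections import deque
from typing import FrozenSet, Set


def ring_edges(n: int) -> Set[FrozenSet[int]]:
    """Replication links of a pure ring: server i <-> server (i + 1) % n."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def is_connected(n: int, edges: Set[FrozenSet[int]]) -> bool:
    """Breadth-first search from server 0 over the remaining links."""
    adjacency = {i: set() for i in range(n)}
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == n


links = ring_edges(16)
one_broken = links - {frozenset((0, 1))}
two_broken = links - {frozenset((0, 1)), frozenset((8, 9))}
print(is_connected(16, one_broken))  # a ring survives a single broken link
print(is_connected(16, two_broken))  # two broken links split the ring
```

With one link down, updates can still travel the long way round; with two down, the servers between the breaks no longer receive changes made on the other side.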

Let me see about getting some debug logs going to provide more info. As for
topology -- yes, ring, but also within the DC - the 3 servers are connected in
an internal ring. There have been no outages on the WAN connections, as I have
logs showing network data at all times, so this is not an issue. If I did lose
a WAN, dozens of other inter-DC apps would blow up too, and they have not.

However, I guess you are right, I have not provided enough logging data to
help diagnose this. Let me see what I can do. Not sure if this helps -- I do
try and do all updates from a single master, never from different ones. Users
are also forced to the same master to change passwords and update things. So
the "source" of changes is always the same.


Time to go do some log enabling...

~J

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project
