-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Rich Megginson
Sent: Tuesday, November 17, 2009 9:23 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] Replication and High Availability

Bucl, Casper wrote:
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Rich 
> Megginson
> Sent: Tuesday, November 17, 2009 12:35 PM
> To: General discussion list for the 389 Directory server project.
> Subject: Re: [389-users] Replication and High Availability
>
> Bucl, Casper wrote:
>   
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Rich 
>> Megginson
>> Sent: Tuesday, November 17, 2009 8:23 AM
>> To: General discussion list for the 389 Directory server project.
>> Subject: Re: [389-users] Replication and High Availability
>>
>> Bucl, Casper wrote:
>>   
>>     
>>> Hi,
>>>
>>> I'm trying to create a high-availability LDAP setup for a system I have 
>>> in place that is currently using multimaster replication. Using a 
>>> shared storage system isn't an option in this case.
>>>
>>> To give you an idea of what our setup looks like,
>>>
>>> There are two nodes that have replication set up. These are configured 
>>> as multimasters, and processes write to both of them. Changes then 
>>> replicate to the other LDAP server.
>>>
>>> Now I need them to be in a high availability configuration.
>>>
>>> I have created duplicates of each node and gotten the high 
>>> availability portion on each of them to work correctly.
>>>
>>> The problem comes with Fedora DS and replication.
>>>
>>> I have tried multiple ways of setting up Fedora DS and replication, and 
>>> they always seem to end up with changes not being replicated to the 
>>> other master after we have failed over to the secondary node. The two 
>>> most successful configurations are below.
>>>
>>> Full mesh: all links were set up as two-way replication agreements.
>>>
>>> This always ends up with at least two nodes logging errors such as 
>>> "Can't locate CSN" or "Duplicate node ID".
>>>
>>> Node1A ------- Node1B
>>>    | \       / |
>>>    |    X     |
>>>    | /       \ |
>>> Node2A ------- Node2B
>>>
>>> Single replication agreement between VIPs
>>>
>>> In this configuration, we initially copied over the slapd instance 
>>> directory when setting up the second HA node (Node1A to Node1B), so that 
>>> the settings and configuration are identical on both. Then, as changes 
>>> were made to the LDAP, we created backups using db2bak. These backups 
>>> are copied over to the failover box and then imported on startup of 
>>> Fedora DS. This doesn't appear to back up the changelog, and it ends up 
>>> with the "Can't locate CSN" error again.
>>>
>>> Node1 VIP
>>>     |
>>>     |
>>> Node2 VIP
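The backup-and-copy cycle described above looks roughly like this as shell commands. The instance name, directory paths, and hostnames below are assumptions for illustration only (they follow the usual Fedora DS 1.0.x layout); adjust them to your installation:

```shell
# Sketch of the db2bak backup/restore cycle described above.
# INSTANCE, BACKUP_DIR, and STANDBY are assumed names, not from the thread.
INSTANCE=slapd-node1
BACKUP_DIR=/var/tmp/ds-backup
STANDBY=node1b.example.com

# On the active node: take a database backup with the instance's db2bak.
/opt/fedora-ds/$INSTANCE/db2bak $BACKUP_DIR

# Copy the backup over to the standby node.
scp -r $BACKUP_DIR $STANDBY:$BACKUP_DIR

# On the standby node (run there), restore before starting the server:
#   /opt/fedora-ds/$INSTANCE/bak2db $BACKUP_DIR
# Note: as observed above, this cycle does not carry the replication
# changelog, which is why "Can't locate CSN" errors follow a restore.
```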
>>>
>>> I have tried other things as well and they were a lot less fruitful 
>>> than the two examples I have here.
>>>
>>> Has anyone set up a high availability scenario similar to this? Can 
>>> anyone suggest a different process or configuration that would 
>>> accomplish what I'm after?
>>>
>>>     
>>>       
>> Yes. Configurations like this have been working at high-volume 
>> installations for several years.
>>
>> Let's start with: what platform are you running on your systems? What 
>> version of DS? What procedure did you use to set up and initialize your 
>> replicas?
>>   
>>     
>>> Thanks,
>>>
>>> Casper
>>>
>>>
>>> --
>>> 389 users mailing list
>>> [email protected]
>>> https://www.redhat.com/mailman/listinfo/fedora-directory-users
>>>   
>>>     
>>>       
>> The environment is set up using Fedora Directory Server 1.0.4.
>>   
>>     
> What platform?  I would suggest using the latest (1.2.2, or 1.2.4, which is 
> in testing).  We have fixed many, many bugs in replication since 1.0.4, and 
> the CSN issue you are reporting sounds like a bug that has been fixed in 
> 1.2.x.
>   
>> To set up the multimaster replication I used the mmr.pl script. When 
>> reinitializing the consumers, I use ldapmodify to set 
>> nsDS5BeginReplicaRefresh to start.
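For reference, that refresh is typically triggered with an ldapmodify along these lines. The hostname and agreement DN here are assumptions; look up the real agreement entry under cn=config on your supplier:

```shell
# Trigger a total update (reinitialization) of a consumer by setting
# nsDS5BeginReplicaRefresh on an existing replication agreement entry.
# The host and agreement DN below are examples, not from this thread.
ldapmodify -x -h node1a.example.com -p 389 -D "cn=Directory Manager" -W <<EOF
dn: cn=agreement-to-node1b,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: modify
replace: nsDS5BeginReplicaRefresh
nsDS5BeginReplicaRefresh: start
EOF
```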
>>
>> Another question about the fully meshed configuration: can there be more 
>> nodes? We will want to add another HA node to the environment, which would 
>> bring the total to six directory servers. The idea is that there is a 
>> central hub acting as a kind of global master: everyone replicates their 
>> info up to it, and it then gets redistributed back out to the others.
>>   
>>     
> Yes, you can have more than 4 masters.
>   
>> Would the easier method be to copy the changelog information over to the 
>> standby node?
>>     
> No.
>   
>> Is there a method to do this?
>>   
>>     
> Not really.
>   
>> Thanks,
>> Casper
>>
>>
>>   
>>     
> Hi Rich,
> What is the proper way to reinitialize a replication agreement in a 
> multimaster configuration? Whenever I try the nsDS5BeginReplicaRefresh 
> method, it ends up creating a new changelog on the node being refreshed, 
> and then it begins having replication issues, notably the "Can't locate 
> CSN" error. Could this be related to some of the bugs you were speaking 
> of?
>   
Yes.  See https://bugzilla.redhat.com/show_bug.cgi?id=388021 - fixed in 1.1.0

What platform are you running on?
> Casper
>
>   

We are using a Linux distribution based on Red Hat Linux, kernel version 
2.6.26.8. Thanks for all your help. I will have to work on an upgrade plan to 
get our LDAP servers updated.
Casper

