Re: [389-users] multimaster replication and index corruption

Adrian Damian Tue, 10 Nov 2015 09:14:53 -0800

Rich,

Thanks for your help. Let me jump in with more details.

We've seen index corruption on a number of occasions. It seems to affect searchable attributes for which there are indexes. Queries on an attribute in LDAP that used to work suddenly stopped working: they would return incomplete results or no results at all, although the data on the server was the same. The fix in those situations was to drop the index corresponding to the attribute and re-create it.
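
For reference, a minimal sketch of one way to request an index rebuild, assuming the server accepts reindex task entries under cn=index,cn=tasks,cn=config (as 389 DS documents for online reindexing). The attribute name "uid", the backend name "userRoot", and the helper name reindex_task_ldif are illustrative assumptions, not details from our setup; check the entry format against the documentation for your version before using it.

```python
def reindex_task_ldif(attribute, backend="userRoot"):
    """Build LDIF for a reindex task entry.

    Assumptions (not from the thread): the backend is called
    "userRoot" and the server reads tasks from
    cn=index,cn=tasks,cn=config.
    """
    task_name = "reindex-" + attribute
    lines = [
        "dn: cn=%s,cn=index,cn=tasks,cn=config" % task_name,
        "objectClass: top",
        "objectClass: extensibleObject",
        "cn: %s" % task_name,
        "nsInstance: %s" % backend,          # backend whose index is rebuilt
        "nsIndexAttribute: %s" % attribute,  # attribute to reindex
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the LDIF; it would typically be fed to "ldapmodify -a"
    # while bound as the directory administrator.
    print(reindex_task_ldif("uid"))
```

The function only produces text, so it can be reviewed before anything is sent to a server.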

We've run the db fix script that the LDAP distribution comes with, and there are no reports of corruption when this problem occurs. That makes it very hard to detect. We don't know what else to look for when we run into this again and, more importantly, we don't know what triggers it and how to prevent it.
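
One way to look for this kind of silent problem is to compare what an indexed search returns with what a client-side scan of a full export says it should return. The sketch below assumes the two inputs have already been collected by some other means (for example a search using the indexed filter, and a full dump of the suffix); the function name, the data shapes, and the case-insensitive matching are all assumptions made for illustration.

```python
def missing_from_index(all_entries, indexed_dns, attribute, value):
    """Return DNs that hold attribute=value but were not returned
    by the indexed search.

    all_entries: dict mapping DN -> dict of attribute -> list of values
                 (e.g. parsed from a full export)
    indexed_dns: iterable of DNs returned by the indexed search
    Matching is case-insensitive, which is an assumption; adjust it
    to the attribute's real matching rule.
    """
    wanted = value.lower()
    expected = set()
    for dn, attrs in all_entries.items():
        values = [v.lower() for v in attrs.get(attribute, [])]
        if wanted in values:
            expected.add(dn.lower())
    returned = {dn.lower() for dn in indexed_dns}
    return sorted(expected - returned)


if __name__ == "__main__":
    entries = {
        "uid=alice,dc=example,dc=com": {"uid": ["alice"], "ou": ["dev"]},
        "uid=bob,dc=example,dc=com": {"uid": ["bob"], "ou": ["dev"]},
    }
    # Pretend the indexed search for (ou=dev) only returned alice.
    print(missing_from_index(entries, ["uid=alice,dc=example,dc=com"], "ou", "dev"))
```

A non-empty result means the index and the stored data disagree for that filter, even when the database check reports nothing.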

Mind you, we are currently doing active development, changing both the software clients that access the LDAP servers and the configurations of the servers. It is possible that writes had gone to both masters in the multimaster replication configuration when the problem occurred, but because there were multiple clients concurrently accessing the servers it is hard to figure out what triggered the issue.



Adrian



On 11/09/2015 05:06 PM, Rich Megginson wrote:

On 11/09/2015 05:47 PM, Ghiurea, Isabella wrote:

Hi Rich,
Thank you for your feedback; as always, it is greatly appreciated when it comes from
389-DS RH support.
We are not using VMs, just plain hardware. Here is the description I got from the developers' team
of the issues they are seeing when running integration tests with multimaster replication:
"index corruption: put content, run tests: OK, do more stuff (reads, writes, etc.), run tests:
FAIL, notice "missing attributes", rebuild index(es), run tests: OK."

What does this mean?  What program is printing these index corruption
messages?  Is it some tool provided by Red Hat?

Unfortunately, I understand this issue cannot be reproduced on a regular
basis, so no more details can be provided at this time.

All reads and writes are going only to the master replication DS, not the slave.
    I totally agree with you that this is the way to configure and maintain Directory
Server in an operation-critical env: multimaster replication with only one master for
writes.
   Here is the DS version:
   rpm -qa | grep 389-ds
389-ds-console-doc-1.2.6-1.el6.noarch
389-ds-base-libs-1.2.11.15-34.el6_5.x86_64
389-ds-1.2.2-1.el6.noarch
389-ds-base-1.2.11.15-34.el6_5.x86_64

This is quite an old version of 389-ds-base.  I suggest upgrading to
RHEL 6.7 with latest patches.

389-ds-console-1.2.6-1.el6.noarch


Thank you
Isabella

FWD:


We have configured multimaster replication / fractional replication (memberOf plugin
excluded). From time to time we see index corruption on some indexes, and
there is a strong feeling among the developers that this is related to DS multimaster
replication internal settings.
What version of 389?  rpm -q 389-ds-base
I'm assuming you are not using IPA.
What does "index corruption" mean?  What exactly do you see?

Are you running in virtual machines? If so, what kind? vmware? kvm? Are you 
using virtual disks or dedicated physical devices/paravirt?

We are writing to only one DS, the same server at all times, but reading from all DSs
configured for multimaster.
Are you seeing "index corruption" on the write master or on all servers?


Have other people seen this kind of issue with a multimaster replication cfg? Should we
start avoiding this replication cfg altogether?

This is the recommended way to deploy. If this is not working for you, either 
you have a configuration problem, or there is some sort of vm or hardware 
problem, or there is a serious bug that requires fixing ASAP.

We chose multimaster for the fast and reliable option to switch between
master DSs; moving one step down to master/slave may require some downtime
when switching DSs back.
Isabella





Hi Rich,
Thank you for your feedback; as always, it is greatly appreciated when it comes from
389-DS RH support.
We are not using VMs, just plain hardware. Here is the description I got from the developers'
team of the issues they are seeing when running tests with multimaster
replication: index corruption: put content, run tests: OK, do more stuff (reads, writes,
etc.), run tests: FAIL, notice "missing attributes", rebuild index(es), run
tests: OK.

I believe the reads and writes right now go only to the master replication DS,
not the slave.
I totally agree with you that this is the way to configure and maintain DS in an operational
env: multimaster replication with one master for writes.
More comments and input are appreciated.
   rpm -qa | grep 389-ds
389-ds-console-doc-1.2.6-1.el6.noarch
389-ds-base-libs-1.2.11.15-34.el6_5.x86_64
389-ds-1.2.2-1.el6.noarch
389-ds-base-1.2.11.15-34.el6_5.x86_64
389-ds-console-1.2.6-1.el6.noarch
389-dsgw-1.1.11-1.el6.x86_64

________________________________________
From: ghiureai [[email protected]]
Sent: Monday, November 09, 2015 1:05 PM
To: [email protected]
Subject: multimaster replication and index corruption

Hi List,
We have configured multimaster replication / fractional replication (memberOf
plugin excluded). From time to time we see index corruption
on some indexes, and there is a strong feeling among the developers that this is
related to DS multimaster replication internal settings.
We are writing to only one DS, the same server at all times, but reading
from all DSs configured for multimaster.
Have other people seen this kind of issue with a multimaster replication cfg?
Should we start avoiding this replication cfg altogether?
We chose multimaster for the fast and reliable option to switch
between master DSs; moving one step down to master/slave may require
some downtime when switching DSs back.
Isabella

--
389 users mailing list
[email protected]
https://admin.fedoraproject.org/mailman/listinfo/389-users

