Re: syncrepl broken, and I can't update anything on the server

Alister Forbes Tue, 23 Aug 2011 05:15:13 -0700

All,

In the interests of sharing my stupidity: I finally fixed this one.


The clue was in this line.

> Aug 19 04:23:11 rtp-ldap-server-dev-1 slapd[27609]: syncrepl_entry: rid=005 
> be_add cn={1}DUAConfigProfile,cn=schema,cn=config failed (53) 
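
For anyone scanning a long syslog for the same clue, here is a minimal sketch of pulling the rid, DN and result code out of a failed be_add line. The function name and regex are illustrative assumptions, not part of OpenLDAP; the result-code names are the standard LDAP ones (53, i.e. 0x35, is unwillingToPerform).

```python
import re

# Subset of standard LDAP result codes that appear in this thread.
LDAP_RESULT_NAMES = {
    0: "success",
    49: "invalidCredentials",
    53: "unwillingToPerform",  # 0x35, as in "null_callback : error code 0x35"
}

# Matches e.g. "syncrepl_entry: rid=005 be_add <dn> failed (53)".
BE_ADD_FAILED = re.compile(
    r"syncrepl_entry: rid=(?P<rid>\d+)\s+be_add\s+(?P<dn>\S+)"
    r"\s+failed\s+\((?P<rc>-?\d+)\)"
)


def parse_be_add_failure(line):
    """Return rid, DN and result code for a failed be_add line, else None."""
    m = BE_ADD_FAILED.search(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    return {
        "rid": m.group("rid"),
        "dn": m.group("dn"),
        "rc": rc,
        "name": LDAP_RESULT_NAMES.get(rc, "unknown"),
    }


# The log line quoted above, with the mail client's line wrap undone.
line = (
    "Aug 19 04:23:11 rtp-ldap-server-dev-1 slapd[27609]: syncrepl_entry: "
    "rid=005 be_add cn={1}DUAConfigProfile,cn=schema,cn=config failed (53)"
)
print(parse_be_add_failure(line))
```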

When I actually looked on the server, it had already been connected to another 
test machine, and there was an existing schema called cn={1}nis.

What I didn't realise, and what seems to be the behaviour, is that syncrepl 
won't renumber cn entries. It couldn't just move on and create 
cn={2}DUAConfigProfile,cn=schema,cn=config; it failed because of the existing 
cn={1} entry.
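
The clash described above can be checked for before starting replication. The following is only a sketch, assuming you have already collected the schema DNs from both servers (for example with ldapsearch); the helper names are hypothetical, and the cn={0}core entries in the sample data are an assumption added for illustration. Only the two {1} entries come from this thread.

```python
import re

# Captures the {N} ordering index and the schema name from an ordered RDN.
ORDERED_RDN = re.compile(r"^cn=\{(\d+)\}([^,]+)")


def index_by_position(dns):
    """Map each {N} ordering index to the schema name that holds it."""
    slots = {}
    for dn in dns:
        m = ORDERED_RDN.match(dn)
        if m:
            slots[int(m.group(1))] = m.group(2)
    return slots


def find_conflicts(provider_dns, consumer_dns):
    """List (index, provider_name, consumer_name) where the same slot differs."""
    provider = index_by_position(provider_dns)
    consumer = index_by_position(consumer_dns)
    return [
        (idx, provider[idx], consumer[idx])
        for idx in sorted(provider.keys() & consumer.keys())
        if provider[idx].lower() != consumer[idx].lower()
    ]


# Sample data: the {0}core entries are assumed, the {1} entries are from the thread.
provider_dns = [
    "cn={0}core,cn=schema,cn=config",
    "cn={1}DUAConfigProfile,cn=schema,cn=config",
]
consumer_dns = [
    "cn={0}core,cn=schema,cn=config",
    "cn={1}nis,cn=schema,cn=config",
]
print(find_conflicts(provider_dns, consumer_dns))
# prints [(1, 'DUAConfigProfile', 'nis')]
```

Any non-empty result means the consumer already holds a different schema in a slot the provider wants to fill, which is the situation that produced the (53) failure here.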

Solution: I stopped slapd on rtp-1, manually deleted the ldif files from 
etc/openldap/slapd.d/cn=config/cn=schema, and restarted slapd with 
'slapd -c rid=001 -c rid=005'.

Alister

On 19 Aug 2011, at 13:52, Alister Forbes wrote:

> Dmitriy,
> Thanks very much,  made the changes you suggested, and then restarted the 
> servers with -c for both machines, and finally things are up and sort of 
> running.
> 
> (which is a lot further on than I was)
> 
> Now, if I change something in cn=config, on one server, then it's replicated 
> to the other.
> 
> This only seems to be affecting new changes though.  On rtp-1 I have only 3 
> schemas declared, whereas on bru-1 there are 12.
> 
> In the rtp-1 logs, I see this repeated over and over:
> 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 fd=9 ACCEPT 
> from IP=144.254.10.212:58653 (IP=0.0.0.0:389) 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=0 BIND 
> dn="cn=admin,cn=config" method=128 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=0 BIND 
> dn="cn=admin,cn=config" mech=SIMPLE ssf=0 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=0 RESULT 
> tag=97 err=0 text= 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=1 SRCH 
> base="cn=config" scope=2 deref=0 filter="(objectClass=*)" 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=1 SRCH 
> attr=* + 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=1 SEARCH 
> RESULT tag=101 err=0 nentries=0 text= 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 op=2 UNBIND 
> Aug 19 04:23:10 rtp-ldap-server-dev-1 slapd[27609]: conn=1058 fd=9 closed 
> Aug 19 04:23:11 rtp-ldap-server-dev-1 slapd[27609]: null_callback : error 
> code 0x35 
> Aug 19 04:23:11 rtp-ldap-server-dev-1 slapd[27609]: syncrepl_entry: rid=005 
> be_add cn={1}DUAConfigProfile,cn=schema,cn=config failed (53) 
> Aug 19 04:23:11 rtp-ldap-server-dev-1 slapd[27609]: do_syncrepl: rid=005 rc 
> 53 retrying (4 retries left) 
> 
> 
> bru-1 is a Solaris 10 box, and I'm wondering if that might have something to 
> do with it.  If I run an ldapsearch from rtp-1 everything works fine…
> 
> ldapsearch -x -H ldap://bru-1.cisco.com  -b 'cn=config' -D 
> 'cn=admin,cn=config' -W -s base olcServerID
> 
> but if I try from the Solaris box I need to change the search parameters…
> 
> ldapsearch -b "cn=config" -h rtp-ldap-server-dev-1.cisco.com  -D 
> 'cn=admin,cn=config' -W -s base "olcServerID=*"
> 
> 
> Thinking about it logically, I don't see how that can be the problem, as when 
> I make a change on either machine, it is replicated across properly.  But at 
> the moment, I'm at my wits' end.  Could anyone point me in the right direction 
> please?  (I'm perfectly willing to accept, "you idiot look at this bit of the 
> manual")
> 
> Thanks very much
> Alister
> 
> 
> 
> On 16 Aug 2011, at 20:39, Dmitriy Kirhlarov wrote:
> 
>> 
>> 
>> 16.08.2011 16:18, Alister Forbes wrote:
>>> All,
>>> 
>>> I have two servers, bru-1 and rtp-1
>>> 
>>> At one point I had cn=config working properly, and somehow managed to mess 
>>> that up.
>>> 
>>> The situation I'm in now is that syncing between the two machines doesn't 
>>> work, and I can't make any changes to the configs.
>>> 
>>> There are no special configurations, no SASL, or Kerberos, just plain 
>>> passwords.  I've been through the Server guide, and hopefully I'm just 
>>> missing something, but I can't seem to find any indication of a way to 
>>> solve my problem.
>>> 
>>> bru-1 is running Solaris 10, and rtp-1 is running RHEL 5, both with 
>>> hand-compiled OpenLDAP 2.4.23
>>> 
>>> bru-1:
>>> 
>>> dn: cn=config
>>> olcServerID: 1 ldap://rtp-1.cisco.com
>>> olcServerID: 5 ldap://bru-1.cisco.com
>>> 
>>> # {0}config, config
>>> dn: olcDatabase={0}config,cn=config
>>> olcSyncrepl: {0}rid=005 provider=ldap://bru-1.cisco.com binddn
>>> ="cn=admin,cn=config" bindmethod=simple credentials="testpass" 
>>> searchbase="cn
>>> =config" type=refreshAndPersist retry="5 5 300 5" timeout=3
>> olcSyncrepl: {1}rid=001 provider=ldap://rtp-1.cisco.com 
>> binddn="cn=admin,cn=config" bindmethod=simple credentials="testpass" 
>> searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=3
>> olcMirrorMode: TRUE
>> 
>>> rtp-1
>>> 
>>> dn: cn=config
>>> olcServerID: 1 ldap://rtp-1.cisco.com
>>> olcServerID: 5 ldap://bru-1.cisco.com
>>> 
>>> # {0}config, config
>>> dn: olcDatabase={0}config,cn=config
>>> olcSyncrepl: {0}rid=005 provider=ldap://bru-1.cisco.com binddn
>>> ="cn=admin,cn=config" bindmethod=simple credentials="testpass" 
>>> searchbase="cn
>>> =config" type=refreshAndPersist retry="5 5 300 5" timeout=3
>> olcSyncrepl: {1}rid=001 provider=ldap://rtp-1.cisco.com 
>> binddn="cn=admin,cn=config" bindmethod=simple credentials="testpass" 
>> searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=3
>> olcMirrorMode: TRUE
>> 
>> don't forget olcOverlay={0}syncprov,olcDatabase={0}config,cn=config
>> 
>> and correct A and PTR DNS-records
>> 
>> WBR
>> 
>>> 
>>> I'm having to run them with olcMirrorMode set to False at the moment, 
>>> because if I try to startup bru-1 with mirror mode enabled, it crashes out.
>>> 
>>> Aug 16 13:35:32 bru-1.cisco.com slapd[14591]: [ID 600618 local4.debug] 
>>> olcServerID: value #2: SID=0x005 (listener=ldap:///)
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 309573 local4.debug] 
>>> olcSyncrepl: value #0: syncrepl will eventually stop retrying; the "retry" 
>>> parameter should end with a '+'.
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 942748 local4.debug] 
>>> Config: ** successfully added syncrepl rid=005 "ldap://bru-1.cisco.com";
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 801593 local4.debug] 
>>> olcMirrorMode: value #0:<olcMirrorMode>  database is not a shadow
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 183426 local4.debug] 
>>> config error processing olcDatabase={0}config,cn=config:<olcMirrorMode>  
>>> database is not a shadow
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 486161 local4.debug] 
>>> slapd stopped.
>>> Aug 16 13:35:33 bru-1.cisco.com slapd[14591]: [ID 432338 local4.debug] 
>>> connections_destroy: nothing to destroy.
>>> 
>>> Can anyone give me a pointer in the right direction please?  Even just how 
>>> to get my database back to being a shadow so I can work on the replication 
>>> problem later.  I realise it could all be linked, so this is the sort of 
>>> log error I'm seeing when I try running.  It looks almost like a password 
>>> problem, but that doesn't make any sense, as I can do searches on both 
>>> machines with ldapsearch, or even phpldapadmin.
>>> 
>>> Bru-1:
>>> 
>>> Aug 16 13:39:51 bru-1.cisco.com slapd[14662]: [ID 445809 local4.debug] 
>>> do_syncrepl: rid=008 rc -1 retrying (4 retries left)
>>> Aug 16 13:39:51 bru-1.cisco.com slapd[14662]: [ID 445809 local4.debug] 
>>> do_syncrepl: rid=007 rc -1 retrying (4 retries left)
>>> Aug 16 13:39:51 bru-1.cisco.com slapd[14662]: [ID 319573 local4.debug] 
>>> slap_client_connect: URI=ldap://rtp-1.cisco.com DN="cn=root,dc=ca" 
>>> ldap_sasl_bind_s failed (49)
>>> Aug 16 13:39:51 bru-1.cisco.com slapd[14662]: [ID 445809 local4.debug] 
>>> do_syncrepl: rid=006 rc 49 retrying (4 retries left)
>>> 
>>> rtp-1:
>>> 
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: syncrepl_entry: rid=005 
>>> LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: syncrepl_entry: rid=005 be_search (0)
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: syncrepl_entry: rid=005 
>>> cn={1}DUAConfigProfile,cn=schema,cn=config
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: null_callback : error code 0x35
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: syncrepl_entry: rid=005 be_add 
>>> cn={1}DUAConfigProfile,cn=schema,cn=config (53)
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: syncrepl_entry: rid=005 be_add 
>>> cn={1}DUAConfigProfile,cn=schema,cn=config failed (53)
>>> Aug 16 05:16:11 rtp-1 slapd[19928]: do_syncrepl: rid=005 rc 53 retrying (4 
>>> retries left)
>>> 
>>> Any clues or hints, would be greatly appreciated
>>> --
>>> Alister Forbes      TACSUNS             _.|._.|._ Cisco Systems
>>> 
>>> Please avoid sending me Word or PowerPoint attachments. See -
>>> http://www.gnu.org/philosophy/no-word-attachments.html
>>> 
>>> 
>> 
> 
> --
> Alister Forbes  TACSUNS             _.|._.|._ Cisco Systems
> 
> Please avoid sending me Word or PowerPoint attachments. See -
> http://www.gnu.org/philosophy/no-word-attachments.html
> 
> 

--
Alister Forbes      Work:   +32 2 704 5762    Internal: 322 5762
[email protected]    TACSUNS             _.|._.|._ Cisco Systems

Please avoid sending me Word or PowerPoint attachments. See -
http://www.gnu.org/philosophy/no-word-attachments.html
