Below is a sample configuration that reproduces the problem:

Three OpenLDAP data instances are configured as follows: 
include         /opt/openldap/etc/openldap/schema/core.schema
include         /opt/openldap/etc/openldap/schema/cosine.schema
include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
include         /opt/openldap/etc/openldap/schema/nis.schema
include         /opt/openldap/etc/openldap/schema/dyngroup.schema
include         /opt/openldap/etc/openldap/schema/misc.schema
pidfile         /opt/openldap/var/run/server1.pid
argsfile        /opt/openldap/var/run/server1.args
loglevel        stats
database        bdb
suffix          ou=orgunit,o=gouv,c=fr
directory       /opt/openldap/var/server1

Note: for the other instances, "server1" is replaced with "server2" and "server3".

Each instance contains the following data (only 4 entries):
dn: ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: organizationalUnit
ou: orgunit

dn: ou=dept1,ou=orgunit,o=gouv,c=fr
ou: dept1
objectClass: top
objectClass: organizationalUnit

dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: [email protected]
cn: User 11
uid: user11
givenName: User
sn: 11

dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: [email protected]
cn: User 12
uid: user12
givenName: User
sn: 12

Note: in instances 2 and 3, user1x and dept1 are replaced with user2x/dept2 and 
user3x/dept3 respectively.

The data instances are launched using these commands:
/opt/openldap/libexec/slapd -n server1 -f 
/opt/openldap/etc/openldap/server1.conf -h ldap://0.0.0.0:1001/
/opt/openldap/libexec/slapd -n server2 -f 
/opt/openldap/etc/openldap/server2.conf -h ldap://0.0.0.0:1002/
/opt/openldap/libexec/slapd -n server3 -f 
/opt/openldap/etc/openldap/server3.conf -h ldap://0.0.0.0:1003/

Meta instance is configured as follows: 
include         /opt/openldap/etc/openldap/schema/core.schema
include         /opt/openldap/etc/openldap/schema/cosine.schema
include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
include         /opt/openldap/etc/openldap/schema/nis.schema
include         /opt/openldap/etc/openldap/schema/dyngroup.schema
include         /opt/openldap/etc/openldap/schema/anais.schema
include         /opt/openldap/etc/openldap/schema/misc.schema
pidfile         /opt/openldap/var/run/meta.pid
argsfile        /opt/openldap/var/run/meta.args
database        meta
suffix          ou=orgunit,o=gouv,c=fr
uri ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 3
uri ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4
uri ldap://localhost:1003/ou=dept3,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4

and it is launched as follows:
/opt/openldap/libexec/slapd -n meta -f /opt/openldap/etc/openldap/meta.conf -h 
ldap://0.0.0.0:1000/ -d 256
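As a side note on the meta configuration above: the commented-out per-target timeouts can be re-enabled when experimenting with failure handling, and slapd-meta(5) also documents an "nretries" directive controlling how often a failed connection attempt is retried. A per-target fragment could look like this (the values are arbitrary examples, not a recommendation):

```
uri             ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
network-timeout 5
timeout         4
nretries        1
```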

# test with the 3 servers up
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
=> entries from the three servers are returned

# stop server2 (kill -INT ...) and perform a new search:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good: entries from server1 and server3 are returned

Below are the meta instance logs:
conn=1001 fd=9 ACCEPT from IP=172.30.8.13:55048 (IP=0.0.0.0:1000)
conn=1001 op=0 BIND dn="" method=128
conn=1001 op=0 RESULT tag=97 err=0 text=
conn=1001 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 
filter="(objectClass=person)"
conn=1001 op=1 SRCH attr=dn
conn=1001 op=1 meta_back_retry[1]: retrying URI="ldap://localhost:1002"; DN="".
conn=1001 op=1 meta_back_retry[1]: meta_back_single_dobind=52
conn=1001 op=1 SEARCH RESULT tag=101 err=0 nentries=4 text=
conn=1001 op=2 UNBIND
conn=1001 fd=9 closed
=> looks good as nentries=4

# perform numerous new searches without changing anything:
[root@pp-ae2-proxy2 log]# /opt/openldap/bin/ldapsearch -LLL -x -H 
ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr  objectclass=person 
dn  |grep dn:
=> nothing returned

Below are the corresponding logs:
conn=1002 fd=9 ACCEPT from IP=172.30.8.13:55049 (IP=0.0.0.0:1000)
conn=1002 op=0 BIND dn="" method=128
conn=1002 op=0 RESULT tag=97 err=0 text=
conn=1002 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 
filter="(objectClass=person)"
conn=1002 op=1 SRCH attr=dn
conn=1002 op=1 meta_search_dobind_init[1]: retrying URI="ldap://localhost:1002"; 
DN="".
conn=1002 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
conn=1002 op=2 UNBIND
conn=1002 fd=9 closed
=> looks bad, as nentries=0
=> only the first search after server2 is stopped succeeds.
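To make the "only the first search succeeds" pattern easier to demonstrate, the repeated searches can be scripted and reduced to an entry count per attempt; `count_dns` below is a hypothetical helper, not part of OpenLDAP:

```shell
# count_dns: count the "dn:" lines in the LDIF on stdin, i.e. the number
# of entries a search returned.
count_dns() {
  grep -c '^dn:'
}

# Run the same root-node search several times and print one count per
# attempt (uses the proxy and base DN from the setup above):
# for n in 1 2 3 4 5; do
#   /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 \
#     -b ou=orgunit,o=gouv,c=fr objectclass=person dn | count_dns
# done
```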

# new search but using server1 ou:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=dept1,ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> looks good

# same search as earlier i.e. using root node:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> Looks good also. It seems everything is OK, but only once a channel has been 
opened to server1 by some other means.

# new search but using server3 base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=dept3,ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good

# new search but using slapd-meta base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> entries from server1 and server3 are returned
=> this confirms that lookups in server1 and server3 are not performed until a 
channel is opened to both of them using their respective base objects
=> another strange behavior: if the search using server3's ou is performed 
before the search using server1's ou, then the next search attempt using the 
root node retrieves entries from both server1 and server3 ...

# new search after server2 restart: 
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> good, all entries are returned

# new search after a meta instance restart while server2 is already stopped 
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
=> unlike the previous test case (server2 stopped while the meta instance is 
already running), we do not see the single successful search.

After that, the behaviour is the same, i.e. searches on the root node work 
again, but only once a search has been performed using ou=dept1 and ou=dept3.

In addition, the behaviour is slightly different when the "conn-ttl" parameter 
is set to 3 (3 seconds). I can describe that in a follow-up post.
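For reference, the conn-ttl experiment mentioned above only adds one database-level line to the meta configuration (see slapd-meta(5) for the exact semantics):

```
# drop and re-create cached connections to the targets after 3 seconds
conn-ttl 3
```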

Thanks to anyone who can help identify whether this is a misconfiguration or a 
bug.

Michel Gruau

> Message of 19/08/11 13:13
> From: "Michel Gruau" 
> To: "openldap-technical openldap org" 
> Cc: 
> Subject: Slapd-meta stops at the first unreachable candidate
>
> Hello,

I have a slapd-meta configuration as follows: 

database meta
suffix dc=com
uri ldap://server1:389/dc=suffix1,dc=com
uri ldap://server2:389/dc=suffix2,dc=com
uri ldap://server3:389/dc=suffix3,dc=com

I performed numerous tests using base "dc=com" and changing the order of the 
above list of uris (in slapd.conf), and I see that as soon as a candidate 
directory is unreachable, none of the directories listed after the failing one 
is queried by the proxy. For instance, with the configuration above:
- if server2 is down, then server3 is not requested
- if server1 is down, then none of the directories is requested.

I have the feeling this is a bug ... could you confirm?

FYI, I also tried "onerr continue" in the configuration, but it did not change anything.

Thanks in advance.

Michel
