Re:Slapd-meta stop at the first unreachable candidate

Michel Gruau Fri, 09 Sep 2011 09:55:03 -0700

I performed many new tests and give several new observations below. As a 
reminder, all of them concern running an ldapsearch on the meta suffix while 
one URI is unreachable.


1/ When network-timeout is not set for the unreachable URI, no entry is 
returned. The only exception is the first request, and only when the URI 
becomes unreachable after the meta instance has started.

2/ When network-timeout is set, the meta always returns entries, but only 
from the URIs which are listed above the unreachable URI in slapd.conf.

3/ When conn-ttl is set, the workaround, which is to perform a search below the 
root node in order to open the channel to each URI, does not work. 
Channels are always lost after the TTL expires.

4/ I saw slightly different behaviours when the unreachable URI is the first one 
in slapd.conf. Most of my earlier tests were performed with an unreachable URI 
which is the second one in slapd.conf.

5/ The behaviour above concerns the case in which the URI contains a valid IP 
address (I mean pingable) but the port is unreachable (the remote LDAP server 
is stopped or the port is wrong). In that case the connect error is returned 
very quickly by the network layers. But there are other cases in which the 
connect error is returned more slowly (for instance the IP address is not 
"pingable", or the IP address is a LB virtual IP address and the remote server 
is unreachable). In these latter cases, the meta behaviour is very different: 
it is quite normal, and entries from all reachable URIs are returned ... once 
the different timers have elapsed. However, this latter case led me to several 
new problems described below.

6/ The timer which is taken into account for the connection is not the 
"network-timeout" parameter but the "timeout" parameter. According to the 
slapd-meta man page, this timeout is intended for LDAP operations and not for 
the network connection step.

7/ The value of the "timeout" parameter is the only one which is taken into 
account, but ... only if the "network-timeout" parameter is set. Otherwise, 
a default (and longer) timeout seems to be applied. The value of 
network-timeout itself is meaningless; it only has to be set, to any value.

8/ The "timeout" value which is used is not the one configured for the 
unreachable URI. After many tests, I discovered it is the maximum value 
among all "timeout" settings in the "database meta" section, whatever the URI 
is!
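
For anyone who wants to reproduce the distinction between a fast connect error 
(port closed on a pingable host) and a slow one (host or virtual IP not 
answering at all), here is a minimal Python sketch that times a single TCP 
connect attempt. It is only a diagnostic aid run outside of slapd, not a 
description of how slapd-meta itself connects. The function name probe and its 
5-second default are my own choices for illustration.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (outcome, elapsed_seconds).

    Hypothetical helper for illustration only: it does not speak LDAP,
    it just shows how the network layer answers a connect attempt.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "open"
    except ConnectionRefusedError:
        # Fast failure: the host answers but nothing listens on the port.
        outcome = "refused"
    except socket.timeout:
        # Slow failure: nothing answered before the timeout elapsed.
        outcome = "timeout"
    except OSError as exc:
        # Any other network error (no route, name resolution failure, ...).
        outcome = "error: %s" % exc
    return outcome, time.monotonic() - start
```

Calling probe("localhost", 1002) while server2 is stopped should report the 
refused case almost immediately, whereas an address that does not answer at 
all only comes back once the given timeout has elapsed. Comparing the elapsed 
time with the delay seen through the meta instance is one way to check which 
timer is really applied.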

Conclusions: all these parameters provide a really powerful way to customize 
the configuration, but few of them work as expected. Could anyone tell me 
whether I can submit all these details as a bug report? Or should I wait 
until someone else can reproduce the problem?

Thanks for all.

Michel

> Message of 05/09/11 16:31
> From: "Michel Gruau" 
> To: "openldap-technical openldap org" 
> Cc: 
> Subject: Re:Slapd-meta stop at the first unreachable candidate
>
> Below is a sample configuration that reproduces the problem: 
> 
> Three openldap data instances configured as follows: 
> include         /opt/openldap/etc/openldap/schema/core.schema
> include         /opt/openldap/etc/openldap/schema/cosine.schema
> include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
> include         /opt/openldap/etc/openldap/schema/nis.schema
> include         /opt/openldap/etc/openldap/schema/dyngroup.schema
> include         /opt/openldap/etc/openldap/schema/misc.schema
> pidfile         /opt/openldap/var/run/server1.pid
> argsfile        /opt/openldap/var/run/server1.args
> loglevel        stats
> database        bdb
> suffix          ou=orgunit,o=gouv,c=fr
> directory       /opt/openldap/var/server1
> 
> Note: server1 is replaced by server2 and server3 for the other instances.
> 
> Each instance contains the following data: (only 4 entries):
> dn: ou=orgunit,o=gouv,c=fr
> objectClass: top
> objectClass: organizationalUnit
> ou: orgunit
> 
> dn: ou=dept1,ou=orgunit,o=gouv,c=fr
> ou: dept1
> objectClass: top
> objectClass: organizationalUnit
> 
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> objectClass: top
> objectClass: person
> objectClass: organizationalPerson
> objectClass: inetOrgPerson
> mail: [email protected]
> cn: User 11
> uid: user11
> givenName: User
> sn: 11
> 
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> objectClass: top
> objectClass: person
> objectClass: organizationalPerson
> objectClass: inetOrgPerson
> mail: [email protected]
> cn: User 12
> uid: user12
> givenName: User
> sn: 12
> 
> Note: user1x and dept1 are substituted in instances 2 and 3 by user2x, dept2, 
> user3x,dept3.
> 
> Data instances are launched using these commands:
> /opt/openldap/libexec/slapd -n server1 -f 
> /opt/openldap/etc/openldap/server1.conf -h ldap://0.0.0.0:1001/
> /opt/openldap/libexec/slapd -n server2 -f 
> /opt/openldap/etc/openldap/server2.conf -h ldap://0.0.0.0:1002/
> /opt/openldap/libexec/slapd -n server3 -f 
> /opt/openldap/etc/openldap/server3.conf -h ldap://0.0.0.0:1003/
> 
> Meta instance is configured as follows: 
> include         /opt/openldap/etc/openldap/schema/core.schema
> include         /opt/openldap/etc/openldap/schema/cosine.schema
> include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
> include         /opt/openldap/etc/openldap/schema/nis.schema
> include         /opt/openldap/etc/openldap/schema/dyngroup.schema
> include         /opt/openldap/etc/openldap/schema/anais.schema
> include         /opt/openldap/etc/openldap/schema/misc.schema
> pidfile         /opt/openldap/var/run/meta.pid
> argsfile        /opt/openldap/var/run/meta.args
> database        meta
> suffix          ou=orgunit,o=gouv,c=fr
> uri ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
> #network-timeout 5
> #timeout 3
> uri ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
> #network-timeout 5
> #timeout 4
> uri ldap://localhost:1003/ou=dept3,ou=orgunit,o=gouv,c=fr
> #network-timeout 5
> #timeout 4
> 
> and it is launched as follows:
> /opt/openldap/libexec/slapd -n meta -f /opt/openldap/etc/openldap/meta.conf 
> -h ldap://0.0.0.0:1000/ -d 256
> 
> # test with the 3 servers up
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
> dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
> => entries from the three servers are returned
> 
> # stop server 2 (kill -INT ...) and perform a new search:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
> => looks good : entries from server1 and server 3 are returned
> 
> Below are the meta instance logs:
> conn=1001 fd=9 ACCEPT from IP=172.30.8.13:55048 (IP=0.0.0.0:1000)
> conn=1001 op=0 BIND dn="" method=128
> conn=1001 op=0 RESULT tag=97 err=0 text=
> conn=1001 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 
> filter="(objectClass=person)"
> conn=1001 op=1 SRCH attr=dn
> conn=1001 op=1 meta_back_retry[1]: retrying URI="ldap://localhost:1002"; DN="".
> conn=1001 op=1 meta_back_retry[1]: meta_back_single_dobind=52
> conn=1001 op=1 SEARCH RESULT tag=101 err=0 nentries=4 text=
> conn=1001 op=2 UNBIND
> conn=1001 fd=9 closed
> => looks good as nentries=4
> 
> # perform numerous new searches without changing anything:
> [root@pp-ae2-proxy2 log]# /opt/openldap/bin/ldapsearch -LLL -x -H 
> ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr  objectclass=person 
> dn  |grep dn:
> => nothing returned
> 
> Below are the corresponding logs:
> conn=1002 fd=9 ACCEPT from IP=172.30.8.13:55049 (IP=0.0.0.0:1000)
> conn=1002 op=0 BIND dn="" method=128
> conn=1002 op=0 RESULT tag=97 err=0 text=
> conn=1002 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 
> filter="(objectClass=person)"
> conn=1002 op=1 SRCH attr=dn
> conn=1002 op=1 meta_search_dobind_init[1]: retrying 
> URI="ldap://localhost:1002"; DN="".
> conn=1002 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
> conn=1002 op=2 UNBIND
> conn=1002 fd=9 closed
> => looks bad as nentries=0
> => Only the first search after server2 is stopped is successful.
> 
> # new search but using server1 ou:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=dept1,ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> => looks good
> 
> # same search as earlier i.e. using root node:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> => Looks good also. It looks like everything is OK, but only once a channel 
> has been opened to server1 in another way
> 
> # new search but using server3 base object:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=dept3,ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
> => looks good
> 
> # new search but using slapd-meta base object:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
> => entries from server1 and server3 are returned
> => this confirms lookups in server1 and server3 are not performed until a 
> channel is opened to both of them using their respective base object
> => another strange behaviour: if the search using the server3 ou is performed 
> before the search using the server1 ou, then the next search attempt using 
> the root node retrieves entries from both server1 and server3 ...
> 
> # new search after server2 restart: 
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
> dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
> dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
> dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
> dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
> => good, all entries are returned
> 
> # new search after meta instance restart while server2 is already stopped:
> /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b 
> ou=orgunit,o=gouv,c=fr  objectclass=person dn  |grep dn:
> => unlike the previous test case (server2 stopped while the meta instance is 
> already running) we do not see the single successful search.
> 
> After that, the behaviour is the same, i.e. a search on the root node works 
> again, but only once a search has been performed using ou=dept1 and ou=dept3.
> 
> In addition, the behaviour is slightly different when the "conn-ttl" parameter 
> is set to 3 (3 seconds). I could describe it in a new post.
> 
> Thanks to anyone who can help identify whether this is a misconfiguration 
> or a bug.
> 
> Michel Gruau
> 
> > Message of 19/08/11 13:13
> > From: "Michel Gruau" 
> > To: "openldap-technical openldap org" 
> > Cc: 
> > Subject: Slapd-meta stop at the first unreachable candidate
> >
> > Hello,
> 
> I have a slapd-meta configuration as follows: 
> 
> database meta
> suffix dc=com
> uri ldap://server1:389/dc=suffix1,dc=com
> uri ldap://server2:389/dc=suffix2,dc=com
> uri ldap://server3:389/dc=suffix3,dc=com
> 
> I performed numerous tests using "base=com" and changing the order of the 
> above list of URIs (in slapd.conf), and I see that as soon as a candidate 
> directory is unreachable, all other directories listed below the failing 
> directory are not queried by the proxy. For instance, in the example above:
> - if server2 is down, then server3 is not queried
> - if server1 is down, then none of the directories is queried.
> 
> I have the feeling this is a bug ... could you confirm?
> 
> FYI, I also tried the "onerr continue" config, but it did not change anything.
> 
> Thanks in advance.
> 
> Michel
> 