Liam Gretton wrote:
> On 23/08/2012 11:18, Howard Chu wrote:
> > Your description of your procedure is so vague and imprecise it's
> > difficult for anybody to decipher what you're talking about.
> >
> > Reading back thru the several posts in this thread, what I see you
> > saying is that you have tested a few different configurations:
> >
> > 1) target host is up, target LDAP server is down
> >      this should fail immediately because the host OS will
> >      immediately send a TCP Connection Refused response
> >
> > 2) target host is initially down
> >      this will not fail until the first TCP connect request times
> >      out
> >
> > 3) target host is initially up and connected, but thru your
> >    iptables manipulation you sever the link
> >      this will not fail until the TCP connection times out, which
> >      it won't unless you're using TCP Keepalives, and by default
> >      those are only sent once every 2 hours.
>
> Let me make it less vague then.
>
> What I've been trying to simulate are the various modes by which a
> uri target will become unavailable. What I'm trying to achieve is to
> have the meta backend point to four domain controllers and cope with
> one or more DCs being unavailable.
>
> Having gone through this and let the system time out each time, I've
> found it does fail over under one of the conditions listed below, but
> it takes about 15 minutes to do so.
>
>
> Scenarios:
>
> 1. slapd starts, first target is unreachable;
>
> 2. slapd starts, first target is reachable but has no service
>    running;
>
> 3. slapd already running, first target up and connected then later
>    becomes unreachable.
>
>
> Simulations:
>
> a. 'Unreachable' simulated by blocking outbound access with the
>    following iptables rule:
>
>    iptables -A OUTPUT -d host1 -j DROP
>
> b. 'Unreachable' simulated making the first target a host that is up
>    but with no service running.
>
>
> Results (all with 2.4.32):
>
> Case 1a: slapd retries host1 continuously and times out after about
> 180s. No attempt is made to contact additional targets.
>
> Case 2b: slapd retries host1 continuously and times out after about
> 180s. No attempt is made to contact additional targets.
>
> Case 3a: slapd retries host1 continuously, doubling an internal
> timeout value each time, eventually timing out after 19 retries and
> about 15m. It does then fall through to host2 and subsequent
> connections don't attempt to contact host1.
>
> Here's my config. I've also tried setting nretries explicitly to 3,
> but it makes no difference.
>
>
> database        meta
> suffix          dc=local
> rootdn          cn=administrator,dc=local
> rootpw          secret
>
> network-timeout 1
>
> uri             ldap://host1:3268/ou=dc1,dc=local
>                 ldap://host2:3268/
>                 ldap://host3:3268/
>
> suffixmassage   "ou=dc1,dc=local" "dc=example,dc=com"
>
> idassert-bind   bindmethod=simple
>                 binddn="cn=proxyuser,dc=example,dc=com"
>                 credentials="password"
>
> idassert-authzfrom "dn.exact:cn=administrator,dc=local"
>
>
> These results suggest to me that network-timeout and nretries (which
> should default to 3) don't work as documented.

I am really not astonished by your results. Run your tests again, but
use REJECT as the iptables target. DROP means that you never get an
answer, so the connection attempt can only end by timing out.

> Having said that, it does seem to at least cope with scenario 3,
> albeit with a long timeout.
>
> Ideally it'd work in all cases. Pierangelo says the failover works
> when connect() times out, but I'd have thought that would include
> scenarios 1 and 2 but not 3.

--
Harry Jede

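The difference between a refused connection attempt and one that is
silently dropped can be observed from the client side with a small
probe. What follows is only a sketch in Python and says nothing about
what slapd does internally: the names probe and unused_local_port are
made up for this illustration, and the 1-second default merely mirrors
the network-timeout value in the quoted config. A port with no listener
on a host that is up should be refused almost at once, whereas a
destination behind a DROP rule would be expected to use up the whole
timeout.

```python
import errno
import socket
import time


def probe(host, port, timeout=1.0):
    """Try one TCP connect; return (outcome, seconds it took)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The attempt was actively refused, so connect() fails at once.
        outcome = "refused"
    except socket.timeout:
        # No answer came back at all; we waited for the full timeout.
        outcome = "timeout"
    except OSError as exc:
        # Anything else (unreachable network, name lookup failure, ...).
        outcome = "error: " + errno.errorcode.get(exc.errno, str(exc))
    return outcome, time.monotonic() - start


def unused_local_port():
    """Hypothetical helper: find a loopback port with no listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))      # let the OS pick a free port
        return s.getsockname()[1]     # socket is closed on exit


if __name__ == "__main__":
    # Host is up, nothing listens on the port: Howard's case 1.
    print(probe("127.0.0.1", unused_local_port()))
```

To compare the two iptables targets from the thread, one could call
probe("host1", 3268) once with the DROP rule in place and once with the
same rule using -j REJECT, and look at both the outcome and the elapsed
time.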