I work at a place with a fairly large OpenLDAP setup (2.3.32). We have 3
large read servers: Dell R900, 32 GB RAM, 2x quad-core, hardware RAID1 disks
for the LDAP volume. The entire database takes about 13 GB of physical disk
space (the BDB files) and has a few million entries. DB_CONFIG is configured
to hold the entire DB in memory (for speed), and the slapd.conf cachesize is
set to a million entries, to make the most effective use of the 32 GB of RAM
these boxes have. We have them behind an F5 BigIP hardware load balancer
(6400 series), and find that during peak times of the day we get "connection
deferred: binding" in our slapd logs (loglevel set to "none", which is a
misnomer), and a client request (or series of them) fails. If we use
round-robin DNS instead, we rarely see those errors. CPU usage is low, even
during peak times, hovering at 20-50% of 1 core (the other 7 are idle).
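
For reference, a minimal sketch of the kind of caching setup we mean (the
exact numbers here are illustrative, not our production values):

    # DB_CONFIG (in the BDB database directory) -- illustrative values
    # ~14 GB BDB cache so the entire ~13 GB database fits in memory
    set_cachesize   14 0 1

    # slapd.conf (database section)
    cachesize       1000000   # entry cache: one million entries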

The interesting thing is that it seems to happen (the "connection deferred:
binding") only after a certain load threshold is reached (the busiest time of
day), and only when behind the F5s. I suspect it might be the
"conn_max_pending" or "conn_max_pending_auth" defaults (100 and 1000
respectively), since behind the F5 all the connections appear to come from
the F5's addresses, whereas with round-robin DNS they come from a wide range
of sources (each of the client servers, well over 100 of them).
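
If that theory holds, the obvious experiment would be to raise those limits
in the global section of slapd.conf; the values below are guesses to test
with, not tuned recommendations:

    # slapd.conf (global section) -- experimental values, not tuned
    conn_max_pending       1000    # default 100 (anonymous sessions)
    conn_max_pending_auth  10000   # default 1000 (authenticated sessions)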

We had tried experimenting with a higher number of threads previously, but
that didn't seem to have a positive effect. Can any OpenLDAP gurus suggest
some things to set or look for, e.g. a higher number of threads, or higher
values for conn_max_pending and conn_max_pending_auth?
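
For the thread experiment, what we tried was along these lines (the number
is illustrative, and in our case it made no visible difference):

    # slapd.conf (global section) -- default is 16
    threads  32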

Any ideas on what the theoretical performance limit of a machine of this
caliber should be? i.e., how many reqs/sec, how far it will scale, etc.


We have plans to upgrade to 2.4, but it's a "down the road" item, and mgmt is
demanding answers to "how far can this design scale as it is?"...

Thanks!

-- David J. Andruczyk



      
