I am investigating an LDAP server failure where the LDAP server stops working at random times and causes all kinds of problems on the clients. There are more than 300 clients on the site.

I remember Jose mentioning problems with slapd running out of file descriptors in Spain, and started by investigating whether this was the problem here too. I added two munin plugins, one to monitor the number of open files in slapd and one to check whether slapd is replying. The plugins are included below.

A Google search led me to <URL: http://www.openldap.org/lists/openldap-technical/201106/msg00031.html >, which reports a similar failure and mentions that adding an idle timeout value might solve it. I'm currently trying "idletimeout 60" to see if it solves the problem. It would do so by making sure idle clients are disconnected after a while, so that they do not accumulate file descriptors for the lifetime of the LDAP server.

If I got the details right, a Linux process can by default only have 1024 files open. This can be adjusted using 'ulimit -n', but I am not sure if it will work with select() calls because of a hardcoded constant in the system header files. With 4 LDAP connections created by nslcd on each client, that would make 256 (1024 / 4) the upper limit on the number of clients. This is not really an acceptable limit on scalability, and any site with more clients would see random LDAP failures now and then.

Has anyone else seen similar problems? Is there any reason not to include "idletimeout 60" or some other sensible timeout in our default slapd.conf?

============ /etc/munin/plugins/open_slapd_files =======
#!/bin/sh
#
# Plugin to monitor the number of open files in slapd
#
# Parameters:
#
#       config   (required)
#       autoconf (optional - used by munin-config)
#
# Magic markers (Used by munin-config and some installation scripts.
# Optional):
#
#%# family=auto
#%# capabilities=autoconf

# -s makes pidof return a single PID even if several slapd
# processes are running.
pid=$(pidof -s slapd)

if [ "$1" = "autoconf" ]; then
    if [ "$pid" ]; then
        echo yes
        exit 0
    else
        echo no
        exit 1
    fi
fi

if [ "$1" = "config" ]; then
    echo 'graph_title LDAP server slapd file table usage'
    echo 'graph_args --base 1000 -l 0'
    echo 'graph_vlabel number of open files'
    echo 'graph_category system'
    echo 'graph_info This graph monitors the slapd open files table.'
    echo 'used.label open files'
    echo 'used.info The number of currently open files.'
    exit 0
fi

# Report U (unknown) when slapd is not running or its fd directory
# is not readable. Reading /proc/<pid>/fd of another user's process
# normally requires the plugin to run as root.
if [ "$pid" ] && [ -r "/proc/$pid/fd" ]; then
    printf "used.value %s\n" "$(ls "/proc/$pid/fd" | wc -l)"
else
    echo "used.value U"
fi
=========== /etc/munin/plugins/open_slapd_working ======
#!/bin/sh
#
# Plugin to monitor whether the slapd LDAP server is replying
#
# Parameters:
#
#       config   (required)
#       autoconf (optional - used by munin-config)
#
# Magic markers (Used by munin-config and some installation scripts.
# Optional):
#
#%# family=auto
#%# capabilities=autoconf

if [ "$1" = "autoconf" ]; then
    # No precondition is checked; the plugin is always offered.
    echo yes
    exit 0
fi

if [ "$1" = "config" ]; then
    echo 'graph_title LDAP server replying'
    echo 'graph_args --base 1000 -l 0'
    echo 'graph_vlabel true 1 or false 0'
    echo 'graph_category system'
    echo 'graph_info This graph shows whether the slapd server replies.'
    echo 'working.label working'
    echo 'working.info Is the LDAP server working.'
    exit 0
fi

ldapserver=ldap

# Anonymous search of the root DSE with a 3 second time limit.
if ldapsearch -l 3 -LLL -h "$ldapserver" -x -b '' -s base > /dev/null 2>&1 ; then
    printf "working.value 1.0\n"
else
    printf "working.value 0.0\n"
fi
========================================================
--
Happy hacking
Petter Reinholdtsen
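
PS: The client limit arithmetic above can be sketched as a small shell function. This is only a rough estimate under stated assumptions: every client holds a fixed number of connections open, and the descriptors slapd needs for itself (listening sockets, database files, logs) are ignored, so the real limit will be somewhat lower. The name max_clients is a hypothetical helper made up for illustration; it is not part of slapd, nslcd or munin.

```shell
#!/bin/sh
# Estimate how many clients fit under a file descriptor limit.
# max_clients is a hypothetical helper, not an existing tool.
#   $1 = file descriptor limit of the process (e.g. from 'ulimit -n')
#   $2 = LDAP connections held open per client (4 for nslcd, as above)
max_clients() {
    echo $(( $1 / $2 ))
}

# Default limit of 1024 descriptors, 4 connections per client.
max_clients 1024 4    # prints 256
```

Calling it with a higher limit, for example 'max_clients 4096 4', shows how far raising 'ulimit -n' alone would move the ceiling, provided the select() concern mentioned above does not apply.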

