"...now and then slapd just stops and always without any 
 traces in the logfiles. Sometimes three times a day, sometimes a week
 without a failure. I can't find a pattern or any relation to any other
 service on the Linux server."

 Is your LDAP environment installed on a VM or on a bare-metal box?


 Regards, Kuba


----- Original Message -----
From: Ruud Baart
Sent: 27/02/11 12:57 PM
To: [email protected]
Subject: Problem unexpected failing slapd

 Problem: For a customer we have used LDAP for many years. Last year the
slapd service suddenly stopped without leaving any trace in the logfiles.
After a restart of slapd everything worked fine again. But it was not a
one-off incident: now and then slapd just stops, always without any trace in
the logfiles. Sometimes it happens three times a day, sometimes a week passes
without a failure. I can't find a pattern or any relation to any other
service on the Linux server.

Environment:
- Several servers running Debian squeeze and several Windows servers. We use
  the bdb database backend.
- There is one master LDAP server, which provides syncprov, and two replica
  LDAP servers (syncrepl). The master is the most heavily used (mainly by
  Samba as primary domain controller: a few hundred user accounts, many
  group accounts, workstations, ACLs, etc.). One of the replicas is not very
  busy but handles the mail for all users (lookups: amavis, postfix,
  courier-imap, mail account settings, etc.). The third server (the second
  replica) is not busy at all; it is at a remote location.
- The whole directory holds 3700 DNs; slapcat produces a file of 7.3 MB.
- Only the master LDAP server stops suddenly. I have never seen a replica
  fail.

Because I have no clear idea about the problem, I don't know which technical
details are relevant.

DB_CONFIG
===========
set_cachesize 0 10485760 1
set_lk_max_objects 10000
set_lk_max_locks 10000
set_lk_max_lockers 10000
set_lg_dir /home/ldap-dbd

The database is stored on an ext3 filesystem, kernel 2.6.32. The server
itself has no problems: plenty of memory and a fast disk array (SAS->SATA).
We have never had technical problems with this server, and it worked without
problems for a long period. Nothing has changed in the environment or the
LDAP setup (except of course the upgrade to Debian squeeze, but the problem
was already there before that).

What we have tried:
- Upgrading from OpenLDAP 2.4.17 (Debian lenny + backports) to OpenLDAP
  2.4.23 (Debian squeeze). I saw in the release notes that problems related
  to syncrepl had been solved, so we waited for version 2.4.23 to become
  available in Debian. This upgrade made no difference.
- Reindexing and rebuilding the directory. When I rebuild the directory from
  a clean LDIF file with ldapadd, on the master or on another machine, there
  is not a single error or warning.

The workaround for the moment: I have written a process monitor (a Perl
daemon) which watches the slapd daemon and restarts slapd if it suddenly
stops. It is of course not a solution, but the 300 users can work. If slapd
stops and is not restarted within 1 minute, a few hundred people can't work
because Samba stops working.

I would like to receive suggestions on what we can do to find the problem.
Because there is no pattern and nothing in the logfiles, I don't know where
to start.

-- 
Regards, Ruud Baart
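
For anyone who needs the same stopgap: the process monitor described in the message above could be sketched roughly as follows. This is not Ruud's Perl daemon, only a minimal Python illustration of the idea. The pidfile path, the init script path and the 10 second interval are assumptions (typical for a Debian install, but check your own system), and the function names are made up for this example. It assumes a POSIX system.

```python
import os
import subprocess
import time


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or invalid."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_once(pidfile, restart_cmd, run=subprocess.call, log=print):
    """Restart the daemon if it is not running. Return True if restarted."""
    pid = read_pid(pidfile)
    if pid is not None and pid_alive(pid):
        return False
    # Log the time of every restart so failures can later be correlated
    # with other events on the server.
    log(f"{time.ctime()}: slapd not running (pid={pid}), restarting")
    run(list(restart_cmd))
    return True


def watch(pidfile="/var/run/slapd/slapd.pid",          # assumed path
          restart_cmd=("/etc/init.d/slapd", "start"),  # assumed command
          interval=10):                                # assumed interval
    """Poll forever, restarting slapd whenever it has disappeared."""
    while True:
        check_once(pidfile, restart_cmd)
        time.sleep(interval)
```

Calling watch() as root starts the polling loop. Like the original workaround, this only hides the symptom. The one thing it adds is a timestamp per restart, which may help in looking for a pattern.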
