Hello.

slapd has been dying more and more frequently lately: from about once a
week last month to 3 to 5 times a day now.
Running "/etc/init.d/slapd start" brings it back instantly, for the next
few hours.
Exactly the same web application that uses the LDAP database had been
running as-is for about one year without this problem. The server machine
has not been changed, replaced or touched.

What we have tried so far, in this order:

   1. Ran a server monitor to confirm that the server load is not high
      (below 0.5) when slapd dies.
   2. Upgraded to 2.4.11-1+lenny2 (on Debian).
   3. Ran slapcat on the most-used database (hdb) and loaded the entries
      back in with slapadd.
   4. Did the same for the other database (bdb).
   5. Followed the log at log level 256 (connections, operations and
      results) and found no clue. For example, one time the last
      messages were:

      Oct 25 16:06:59 www slapd[11969]: conn=26289 fd=29 ACCEPT from IP=**.**.**.**:56539 (IP=0.0.0.0:389)
      Oct 25 16:06:59 www slapd[11969]: conn=26289 op=0 BIND dn="cn=admin,dc=*******" method=128
      Oct 25 16:06:59 www slapd[11969]: conn=26289 op=0 BIND dn="cn=admin,dc=*******" mech=SIMPLE ssf=0
      Oct 25 16:06:59 www slapd[11969]: conn=26289 op=0 RESULT tag=97 err=0 text=
      Oct 25 16:06:59 www slapd[11969]: conn=26288 op=8 UNBIND
      Oct 25 16:06:59 www slapd[11969]: conn=26288 fd=41 closed
      Oct 25 16:06:59 www slapd[11969]: conn=26289 op=1 UNBIND
      Oct 25 16:06:59 www slapd[11969]: conn=26289 fd=29 closed

      Another time they were:

      Oct 25 16:27:09 www slapd[25691]: conn=2750 fd=51 ACCEPT from IP=**.**.**.**:54846 (IP=0.0.0.0:389)
      Oct 25 16:27:09 www slapd[25691]: conn=2750 op=1 SRCH base="ou=contacts,ou=china,dc=*******" scope=2 deref=0 filter="(uidNumber=7762)"
      Oct 25 16:27:10 www slapd[25691]: conn=2750 op=1 SRCH attr=o mail telephonenumber contactperson c st l street postalcode postofficebox facsimiletelephonenumber labeleduri businesscategory description pnglogo changetime lastrecapdate objectclass category

   6. Followed the log at log level 4 (heavy trace debugging) and
      found no clue. For example, one time the last messages were:

      Oct 25 22:23:48 www slapd[723]: connection_get(50) 
      Oct 25 22:23:48 www slapd[723]: SRCH "ou=contacts,ou=china,dc=*******" 2 0
      Oct 25 22:23:48 www slapd[723]:     0 0 0 
      Oct 25 22:23:48 www slapd[723]:     filter: (uidNumber=2) 
      Oct 25 22:23:48 www slapd[723]:     attrs:
      Oct 25 22:23:49 www slapd[723]:  
      Oct 25 22:23:49 www slapd[723]: connection_get(56) 
      Oct 25 22:23:49 www slapd[723]: SRCH "ou=contacts,ou=china,dc=*******" 2 0
      Oct 25 22:23:49 www slapd[723]:     0 0 0 
      Oct 25 22:23:49 www slapd[723]:     filter: (uidNumber=2) 
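
The "last messages" in steps 5 and 6 were picked out by hand. A minimal
sketch, in Python, of how they could be collected for every slapd process
at once is below. It assumes standard syslog lines in the same shape as
the excerpts above; the function name last_lines_per_pid and the keep
parameter are made up for this illustration and are not part of OpenLDAP.
Lines that do not match that shape are skipped.

```python
import re
from collections import defaultdict, deque

# Matches syslog lines shaped like the excerpts above, e.g.
#   Oct 25 16:06:59 www slapd[11969]: conn=26289 fd=29 closed
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+slapd\[(\d+)\]:\s?(.*)$"
)

def last_lines_per_pid(lines, keep=5):
    """Return {pid: [(timestamp, message), ...]} holding only the last
    `keep` messages logged by each slapd PID, in first-seen PID order."""
    tails = defaultdict(lambda: deque(maxlen=keep))
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a slapd syslog line in the assumed format
        timestamp, pid, message = match.groups()
        tails[int(pid)].append((timestamp, message))
    return {pid: list(buf) for pid, buf in tails.items()}
```

Each restart gives slapd a new PID, so the tail kept for every PID except
the newest one is what that process logged just before it died. To try it
on real data, pass an open log file, for example
last_lines_per_pid(open("/var/log/syslog")), where the path is an
assumption about where syslog writes on this machine.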


Help, hints and pointers to the specific documentation we should read are
highly appreciated. We are also willing to *pay* for help from someone who
can log in remotely to solve this problem (please email me and my
colleague on the 'cc' with a quote). The problem has exhausted us all.

Thanks in advance!

Zhang Weiwu
