Hello again, Just for the record, I've found the problem that was causing slapd to hang on simultaneous writes. This setting was present on DB_CONFIG:
set_lk_detect DB_LOCK_EXPIRE This was causing deadlocks not being detected, thus making slapd wait forever for operations to complete. 2011/2/4 Diego Lima <[email protected]>: > Hello all, > > I'm running some load tests on an LDAP server using JMeter and as part > of the tests I'm running a continuous loop of users trying to change > their passwords. The problem is, there is a point where OpenLDAP stops > responding requests and is unable to complete its current operations. > This is what each "user" do: > > Bind -> Modify userPassword -> Unbind > > When I spawn a few users on a short time and try to do the modify > operation the server stops responding. I was able to do that by > spawning 5 threads simultaneously and trying to have them change the > user password at once. It seems that something is not handling the > write operations well and is hanging. Slapd doesnt stop responding > completely, as I'm able to send a kill signal to it and it'll tell > that it is waiting for some tasks to end. The complete log follows: > > > Feb 4 13:47:44 Server01 slapd[8772]: @(#) $OpenLDAP: slapd 2.4.23 > (Jan 28 2011 17:07:07) > $#012#011root@Server01:/home/suporte/openldap-2.4.23/servers/slapd > Feb 4 13:47:44 Server01 slapd[8772]: /ldap/etc/slapd.conf: line 108: > rootdn is always granted unlimited privileges. > Feb 4 13:47:44 Server01 slapd[8773]: bdb_db_open: database > "dc=diretorio,dc=mycompany": unclean shutdown detected; attempting > recovery. > Feb 4 13:47:45 Server01 slapd[8773]: slapd starting > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 fd=12 ACCEPT from > IP=192.168.1.2:6098 (IP=0.0.0.0:389) > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 fd=15 ACCEPT from > IP=192.168.1.2:6101 (IP=0.0.0.0:389) > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 fd=13 ACCEPT from > IP=192.168.1.2:6102 (IP=0.0.0.0:389) > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 fd=14 ACCEPT from > IP=192.168.1.2:6099 (IP=0.0.0.0:389) > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 fd=16 ACCEPT from > IP=192.168.1.2:6103 (IP=0.0.0.0:389) > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 op=0 BIND > dn="cn=tcarga_usuario4,ou=usuarios,dc=diretorio,dc=mycompany" > method=128 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 op=0 BIND > dn="cn=tcarga_usuario1,ou=usuarios,dc=diretorio,dc=mycompany" > method=128 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 op=0 BIND > dn="cn=tcarga_usuario3,ou=usuarios,dc=diretorio,dc=mycompany" > method=128 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=0 BIND > dn="cn=tcarga_usuario5,ou=usuarios,dc=diretorio,dc=mycompany" > method=128 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 op=0 BIND > dn="cn=tcarga_usuario1,ou=usuarios,dc=diretorio,dc=mycompany" > mech=SIMPLE ssf=0 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 op=0 BIND > dn="cn=tcarga_usuario2,ou=usuarios,dc=diretorio,dc=mycompany" > method=128 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 op=0 RESULT tag=97 err=0 text= > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 op=0 BIND > dn="cn=tcarga_usuario2,ou=usuarios,dc=diretorio,dc=mycompany" > mech=SIMPLE ssf=0 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 op=0 BIND > dn="cn=tcarga_usuario3,ou=usuarios,dc=diretorio,dc=mycompany" > mech=SIMPLE ssf=0 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 op=0 BIND > dn="cn=tcarga_usuario4,ou=usuarios,dc=diretorio,dc=mycompany" > mech=SIMPLE ssf=0 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 op=0 RESULT tag=97 err=0 text= > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 op=0 RESULT tag=97 err=0 text= > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=0 BIND > dn="cn=tcarga_usuario5,ou=usuarios,dc=diretorio,dc=mycompany" > mech=SIMPLE ssf=0 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 op=0 RESULT tag=97 err=0 text= > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=0 RESULT tag=97 err=0 text= > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=1 MOD > dn="cn=tcarga_usuario5,ou=usuarios,dc=diretorio,dc=mycompany" > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 op=1 MOD > dn="cn=tcarga_usuario2,ou=usuarios,dc=diretorio,dc=mycompany" > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 op=1 MOD > dn="cn=tcarga_usuario4,ou=usuarios,dc=diretorio,dc=mycompany" > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=1 MOD attr=userPassword > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 op=1 MOD > dn="cn=tcarga_usuario3,ou=usuarios,dc=diretorio,dc=mycompany" > Feb 4 13:47:51 Server01 slapd[8773]: conn=1003 op=1 MOD attr=userPassword > Feb 4 13:47:51 Server01 slapd[8773]: conn=1000 op=1 MOD attr=userPassword > Feb 4 13:47:51 Server01 slapd[8773]: slap_queue_csn: queing > 0x40d36450 20110204154751.766634Z#000001#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: slap_queue_csn: queing > 0x426a5450 20110204154751.766634Z#000000#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: slap_queue_csn: queing > 0x436a7450 20110204154751.766652Z#000000#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 op=1 MOD > dn="cn=tcarga_usuario1,ou=usuarios,dc=diretorio,dc=mycompany" > Feb 4 13:47:51 Server01 slapd[8773]: conn=1001 op=1 MOD attr=userPassword > Feb 4 13:47:51 Server01 slapd[8773]: conn=1002 op=1 MOD attr=userPassword > Feb 4 13:47:51 Server01 slapd[8773]: slap_queue_csn: queing > 0x42ea6450 20110204154751.766981Z#000000#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: slap_queue_csn: queing > 0x41ea4450 20110204154751.767126Z#000000#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=1 RESULT tag=103 err=0 > text= > Feb 4 13:47:51 Server01 slapd[8773]: slap_graduate_commit_csn: > removing 0x7f9bc4203f10 20110204154751.766634Z#000000#000#000000 > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 op=2 UNBIND > Feb 4 13:47:51 Server01 slapd[8773]: conn=1004 fd=16 closed > > [ Here I sent sigterm to slapd ] > > Feb 4 13:52:14 Server01 slapd[8773]: daemon: shutdown requested and > initiated. > Feb 4 13:52:14 Server01 slapd[8773]: slapd shutdown: waiting for 4 > operations/tasks to finish > > > The server does not respond anymore and I have to kill the process by > using kill -9. Anyone ever experienced the same or can provide hints > as to what is happening? > > I'm using slapd 2.4.23 and berkeley db 4.6 on a Debian Lenny system. > > > -- > Diego Lima > http://www.diegolima.org > -- Diego Lima http://www.diegolima.org
