Hi, I've recently been trying to hunt down some odd performance problems with
our installation of 389 LDAP (currently 1.3.2.19, but we've been following
recent Debian unstable).  We've been seeing long delays (tens of seconds at
times) handling even the simplest new bind()s while the server otherwise has
idle worker threads (and other non-idle worker threads servicing existing
connections).

Upon grabbing some userland thread stacks during these "hangs", when no new
external connections could be established, I saw what looked to be the thread
associated with slapd_daemon() in ldap/servers/slapd/daemon.c hung up in
setup_pr_read_pds(), walking the list of active connections and acquiring each
connection lock (c->c_mutex) sequentially in the process.  I stuck some calls
to clock_gettime() around the PR_Lock(c->c_mutex) call at or about
ldap/servers/slapd/daemon.c:1690 and warned when we waited for more than a set
duration:


[22/Jul/2014:17:37:05 +0000] - setup_pr_read_pds: (fd=192) waited 995.375473 msecs for lock
[22/Jul/2014:17:37:08 +0000] - setup_pr_read_pds: (fd=202) waited 3003.548263 msecs for lock
[22/Jul/2014:17:37:10 +0000] - setup_pr_read_pds: (fd=181) waited 1997.828897 msecs for lock

<up to 20-30 seconds in some extreme cases>

It looks like this could hang for up to CONN_TURBO_TIMEOUT_INTERVAL (default 1
second) per thread in turbo mode (up to 50% of the worker pool by default).
While stuck there, the daemon thread isn't calling handle_listeners() to pull
new connections off of the well-known port.

Perhaps handle_listeners() should run off in its own thread, away from this
connection maintenance?  (Or, if it must stay there, use a non-blocking
PRP_TryLock() or somesuch?)
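
To make the second idea concrete, here is a rough sketch of a non-blocking
walk over the connection list. Everything in it is an assumption for
illustration: struct conn and collect_pollable_fds() are hypothetical, and
pthread_mutex_trylock() stands in for an NSPR try-lock, so this only shows the
shape of the idea and not how setup_pr_read_pds() would actually be changed.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the server's connection structure. */
struct conn {
    pthread_mutex_t mutex; /* plays the role of c->c_mutex */
    int fd;
    struct conn *next;
};

/*
 * Walk the connection list without ever blocking on a connection lock.
 * Connections whose lock is free have their fd copied into fds[];
 * connections whose lock is busy are skipped and counted in *skipped.
 * Returns the number of fds collected.
 */
static size_t collect_pollable_fds(struct conn *head, int *fds,
                                   size_t max_fds, size_t *skipped)
{
    size_t n = 0;

    *skipped = 0;
    for (struct conn *c = head; c != NULL && n < max_fds; c = c->next) {
        if (pthread_mutex_trylock(&c->mutex) != 0) {
            /* Lock is held elsewhere: leave this connection for a later pass. */
            (*skipped)++;
            continue;
        }
        fds[n++] = c->fd;
        pthread_mutex_unlock(&c->mutex);
    }
    return n;
}
```

The trade-off is that a skipped connection is not polled until a later pass,
but the loop always finishes promptly, so accepting new connections would never
wait behind a busy connection lock.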


TIA

Thomas
--
389 users mailing list
[email protected]
https://admin.fedoraproject.org/mailman/listinfo/389-users
