[email protected] wrote:
> On 10/15/2013 01:10 PM, [email protected] wrote:
>> It is not the client loop that is multithreading but the ldap server.
>>
>> And it is not a misuse of the API but a problem that may be raised by
>> day to day network problems.
>>
>> I've boiled down the problem to a few simple configurations that work (or
>> better, fail ;-) with both 2.4.23 and 2.4.36. A tgz file containing a setup
>> with start script and testclient is attached. It should be sufficient to
>> reproduce the fault.
>>
>> The problem occurs only if we use session variable substitution in the rwm
>> overlay, and only if a search is *immediately* (e.g. caused by network loss
>> and client timeout) followed by an unbind.
>>
>
> I modified the reproducer a bit (the start script) and found out a few things.
> You can find the reproducer I'm using at [1].
>
> Valgrind's helgrind shows some lock problems in the rwm overlay and also in
> back-ldap and connection.c. After correcting those the issue seems to be gone.
>
> You can find helgrind logs at [2] (before the fix) and [3] (after).
>
> Also, ElectricFence reveals some problems [4], which I didn't fix yet.
>
> A fix attempt can be found at [5]. I'm not sure if that is a correct fix, or
> it just masked the real issue. But I didn't manage to reproduce the problem
> after applying it.

I already explained the problem. The other issues you identified are not
relevant, and your patch is not correct. Reread Followup #4 of this ITS.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
