There is a discussion of Solaris thread and socket read/write locking issues, along with some workarounds, at: http://omniorb.sourceforge.net/omni40/omnithread.html
See "6 Threaded I/O shutdown for Unix"

Cheers
Brett

On 08/01/2011, at 9:55 AM, Howard Chu <[email protected]> wrote:

> Doug Leavitt wrote:
>>
>> On 01/ 7/11 08:01 AM, Rein Tollevik wrote:
>>> On 06.01.11 22.48, Quanah Gibson-Mount wrote:
>>>> --On Thursday, January 06, 2011 7:40 PM +0100 Rein Tollevik
>>>> <[email protected]> wrote:
>>>>
>>>>> On 04.01.11 23.34, Quanah Gibson-Mount wrote:
>>>>>> Please test RE24 heavily.
>>>>>
>>>>> test039 deadlocks for me on 64bit solaris10, both x86 and sparc :-( It
>>>>> hangs in the monitor, triggered by the new swamp -SS option added to
>>>>> slapd-tester. It works if run with -S or -SSS. It is the third server
>>>>> that hangs, and it does so quite consistently with the same stack trace
>>>>> every time. A gdb trace is at:
>>>>>
>>>>> ftp://ftp.openldap.org/incoming/rein-test039-gdb-trace.txt
>>>>
>>>> Does this happen on both HEAD and RE24, or RE24 only?
>>>
>>> Both, as well as when running the head tests suite with the 2.4.23
>>> release. Looks as if the swamp additions have tripped into an
>>> existing problem, not anything new. Leave it out of RE24 until it
>>> has been resolved?
>>>
>>> Btw, any other Solaris test runs out there? I'd like to know if it is
>>> a real Solaris problem or just me..
>
> I'm seeing a similar failure on 32 bit Sparc Solaris 10. But it actually
> locks up in test036 for me, I never get as far as test039. The gdb trace
> looks much the same as what you posted.
>
> Looks like for some reason threads that are blocked waiting for their sockets
> to become writable are never getting woken up. A regular SIGINT shuts down
> slapd cleanly so it doesn't appear to be a problem with the condvars being
> used to manage the threads. That kinda points to select() simply not
> returning the writable status.
>
> I haven't used this Solaris machine much, but in fact (looking at the
> remnants of other files in my source tree on this box) this appears to have
> been a problem since at least last August. (I.e., it looks like I was
> investigating this same problem back then but dropped it and never got back
> to it.)
>
>>> Rein
>
>> I'm currently testing Solaris11 (Nevada) and not seeing any issues in
>> either 32 or 64
>> bit builds using both RE24 and HEAD. I have not had any failures on
>> x86 yet.
>> Testing is still underway for sparc and other internal system testing on
>> both platforms.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
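
For anyone wanting to check whether select() reports writability correctly on a given box, here is a rough, standalone sketch of the pattern Howard describes: a writer fills a socket's send buffer, waits in select() for the socket to become writable again, and should be woken once the peer drains the data. This is not slapd code. The helper names (wait_writable, fill_send_buffer, drain) are made up for illustration, and it uses a local socketpair rather than a real LDAP connection.

```python
import select
import socket


def wait_writable(sock, timeout):
    """Return True if select() reports sock writable within timeout seconds."""
    _, writable, _ = select.select([], [sock], [], timeout)
    return bool(writable)


def fill_send_buffer(sock):
    """Send until the kernel refuses more data; return the bytes queued."""
    sock.setblocking(False)
    chunk = b"x" * 65536
    total = 0
    try:
        while True:
            total += sock.send(chunk)
    except BlockingIOError:
        # Send buffer is full: a blocking writer would now have to wait.
        pass
    return total


def drain(sock):
    """Read everything currently queued on sock; return the bytes read."""
    sock.setblocking(False)
    total = 0
    try:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            total += len(data)
    except BlockingIOError:
        # Nothing left to read right now.
        pass
    return total
```

The idea is to call fill_send_buffer() on one end of a socket.socketpair(), confirm wait_writable() times out, then drain() the other end and confirm wait_writable() now returns True. Passing a finite timeout means a missed wakeup shows up as a False return instead of a thread that hangs forever, which is roughly the symptom in the gdb traces.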
