Jim Davidson wrote:
>
> Howdy,
>
> The leak is more precisely bloat, i.e., Tcl interps that create more
> and more objects and [...] The memory is not
> actually "returned" via a munmap, e.g.. Nate Folkman and I spent
> tons of time trying to figure this out years ago with all sorts of
> coalescing, unmapping, etc. fun but failed. It's possible other
> memory allocators have the same or better performance in speed and
> space today.

Things may be a bit better today, as (I believe) the core Tcl allocator has a separate pool specifically for Tcl_Objs, which are an extremely common small allocation. I think the 'vtamalloc' allocator by Zoran Vasiljevec (?) was specifically written to munmap/deallocate memory so that it could be reclaimed by the system and keep the high-water mark down. It requires you to build Tcl with nonstandard defines though, so you don't get the standard Tcl threaded allocator, which I think is a direct derivative of zippy.

> Anyway, what's the race condition? Curious about that one.

I'm going off memory, because looking at the code it seems that it shouldn't happen, but it was happening, and my change fixed what I was seeing.

The problem is that a pthreads thread starts to run immediately upon creation. If maxconns was set too low, the conn thread could run and exit, adding itself to the list of threads to be reaped, before Ns_ThreadCreate (which is more or less a thin wrapper around pthread_create) ran to completion. The next thread to be created would then reap the dead threads list, but the thread id had never been written in Ns_ThreadCreate. pthread_join would be called with a null tid, and the server would segfault. I could only reproduce the error with maxconns less than about 10, and running a benchmark like 'ab' with a fast request like a fastpath file.

I "fixed" this by passing the Ns_Thread* passed to Ns_ThreadCreate directly through to pthread_create. On Linux at least, the tid is written to the pthread_t in pthread_create before the new thread starts running, but POSIX offers no such guarantee, so my "fix" might not work on Solaris or elsewhere (I only dug into so many library files).

I'm confusing myself now though, because it certainly looks like the threads are only ever reaped by the driver thread, which should absolutely finish Ns_ThreadCreate before it can call it again.
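
For illustration only, here is a minimal C sketch of one portable way to close the kind of window described above. It does not rely on pthread_create storing the tid before the new thread runs. This is not the AOLserver code and not the change described in this message: thread_rec, create_tracked, reap and run_rounds are hypothetical names. The assumption is that the creator holds a lock across pthread_create and the store of the tid, and that the new thread takes the same lock before it announces itself as reapable.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-thread record, standing in for an entry on a reap list. */
typedef struct {
    pthread_t tid;     /* stored by the creator after pthread_create returns */
    int       tid_set; /* nonzero once tid is valid */
    int       saw_tid; /* what the worker observed when it "exited" */
} thread_rec;

static pthread_mutex_t create_lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker: may start running before pthread_create returns in the creator,
 * so it takes the lock before marking itself as ready to be reaped. */
void *worker(void *arg)
{
    thread_rec *rec = arg;

    pthread_mutex_lock(&create_lock);   /* waits until the creator has stored tid */
    rec->saw_tid = rec->tid_set;        /* stand-in for "add myself to the reap list" */
    pthread_mutex_unlock(&create_lock);
    return NULL;
}

/* Creator: hold the lock across pthread_create and the tid store. */
int create_tracked(thread_rec *rec)
{
    pthread_t tid;
    int err;

    assert(rec != NULL);
    pthread_mutex_lock(&create_lock);
    err = pthread_create(&tid, NULL, worker, rec);
    if (err == 0) {
        rec->tid = tid;
        rec->tid_set = 1;
    }
    pthread_mutex_unlock(&create_lock);
    return err;
}

/* Reaper: refuse to join a record whose tid was never stored. */
int reap(thread_rec *rec)
{
    int set;

    assert(rec != NULL);
    pthread_mutex_lock(&create_lock);
    set = rec->tid_set;
    pthread_mutex_unlock(&create_lock);
    return set ? pthread_join(rec->tid, NULL) : -1;
}

/* Create and reap n short-lived threads; return how many of them saw a
 * valid tid at the moment they marked themselves reapable, or -1 on error. */
int run_rounds(int n)
{
    int i, ok = 0;

    for (i = 0; i < n; i++) {
        thread_rec rec;
        memset(&rec, 0, sizeof rec);
        if (create_tracked(&rec) != 0 || reap(&rec) != 0) {
            return -1;
        }
        ok += rec.saw_tid;
    }
    return ok;
}
```

Calling run_rounds(n) from a small main and checking that it returns n exercises the handoff. The design choice here is simply to make the ordering explicit with a mutex instead of depending on when a particular libc writes the pthread_t.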

Conn threads create a new thread to replace themselves when they hit maxconns, but they don't reap at that time, so it should be ok. Either way, I had easily reproducible segfaults, and tweaking the thread code eliminated them, so I'm pretty sure I saw something real :)

-J

_______________________________________________
aolserver-talk mailing list
aolserver-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aolserver-talk