On Sunday 24 August 2003 03:36, Brian Candler wrote:
> On Thu, Aug 21, 2003 at 10:21:51AM -0400, Jesse Guardiani wrote:
> > I finally got a chance to run ktrace and kdump (freebsd things)
> > on my "runaway" sqwebmail processes today. (They don't show
> > up terribly often, but when they DO it brings my system to a
> > crawl.)
> >
> > Basically, from what I could tell, one of these sqwebmail processes
> > would cause 3 or 4 others to hang. Either that, or it's spawning
> > them as subprocesses. I don't know how to tell if a process is a parent
> > or child under FreeBSD...
>
> ktrace -id will enable tracing of child processes too; the pid is the first
> column of the kdump output.
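To illustrate the point about the pid column, here is a rough Python sketch
that groups kdump lines per process. The function name group_by_pid is made
up for this example, and it assumes the layout seen in the trace quoted in
this thread (pid, command name, record type, then details) with no spaces in
the command name. Treat it as a starting point, not a complete kdump parser.

```python
from collections import defaultdict


def group_by_pid(kdump_lines):
    """Group kdump output lines by the pid found in their first column."""
    by_pid = defaultdict(list)
    for line in kdump_lines:
        fields = line.split(None, 3)
        # Skip continuation lines (such as quoted GIO data) that do not
        # start with a numeric pid.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, command, record_type = fields[0], fields[1], fields[2]
        detail = fields[3] if len(fields) > 3 else ""
        by_pid[int(pid)].append((command, record_type, detail))
    return dict(by_pid)


# Sample lines copied from the trace quoted in this thread.
sample = [
    " 36068 sqwebmail CALL  break(0xbf50000)",
    " 36068 sqwebmail RET   break 0",
    " 36068 sqwebmail CALL  read(0x4,0x80d8000,0x4000)",
    '       "sqwebmail"',
]

trace = group_by_pid(sample)
for pid, records in trace.items():
    print(pid, len(records), "records")
```

With a trace taken using ktrace -id, each child process would show up under
its own pid key, which makes it easier to see which process was doing what.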
Yes. Next time I'll be sure to note the parent/child relationships and
trace all related threads.

>
> ps -ajxw will show processes with their parent pids. Those with a ppid of 1
> have detached from their controlling session or their parent has exited.

OK. Good to know.

>
> > ------ BEGIN ktrace.out from process 36068 ------
> > 36068 sqwebmail CALL break(0xbf50000)
> > 36068 sqwebmail RET break 0
> > 36068 sqwebmail CALL break(0xfdb5000)
> > 36068 sqwebmail RET break 0
>
> That's the low-level call for "allocate storage"
>
> > 36068 sqwebmail CALL read(0x4,0x80d8000,0x4000)
> > 36068 sqwebmail GIO fd 4 read 16384 bytes
>
> Hmm, I don't know what is on fd 4. Sam? Any idea?
>
> > 36068 sqwebmail PSIG SIGTERM caught handler=0x807138c mask=0x0 code=0x0
>
> That's you killing the process
>
> > 36068 sqwebmail CALL write(0x2,0xbfbff7c8,0x9)
> > 36068 sqwebmail GIO fd 2 wrote 9 bytes
> > "sqwebmail"
> > 36068 sqwebmail RET write 9
> > 36068 sqwebmail CALL write(0x2,0x281b8e8d,0xb)
> > 36068 sqwebmail GIO fd 2 wrote 11 bytes
> > " in free():"
>
> fd 2 is stderr. Those messages should appear in your Apache error_log,
> incidentally.

Yup, they sure do:

------ BEGIN APACHE ERROR LOG ------
sqwebmail in free(): warning: recursive call
sqwebmail in free(): warning: recursive call
sqwebmail in free(): warning: recursive call
sqwebmail in free(): warning: recursive call
------ END APACHE ERROR LOG ------

>
> > Any ideas? Anyone know how I can troubleshoot/debug
> > this more accurately?
>
> You could try attaching gdb to the running process:
>
> gdb /path/to/sqwebmail pid
>
> then do 'bt' to get a backtrace of the current stack frame. Then you can
> single-step it. This will let you see where this infinite loop is
> happening.

OK. Sounds good. I'll do that next time it happens. I don't have to
compile with any special flags for gdb to work, do I?

>
> You're not running sqwebmail under FastCGI are you? If so, I should revert
> to normal CGI.

No, I'm not.
I'd like to figure out what is causing this problem before I switch
to FastCGI. I only get bad processes like this every week or so.

--
Jesse Guardiani, Systems Administrator
WingNET Internet Services,
P.O. Box 2605 // Cleveland, TN 37320-2605
423-559-LINK (v) 423-559-5145 (f)
http://www.wingnet.net