On Tue, Jun 29, 2010 at 03:23:38PM -0700, Sep Ng wrote:
> Basically we had aolservers running and while serving pages, it's also
> doing some heavy load processing from a ton of scheduled custom
> written procedures.

Scheduled using AOLserver's built-in scheduler, ns_schedule_proc,
ns_schedule_daily, or the like?

> Aolserver crashes and segmentation faults are fairly frequent and
> the logs at the time pointed to these running threads as a probable
> cause.

Then the first place to look is in your custom code; it's the most
likely place for the bug. Is your scheduled code purely Tcl or does it
use any C code? If you turn off your scheduled procs, does the
crashing go away?

This is a debugging problem: you need to find the bug before you decide
how to fix it. After the crash, look at the core file's stack trace in
a debugger and see if that gives you any clues.

Can you reproduce the problem by hitting your development AOLserver
with a particular load-testing script? If the problem is non-obvious,
you'll probably need that to track it down.

Your focus on AOLserver's thread creation and scheduling mechanisms
seems misplaced. You're speculating about ways to fix some imagined
problem, but you don't know yet whether your actual problem has any
similarity at all to your speculations.

> So basically, what I'm currently beating my head over is to
> build a much cleaner and better way of handling all the load

It's not clear that building any such thing will help you. If the
crash-inducing bugs are in your custom scheduled code, it's fairly
likely that they're still going to crash no matter what thread you run
them in or how you go about scheduling those threads.

If after lots of looking you REALLY can't find the crash-causing
bug(s), THEN I'd start thinking about ways to live with and ameliorate
the problem. The simplest one of course, which you've probably already
done, is to just let your AOLserver crash and make sure that it's
always able to come back up quickly and pick up as close to where it
left off as possible.

Better is to isolate your custom scheduled code in an entirely
separate process, with communication between your AOLserver and that
helper process.
AOLserver 4.5 definitely includes a mechanism for doing that, but I
forget what it's called. That way, your code may well still crash, but
it will only take down the helper process rather than your entire
AOLserver.

--
Andrew Piskorski <[email protected]>
http://www.piskorski.com/
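
The helper-process idea above can be sketched roughly as follows. This
is Python rather than Tcl, purely for illustration, and it does not use
any AOLserver API. The job strings and the run_isolated name are
hypothetical stand-ins for the custom scheduled procedures. The
AOLserver 4.5 mechanism mentioned above may be the nsproxy module, but
treat that as an assumption to check against the documentation.

```python
import subprocess
import sys

# Hypothetical stand-ins for the custom scheduled procedures: one job that
# finishes normally and one that dies abruptly, as a segfaulting job would.
GOOD_JOB = "print('job done')"
CRASHING_JOB = "import os; os.abort()"  # terminates the child abnormally


def run_isolated(job_source, timeout=30):
    """Run job_source in a separate Python process and report how it ended.

    Returns (ok, detail). A crash in the child only ends the child; the
    caller (standing in for the main server) keeps running.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", job_source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if result.returncode == 0:
        return True, result.stdout.strip()
    if result.returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        return False, "killed by signal %d" % -result.returncode
    return False, "exited with status %d" % result.returncode


if __name__ == "__main__":
    for name, job in [("good", GOOD_JOB), ("crashing", CRASHING_JOB)]:
        ok, detail = run_isolated(job)
        print(name, "ok" if ok else "FAILED", detail)
    print("supervisor still running")
```

The supervisor only ever sees an exit status, so a crashing job becomes
a logged failure instead of taking down the whole server. The bug in
the job is still there, though, and still needs to be found.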
