I've been trying to find a bug for a number of months with no success. I figured someone on this list might have encountered something similar and would be able to help.
First of all, the project I'm working on is WhichBot, a bot for Natural Selection. All the code is available at http://whichbot.com via SourceForge, if you're interested.

The problem is that the server hangs (CPU usage goes to 100%, console unresponsive) intermittently after about 1 to 10 hours of humans vs. bots gameplay. This only seems to happen on Win32, not on Linux, although admittedly the testing focus has been on Win32 systems. It doesn't appear to matter which version of the HL engine you're running.

If you attach a debugger to the hung process, it is in swds.dll and never seems to exit or re-enter the bot code. The call stack does not include the bot code, so I don't even know which piece of bot code was executed last. I'm assuming that the HL engine code is looping infinitely, since the CPU usage is at 100%.

I did find one way of getting this to happen: if you have a divide-by-zero bug and pass a NaN to RunPlayerMove, you will see behaviour like this. I wrapped every HL call that takes an angle to ensure the angle is in range (-180 <= angle <= 180) and finite. After that, I thought I'd finally fixed it, but no, the problem is still there, so it must be something else.

So, I guess the starting questions would be:

1) Does anyone else know ways of getting this kind of behaviour out of the HL engine?
2) I'm guessing there's no version of the engine DLLs available with debugging symbols (which might at least give me a clue to the problem area)?
3) Failing both of those, does anyone have a good idea on how to approach the problem?

The trouble is that the problem is intermittent, so it's hard to even verify whether it is still there. I'm pretty sure it has been happening for a long time now, so rolling back the code to a "known good" version would mean rolling back to a version where I know for a fact there are a bunch of bugs that cause other stability issues.
Even then, if I rolled it back and fixed all the bugs I could find, I couldn't be sure the hang bug was gone without running a server with an ancient version for a week or two.

I'd be most grateful for any help that people could offer, because this bug has me at the end of my tether.

Mike Cooper
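
For reference, the kind of angle guard described above (keep the angle finite and inside -180 to 180 before it reaches the engine) might look roughly like the sketch below. This is not the actual WhichBot code: the name sanitizeAngle and the choice to fall back to 0 for non-finite input are assumptions made purely for illustration.

```cpp
#include <cmath>

// Hypothetical helper, not taken from the WhichBot source.
// Returns an angle in degrees that is finite and lies in [-180, 180].
float sanitizeAngle(float angle)
{
    // NaN or +/-infinity (for example the result of a divide-by-zero)
    // is replaced with 0. The fallback value is an arbitrary choice
    // for this sketch.
    if (!std::isfinite(angle)) {
        return 0.0f;
    }

    // std::remainder computes the IEEE remainder, so with a divisor of
    // 360 the result always lies in [-180, 180].
    return std::remainder(angle, 360.0f);
}
```

A wrapper around each engine call that takes an angle would pass every pitch, yaw and roll value through a function like this first, so a bad value is caught in the bot code rather than inside the engine.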

