Kathi Fisler wrote at 09/06/2011 08:13 PM:
> Are there commands we can use when we start up racket or the server that might give diagnostics to help trace the problem?
Intermittent failures are a headache. In addition to whatever people advise here, you might want to add your own detailed logging to a file, with timestamps. Don't be afraid to dump dozens of lines of log output per HTTP request.
> We are not getting core dumps (even when the process dies).
You might also want to confirm that your kernel is set up to produce core dumps, that the process limits on the actual "racket" process permit core dumps, and that the directory and file where the kernel config says to put the core dump are writable.
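As a rough sketch of those three checks, assuming Linux with /proc mounted: the helper names below (core_soft_limit, core_dump_dir, report) are made up for illustration, and the pid passed to report() is whatever pid your "racket" process actually has. The writability test runs as the user who runs the script, which may not be the user the server runs as, and any % specifiers in the directory part of core_pattern are not expanded.

```python
import os

def core_soft_limit(limits_text):
    """Return the soft 'Max core file size' value from /proc/<pid>/limits text."""
    prefix = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining fields are: soft limit, hard limit, units.
            return line[len(prefix):].split()[0]  # e.g. "0" or "unlimited"
    return None

def core_dump_dir(core_pattern, process_cwd):
    """Return the directory a core file would land in, or None if it is piped."""
    pattern = core_pattern.strip()
    if pattern.startswith("|"):
        return None  # the kernel pipes the dump to a helper program instead
    directory = os.path.dirname(pattern)
    # A relative pattern is resolved against the process's working directory;
    # an absolute directory overrides it in os.path.join.
    return os.path.normpath(os.path.join(process_cwd, directory))

def report(pid):
    """Print the core dump preconditions for one running process."""
    with open(f"/proc/{pid}/limits") as f:
        soft = core_soft_limit(f.read())
    with open("/proc/sys/kernel/core_pattern") as f:
        pattern = f.read()
    cwd = os.readlink(f"/proc/{pid}/cwd")
    target = core_dump_dir(pattern, cwd)
    print("soft core limit:", soft)
    print("core_pattern:", pattern.strip())
    if target is None:
        print("core dumps are piped to a helper program")
    else:
        print("target dir:", target, "writable here:", os.access(target, os.W_OK))
```

A soft limit of 0 means the kernel will not write a core file for that process, whatever core_pattern says. Calling report() on the pid of the running "racket" process, as the same user or as root, prints all three pieces of information in one place.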
All of this is about arranging to capture diagnostic info, should an intermittent failure occur again. In addition to that, you might want to audit the code, looking for potential deadlocks. If you're using the FFI or C extensions, you have to assess how much you trust the code and whether you're calling it in a safe way, since auditing C code will be harder than auditing Racket code.
Oh, and Matthew discovered a conceivable issue with address space randomization, and I don't know whether he's 100% ruled it out as a problem. You might want to disable that feature in Linux, just in case. There are reasons why they added that feature, but I think it is less relevant if the only server code you're running on the machine is in Racket.
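To see how address space randomization is currently set before changing anything, a small sketch like this reads the kernel's sysctl file. It assumes Linux with /proc mounted, and the function names are made up for illustration.

```python
# Meanings of kernel.randomize_va_space, per the Linux sysctl documentation.
ASLR_MODES = {
    "0": "disabled",
    "1": "stack, mmap base, and vDSO randomized",
    "2": "as 1, plus heap (brk) randomized",
}

def describe_aslr(raw):
    """Map the raw contents of randomize_va_space to a description."""
    return ASLR_MODES.get(raw.strip(), "unknown value")

def current_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Read and describe the running kernel's current setting."""
    with open(path) as f:
        return describe_aslr(f.read())
```

If it reports anything other than "disabled", running `sysctl -w kernel.randomize_va_space=0` as root turns the feature off system-wide until the next reboot. Starting only the server under `setarch` with its `-R` (`--addr-no-randomize`) option is a narrower alternative that leaves the rest of the machine alone.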
-- http://www.neilvandyke.org/