Matthew Flatt wrote at 05/23/2011 10:11 PM:
At Mon, 23 May 2011 22:01:31 -0400, Neil Van Dyke wrote:
We're not explicitly setting any stack limits anywhere. I believe, but am not certain, that the core dump came from a "mzscheme -jqr" run inside an Apache CGI context that got a native stack ulimit of 8192 kB (the normal limit on that machine). Shall I confirm this?
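
One way to confirm the limit the CGI process actually sees is to query it from inside that same context, since Apache may hand its children different limits than a login shell gets. A minimal sketch, assuming a Unix-like host and written in Python rather than Racket for brevity; format_limit and stack_limits are hypothetical helper names, not part of any tool mentioned in this thread:

```python
import resource


def format_limit(value):
    """Render an rlimit value (in bytes) as kB, the unit `ulimit -s` reports."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kB"


def stack_limits():
    """Return the (soft, hard) native stack limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    soft, hard = stack_limits()
    # Run as a CGI script, this reports what Apache's children inherit.
    print("Content-Type: text/plain")
    print()
    print(f"stack soft limit: {soft}")
    print(f"stack hard limit: {hard}")
```

Dropped into the same CGI directory as the failing program, this should show whether the 8192 kB figure really applies there.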

Maybe, but I've become more interested in the possibility that other OS
threads might have crashed. Does `info threads' work in gdb with a core
file?

I'm not certain "gdb" is accurate here, but I don't think that any C code we use introduces any additional OS threads.

#0 0x00000000005655b6 in GC_clear_stack_inner (arg=0x0, limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:243
243    ./misc.c: No such file or directory.
   in ./misc.c
(gdb) info threads
 2 process 28526  0x00007fff316fcbe1 in nanosleep () from /lib/libc.so.6
* 1 process 28525 0x00000000005655b6 in GC_clear_stack_inner (arg=0x0, limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:243


Could code evaluated at module load time, such as "make-standard-set" (which has some non-tail calls in loops; I don't know the size), be using lots of stack and, once every 100,000 runs of a large program, combine with nondeterministic GC behavior and a bug to cause a seg fault?

It seems unlikely that any module is using lots of C stack relative to
8MB, so I think we must be missing something simpler. Nondeterministic
GC behavior seems like a likely part of the puzzle, though.

(I'm not sure whether we're talking about a Scheme stack that is different from the native stack.) Could we be having an overly large stack quite often, with the crash being rare only because the stack usually does not collide with non-stack memory in a detectable way?

--
http://www.neilvandyke.org/
_________________________________________________
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/users
