Hi folks. I've been tracking down a heap corruption issue in the systemc
code for the last couple of days, and I think I found the problem. One of
the systemc tests allocates a lot of data on the stack in sc_main. That
stack overflows onto its neighbors on the heap (Fiber stacks are heap
allocated), which corrupts other data structures with unpredictable
results.

The stacks for systemc processes (methods and threads) are, if I remember
correctly, larger in gem5 than the size specified in Accellera's
implementation (unless of course pthreads ignores that setting and makes
them bigger), so I thought we would be safe from these sorts of problems,
or at least no less safe than otherwise. Unfortunately, I didn't realize
that sc_main runs in the main context in that implementation, whereas it
runs in a fiber in gem5's.

I didn't have a chance to implement a fix today, but I want to take a
two-pronged approach. First, I'm going to try to implement a guard page for
Fiber stacks so that this sort of thing is a lot easier to debug. Ideally
gem5 would recognize segfaults in those guard pages and, in addition to
printing a nice backtrace like it does now, flag that the fault was a stack
overflow.
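
To make the guard page idea concrete, here is a rough sketch. It is not
gem5's actual Fiber code: GuardedStack, allocGuardedStack, inGuardPage and
freeGuardedStack are names made up for illustration, and it assumes a POSIX
host where stacks grow downward, so the guard sits at the low end of the
mapping.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type, not part of gem5.
struct GuardedStack
{
    void *base = nullptr;   // Start of the whole mapping (the guard page).
    void *stack = nullptr;  // Lowest usable stack address.
    size_t stackSize = 0;   // Usable bytes, rounded up to whole pages.
    size_t guardSize = 0;   // Bytes of inaccessible guard below the stack.
};

// Map guard + stack in one region and make the guard page inaccessible.
// On failure the returned struct has base == nullptr.
inline GuardedStack
allocGuardedStack(size_t requested)
{
    assert(requested > 0);
    GuardedStack gs;
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    gs.guardSize = page;
    gs.stackSize = (requested + page - 1) / page * page;

    void *mem = mmap(nullptr, gs.guardSize + gs.stackSize,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return gs;

    // Any access to the lowest page now raises SIGSEGV instead of
    // silently overwriting whatever lives next to the stack.
    if (mprotect(mem, gs.guardSize, PROT_NONE) != 0) {
        munmap(mem, gs.guardSize + gs.stackSize);
        return gs;
    }

    gs.base = mem;
    gs.stack = static_cast<char *>(mem) + gs.guardSize;
    return gs;
}

// True if addr falls inside the guard page of gs.
inline bool
inGuardPage(const GuardedStack &gs, const void *addr)
{
    const auto a = reinterpret_cast<uintptr_t>(addr);
    const auto b = reinterpret_cast<uintptr_t>(gs.base);
    return gs.base != nullptr && a >= b && a < b + gs.guardSize;
}

// Unmap the stack and its guard page and reset the struct.
inline void
freeGuardedStack(GuardedStack &gs)
{
    if (gs.base)
        munmap(gs.base, gs.guardSize + gs.stackSize);
    gs = GuardedStack();
}
```

A SIGSEGV handler installed with SA_SIGINFO could pass the si_addr field of
its siginfo_t to something like inGuardPage to decide whether to report the
fault as a stack overflow.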

Second, I want to increase the size of the stack for sc_main to something
bigger like 8MB; currently I think it's 32KB. The stack should be lazily
allocated by the host OS, so I don't think this should cause a problem in
cases where the memory isn't being used. It will just avoid exploding in
cases where it is.
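
As an illustration of why the bigger size should be cheap, here is a small
sketch of reserving a stack with an anonymous mmap. The names
(OldScMainStackSize, NewScMainStackSize, reserveStack, stackTop) are
hypothetical and this is not how gem5 currently allocates Fiber stacks. It
assumes a POSIX host, where an anonymous private mapping like this is
typically only backed by physical pages once they are touched.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>

// Sizes taken from the note above; the constant names are made up.
constexpr size_t OldScMainStackSize = 32 * 1024;
constexpr size_t NewScMainStackSize = 8 * 1024 * 1024;

// Reserve address space for a stack of the given size.
// Returns nullptr if the mapping fails.
inline char *
reserveStack(size_t size)
{
    assert(size > 0);
    void *mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return mem == MAP_FAILED ? nullptr : static_cast<char *>(mem);
}

// Initial stack pointer for a downward-growing stack: one past the
// highest usable byte of the mapping.
inline char *
stackTop(char *base, size_t size)
{
    return base + size;
}
```

A fiber that only ever uses the top few pages of such a mapping would leave
the rest untouched, which is the case where the extra size should cost
little beyond address space.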

I plan to implement both of these things in the near future and don't
anticipate running into any tricky issues. I just wanted to put this note
out there in case someone was independently hitting this issue under
different circumstances and wasn't sure what was going on.

Gabe
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
