I've added a couple of edits, but nothing major, i.e., I added statistics to the bus model and some extra latency randomization to cache misses to get better averages of parallel code runs. As far as I can determine, none of this is tied to the trace-flags mechanism.
I did run the code through valgrind, but ridiculously enough, the segfault disappears. I'll keep digging in my spare time. The "Exec" trace flags work fine (billions of instructions, no problems) with an old version of M5 that is somewhere between beta4 and beta5 of the stable releases. Now I can trace maybe a few thousand instructions before M5 segfaults.

Here is a stripped-down command line that exposes the bug with the fewest variables to consider, in case someone out there wants to try to duplicate the segfaults I'm seeing (it could be a product of my build setup, so I'd appreciate it if someone could verify independently):

% m5.opt --trace-flags="ExecEnable" fs.py -b MutexTest -t -n 1 > /dev/null

Geoff

From: [email protected] [mailto:[email protected]] On Behalf Of Korey Sewell
Sent: Friday, April 03, 2009 9:56 AM
To: M5 Developer List
Subject: Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

I would echo Gabe's sentiments. I've been suspicious of the trace flags causing memory corruption for a while now, but every time I dig into it there's some small error that I'm propagating through that finally surfaces. In the big picture, I suspect that the trace flags just exacerbate any kind of memory-corruption issue, since you are accessing things at such a heavy rate.

In terms of debugging, is there any code that you edited that is tagged when you use "ExecEnable" rather than just "Exec"? Also, if you can turn valgrind on for maybe the first thousand/million cycles with ExecEnable, you'll probably find something.

On Thu, Apr 2, 2009 at 7:28 PM, Gabriel Michael Black <[email protected]> wrote:

Does this happen when you start tracing sooner? I'd suggest valgrind, especially if you can make the segfault happen quickly. If you wait for your simulation to get to 1400000000000 ticks in valgrind, you may die before you see the result. There's a suppression file in util which should cut down on the noise.
Gabe

Quoting Geoffrey Blake <[email protected]>:

> I stumbled upon what appears to be a memory corruption bug in the current M5
> repository. If on the command line I enter:
>
> % ./build/ALPHA_FS/m5.opt --trace-flags="ExecEnable"
> --trace-start=1400000000000 fs.py -b <benchmark> -t -n <cpus> <more
> parameters>
>
> the simulator will error with a segmentation fault or
> occasionally an assert not long after starting to trace instructions.
>
> I have run this through gdb with m5.debug and see the same errors. The
> problem is that the stack trace showing the cause of the segfault or assert
> changes depending on the inputs to the simulator. So, I have not been able
> to pinpoint this bug, which appears to be a subtle memory corruption
> somewhere in the code. This error does not happen for other trace flags such
> as the "Cache" trace flag. It appears linked solely to the instruction
> tracing mechanism. Has anybody else seen this bug?
>
> I'm using an up-to-date repository I pulled from m5sim.org this morning.
>
> Thanks,
> Geoff

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

--
----------
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan
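For anyone trying to reproduce this independently, the valgrind suggestion in the thread could be sketched roughly as below. This is only a sketch under assumptions: the thread says the suppression file is "in util" but does not name it, so the `util/valgrind-suppressions` path is a guess, and the `m5.debug` binary path is borrowed from the ALPHA_FS build path in the original report. The `--track-origins=yes` option needs valgrind 3.4 or later. The function only prints the command so it can be checked before starting a long run.

```shell
#!/bin/sh
# Sketch: build a valgrind (memcheck) command line for the stripped-down
# reproduction case from this thread.
# ASSUMPTIONS (not confirmed in the thread): the binary path and the
# suppression file name below are guesses; override them via the environment.
M5_BIN="${M5_BIN:-./build/ALPHA_FS/m5.debug}"
SUPP="${SUPP:-util/valgrind-suppressions}"

valgrind_cmd() {
    # Print the command instead of running it, so it can be inspected first.
    echo valgrind --tool=memcheck \
        --suppressions="$SUPP" \
        --track-origins=yes \
        "$M5_BIN" --trace-flags=ExecEnable fs.py -b MutexTest -t -n 1
}

valgrind_cmd
```

To actually launch the run, something like `eval "$(valgrind_cmd)" > /dev/null` should work from the top of the source tree, keeping in mind that the simulator will be far slower under valgrind and, as noted above, the segfault may not appear there at all.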
