Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

Geoffrey Blake Fri, 03 Apr 2009 08:02:21 -0700

I've made a couple of edits, but nothing major, i.e., I added statistics to
the bus model and some extra latency randomization to cache misses to get
better averages across parallel code runs.  As far as I can determine, none
of this is tied to the trace-flags mechanism.

I did run the code through valgrind but, ridiculously enough, the segfault
disappears. I'll keep digging in my spare time.

The "Exec" trace flags work fine (billions of instructions, no problems)
with an old version of m5 that is somewhere between beta4 and beta5 of the
stable releases. Now I can trace maybe a few thousand instructions before M5
segfaults.

Here is a stripped command line that does expose the bug with the least
number of variables to consider in case someone out there wants to try and
duplicate the segfaults I'm seeing (it could be a product of my build setup,
so I'd appreciate it if someone could verify independently):

% m5.opt --trace-flags="ExecEnable" fs.py -b MutexTest -t -n 1 > /dev/null

Geoff

From: [email protected] [mailto:[email protected]] On Behalf
Of Korey Sewell
Sent: Friday, April 03, 2009 9:56 AM
To: M5 Developer List
Subject: Re: [m5-dev] Memory corruption in m5 dev repository when using
--trace-flags="ExecEnable"

I would echo Gabe's sentiments. I've been suspicious of the trace-flags
causing memory corruption for a while now, but every time I dig into it
there's some small error that I'm propagating through that finally surfaces.

In the big picture, I suspect that the trace-flags just exacerbate any kind
of memory-corruption issue, since you are accessing things at such a heavy
rate.

In terms of debugging, is there any code that you edited that is tagged when
you use "ExecEnable" rather than just "Exec"?

Also, if you can turn valgrind on for maybe the first thousand/million cycles
with ExecEnable, you'll probably find something.

On Thu, Apr 2, 2009 at 7:28 PM, Gabriel Michael Black
<[email protected]> wrote:

Does this happen when you start tracing sooner? I'd suggest valgrind,
especially if you can make the segfault happen quickly. If you wait
for your simulation to get to 1400000000000 ticks in valgrind, you may
die before you see the result. There's a suppression file in util
which should cut down on the noise.

Gabe
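
A minimal sketch of the valgrind run suggested above, assuming the debug
build mentioned elsewhere in this thread and the stripped-down MutexTest
command line. SUPP_FILE is a placeholder, since the thread does not name the
suppression file in util, and both paths are assumptions to adjust for your
checkout. Leaving out --trace-start means tracing begins at tick 0, so the
fault should surface early.

```shell
# Assumed paths: adjust to your own checkout.
M5_BIN=./build/ALPHA_FS/m5.debug   # debug build (path is an assumption)
SUPP_FILE=util/SUPPRESSION_FILE    # placeholder for the suppression file in util

# No --trace-start: trace from the beginning so valgrind reaches the bug quickly.
CMD="valgrind --suppressions=$SUPP_FILE $M5_BIN --trace-flags=ExecEnable fs.py -b MutexTest -t -n 1"

# Print the command for inspection; to run it: eval "$CMD" > /dev/null
echo "$CMD"
```

The script only prints the command, so it can be checked before committing to
a slow valgrind run.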

Quoting Geoffrey Blake <[email protected]>:

> I stumbled upon what appears to be a memory corruption bug in the current
> M5 repository.  If on the command line I enter:
>
> % ./build/ALPHA_FS/m5.opt --trace-flags="ExecEnable"
> --trace-start=1400000000000 fs.py -b <benchmark> -t -n <cpus> <more
> parameters>
>
> the simulator will error with a segmentation fault or
> occasionally an assert not long after starting to trace instructions.
>
>
>
> I have run this through gdb with m5.debug and see the same errors. The
> problem is that the stack trace showing the cause of the segfault or assert
> changes depending on the inputs to the simulator. So, I have not been able
> to pinpoint this bug, which appears to be a subtle memory corruption
> somewhere in the code. This error does not happen for other trace flags
> such as the "Cache" trace flag. It appears linked solely to the instruction
> tracing mechanism.  Has anybody else seen this bug?
>
>
>
> I'm using an up to date repository I pulled from m5sim.org this morning.
>
>
>
> Thanks,
> Geoff
>
>

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

-- 
----------
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan


