I do not see any assertion errors, and I am encountering this problem
even with a vanilla codebase pulled from the stable repository and
using the 2.0b3 files straight off the website, but it is perhaps
related to the "switch cpus problem" message.
When switching from the atomic to timing processors, the simulator
instantly becomes stuck. It appears to occur with any number of CPUs
(I've tried one, two, and four), and the stats file (after killing it)
shows the same symptomatic behavior: one or more of the switch cpus
have executed no instructions and no cycles have elapsed, but all
other processors have continued to make progress. However, it is not
consistent which processors get stuck, though it is only one or two
for a four CPU setup.
I created traces using the SimpleCPU flag, and what I see is that the
all the CPUs show that they are started with "Resume", but the CPUs
which get stuck never have an ActivateContext message. An example
section from a trace file is shown below where only switch_cpus3 is
stuck. No other messages pertaining to switch_cpus3 ever appear in the
trace, and the stats file shows no instructions or cycles for that
processor.
4084372420500: system.switch_cpus2: Resume
4084372420500: system.switch_cpus3: Resume
4084372420500: system.switch_cpus0: Resume
4084372420500: system.switch_cpus1: Resume
4084960937500: system.switch_cpus0: ActivateContext 0 (1 cycles)
4084960937500: system.switch_cpus1: ActivateContext 0 (1 cycles)
4084960937500: system.switch_cpus2: ActivateContext 0 (1 cycles)
4084960938000: system.switch_cpus2: Fetch
4084960938000: system.switch_cpus1: Fetch
4084960938000: system.switch_cpus0: Fetch
4084960939000: system.switch_cpus0: Complete ICache Fetch
4084960939000: system.switch_cpus0: Fetch
4084960939000: system.switch_cpus1: Complete ICache Fetch
4084960939000: system.switch_cpus1: Fetch
4084960939000: system.switch_cpus2: Complete ICache Fetch
4084960939000: system.switch_cpus2: Fetch
4084960940000: system.switch_cpus2: Complete ICache Fetch
I've not worked much with the CPU side of the M5 codebase, so I've not
attempted to find what is wrong. All I know is that it did not occur
with the original 2.0b6 version of the stable codebase, nor with the
2.0b5-era versions. The only significant change I know of that dropped
into the stable repository since then is the new event queue handling.
Suggestions for things to look at would be appreciated.
Here is an example of how I am running the example FS script (I've
also tried m5.fast and m5.debug, they both give the same, non-
deterministic results):
~/m5-vanilla/build/ALPHA_FS/m5.opt fs.py -n 4 -F 10000000000 --caches
-t
I use "m5 switchcpu" on the terminal after startup is finished to
switch the CPUs over, though it also occurs automatically if one
lowers the fast-forward instruction count to a much smaller value. If
I specify to switch to O3 cpus, I do not have any problems.
Thanks,
- Clint Smullen
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users