Hi,
I want to simulate a multi-threaded program on a multicore system that
supports SMT. I used m5threads to get some PARSEC benchmarks running in
SE mode. They run fine without SMT (i.e., one thread per core).
To enable a multi-threaded program to run on multiple cores, the
workload of each core is set to the same process:
se.py: system.cpu[i].workload = multiprocesses[0]
This creates one workload SimObject (as seen in config.ini) as a child
of cpu0, and all cores' workloads point to that object.
To enable multiple threads on one core, I set the numThreads parameter,
and assign multiple workloads per core (which works fine in
multiprogrammed mode):
se.py: system.cpu[i].workload = [multiprocesses[0]] * int(numThreads)
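To make clear what that list expression does on the Python side, here
is a minimal plain-Python sketch. It uses no gem5 imports, and the
Process class is a hypothetical stand-in, not the real LiveProcess. The
multiplication yields a list of references to one and the same object,
not separate copies:

```python
class Process:
    """Hypothetical stand-in for a gem5 process object (not LiveProcess)."""

    def __init__(self, cmd):
        self.cmd = cmd


num_threads = 2
proc = Process("blackscholes 4 in_16.txt prices.txt")

# List repetition copies the reference, not the object itself.
workload = [proc] * num_threads

# Every entry is the very same object ...
same_object = all(w is proc for w in workload)
# ... so there is only one distinct identity in the list.
distinct_objects = len({id(w) for w in workload})

print(same_object, distinct_objects)
```

So from the script's point of view there is only one process, listed
twice.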
However, for two-way SMT (numThreads=2), two workload SimObjects are
created, both children of cpu0, and both with the same name. They both
refer to the same process.
config.ini:
...
[system.cpu0]
type=DerivO3CPU
children=dcache dtb fuPool icache itb tracer workload1 workload1
BTBEntries=4096
...
wbWidth=8
workload=system.cpu0.workload1 system.cpu0.workload1
dcache_port=system.cpu0.dcache.cpu_side
...
[system.cpu0.workload1]
type=LiveProcess
cmd=blackscholes 4 in_16.txt prices.txt
cwd=
...
[system.cpu0.workload1]
type=LiveProcess
cmd=blackscholes 4 in_16.txt prices.txt
cwd=
...
When this configuration is simulated, it gives an error, because
regStats() is executed twice on the same process, creating duplicate
statistics names.
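As a toy model of what I think is going on (this is not gem5's actual
code; StatRegistry, Node, reg_stats and the statistic name "some_stat"
are all made up for illustration), registering statistics for the same
object twice under the same path fails on the second visit:

```python
class StatRegistry:
    """Toy registry that rejects duplicate statistic names (hypothetical)."""

    def __init__(self):
        self.names = set()

    def register(self, name):
        if name in self.names:
            raise ValueError("duplicate statistic name: " + name)
        self.names.add(name)


class Node:
    """Toy stand-in for a SimObject with a path and a regStats-like hook."""

    def __init__(self, path):
        self.path = path

    def reg_stats(self, registry):
        # "some_stat" is an invented statistic name for this sketch.
        registry.register(self.path + ".some_stat")


def reg_all(objects, registry):
    # Visits every list entry, even if two entries are the same object.
    for obj in objects:
        obj.reg_stats(registry)


workload = Node("system.cpu0.workload1")
error = None
try:
    # The same object appears twice, as in the two-way SMT workload list.
    reg_all([workload, workload], StatRegistry())
except ValueError as exc:
    error = str(exc)

print(error)
```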
When I give only one process as a workload (as in the non-SMT case),
then the second context of each core is not available for the
multi-threaded program, so that doesn't solve the problem.
It seems that everything is set up correctly, except that the workload
object is now duplicated in the configuration tree. For the simulation
itself, both thread contexts point to the same process, which is
correct. But when the statistics are initialized, that process is
visited twice, which causes the error.
How can I eliminate this duplication?
And how is this handled in non-SMT mode? In that case, the same process
is assigned to multiple cores, but still only one workload object is
created. Is there a check somewhere that detects identical workloads and
creates only one object? If so, maybe that check is skipped when
multiple workloads are assigned to one CPU? I've tried to dig into the
code that creates the configuration tree, but it is quite hard to follow
because of the mixture of Python and C++.
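The kind of check I have in mind would be something like the following
plain-Python sketch. It is only an illustration of the idea, not a patch
against gem5; unique_by_identity and the Process class are hypothetical
names. Repeated references to the same object are dropped before the
list is walked, so each object is visited once:

```python
def unique_by_identity(objects):
    """Return objects in first-seen order, dropping repeated references."""
    seen = set()
    result = []
    for obj in objects:
        # Compare by identity, not equality: two distinct processes with
        # the same command line must both be kept.
        if id(obj) not in seen:
            seen.add(id(obj))
            result.append(obj)
    return result


class Process:
    """Hypothetical stand-in for a workload object."""


shared = Process()
other = Process()

# 'shared' is listed three times but should be visited only once.
children = unique_by_identity([shared, shared, other, shared])

print(len(children))
```

I don't know whether the real tree-building code does anything like
this, or where it would belong.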
Does anyone have an idea?
Thanks,
Stijn