Hi,

I want to simulate a multi-threaded program on a multicore system that supports SMT. I used m5threads to get some PARSEC benchmarks running in SE mode. They run fine without SMT (i.e., one thread per core).

To enable a multi-threaded program to run on multiple cores, the workload of each core is set to the same process:
se.py: system.cpu[i].workload = multiprocesses[0]
This creates one workload SimObject (as seen in config.ini) as a child of cpu0, and all cores' workloads point to that object.

To enable multiple threads on one core, I set the numThreads parameter, and assign multiple workloads per core (which works fine in multiprogrammed mode):
se.py: system.cpu[i].workload = [multiprocesses[0]] * int(numThreads)
However, for two-way SMT (numThreads=2), two workload SimObjects are created, both children of cpu0, and both with the same name. They both refer to the same process.
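
For reference, the Python semantics behind that line: repeating a list copies references, not objects, so both entries of the workload list are the very same process object. A minimal plain-Python sketch (FakeProcess is a hypothetical stand-in I made up for illustration, not gem5's LiveProcess):

```python
class FakeProcess:
    """Hypothetical stand-in for a process object (not gem5's LiveProcess)."""

    def __init__(self, cmd):
        self.cmd = cmd


num_threads = 2
process = FakeProcess("blackscholes 4 in_16.txt prices.txt")

# List repetition duplicates the reference, so every slot aliases `process`.
workload = [process] * num_threads

# True only if every entry is the identical object, not a copy.
same_object = all(w is process for w in workload)
print(len(workload), same_object)
```

So on the Python side there is only one process; the duplication must be introduced later, when the configuration tree is built.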

config.ini:
...
[system.cpu0]
type=DerivO3CPU
children=dcache dtb fuPool icache itb tracer workload1 workload1
BTBEntries=4096
...
wbWidth=8
workload=system.cpu0.workload1 system.cpu0.workload1
dcache_port=system.cpu0.dcache.cpu_side
...
[system.cpu0.workload1]
type=LiveProcess
cmd=blackscholes 4 in_16.txt prices.txt
cwd=
...
[system.cpu0.workload1]
type=LiveProcess
cmd=blackscholes 4 in_16.txt prices.txt
cwd=
...

When this configuration is simulated, it fails with an error because regStats() is executed twice on the same process, which creates duplicate statistic names.

If I instead give each core only one process as its workload (as in the non-SMT case), the second thread context of each core is not available to the multi-threaded program, so that doesn't solve the problem.

It seems that everything is set up correctly, except that the workload object is now duplicated in the configuration tree. For the simulation itself, both thread contexts point to the same process, which is correct. But when the statistics are initialized, that process is visited twice, which causes the error.
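
To illustrate what I think is happening, here is a toy model in plain Python. It is not gem5's actual regStats() code; ToyNode, register_stats and register_stats_once are hypothetical names of my own. The first walk registers a name for every child entry and fails on the repeated one; the second skips objects it has already visited, which is the kind of de-duplication I am looking for.

```python
class ToyNode:
    """Hypothetical tree node; stands in for an object in the config tree."""

    def __init__(self, name):
        self.name = name
        self.children = []


def register_stats(root):
    """Naive walk: raises if the same full name is registered twice."""
    seen_names = set()

    def visit(node, path):
        full = path + "." + node.name if path else node.name
        if full in seen_names:
            raise ValueError("duplicate stat name: " + full)
        seen_names.add(full)
        for child in node.children:
            visit(child, full)

    visit(root, "")
    return seen_names


def register_stats_once(root):
    """Same walk, but each object is visited at most once (by identity)."""
    seen_ids = set()
    names = []

    def visit(node, path):
        if id(node) in seen_ids:
            return  # already registered through another reference
        seen_ids.add(id(node))
        full = path + "." + node.name if path else node.name
        names.append(full)
        for child in node.children:
            visit(child, full)

    visit(root, "")
    return names


# One process object listed twice as a child, as in the config.ini above.
cpu0 = ToyNode("cpu0")
workload = ToyNode("workload1")
cpu0.children = [workload, workload]

try:
    register_stats(cpu0)
    naive_error = None
except ValueError as exc:
    naive_error = str(exc)

names = register_stats_once(cpu0)
```

In this toy model the naive walk fails on the second workload1 entry, while the identity-based walk registers it only once. Whether gem5 has (or could use) an equivalent check is exactly what I don't know.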

How can I eliminate this duplication?
And how is this handled in non-SMT mode? There, the same process is assigned to multiple cores, yet only one workload object is created. Is there a check somewhere that detects identical workloads and creates only one object? If so, maybe that check is skipped when multiple processes are assigned to one CPU? I've tried to dig into the code that builds the configuration tree, but it is hard to follow because of the mixture of Python and C++.

Does anyone have an idea?

Thanks,
Stijn