I know the options '-F' and '-W'. Actually, I use them together with the
'-I' option to specify the number of detailed instructions (denoted N3
in my previous mail). It seems that the default implementation in
configs/common/Simulation.py passes N3 to
cpu[i].max_insts_any_thread. Thus, as soon as any program finishes N3
instructions, the whole simulation exits. To avoid this, I modified the
default implementation to pass N3 to
cpu[i].max_insts_all_threads, which forces each program to commit at
least N3 instructions. The total number of instructions simulated is
then N3 * Nr_cores. But this approach has a pitfall compared with the
methodology I referred to. For a multi-programmed workload, once a
program finishes N3 instructions, the corresponding core has no task
left to schedule (I assume the number of programs is no more than the
number of simulated cores). Thus, it may not be reasonable to evaluate
the impact of shared resource contention from the final statistics
report.
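
To make the pitfall concrete, here is a minimal toy model in plain Python. It is not gem5 code: the fixed per-program IPC values and the function names are assumptions made only for illustration. It estimates how long each core would sit idle if its program stopped at N3 while the simulation ran on until the slowest program reached N3.

```python
# Toy model, not gem5 code: each core retires instructions at a fixed,
# hypothetical IPC, and a core goes idle once its program commits n3.

def finish_cycles(ipcs, n3):
    """Cycle at which each core has committed n3 instructions."""
    return [n3 / ipc for ipc in ipcs]

def idle_fractions(ipcs, n3):
    """Fraction of the run each core is idle when the simulation only
    ends after the slowest core has committed n3 instructions."""
    done = finish_cycles(ipcs, n3)
    end = max(done)
    return [1.0 - d / end for d in done]

ipcs = [2.0, 1.0, 0.5]   # hypothetical per-program IPCs
n3 = 1000                # hypothetical detailed instruction count
fractions = idle_fractions(ipcs, n3)
print(fractions)
```

In this sketch the fastest core is idle for most of the measured interval, so the slower programs see far less contention on shared resources than they would in a fully loaded system.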
Based on this, I have an idea for reporting more reasonable statistics.
Can we run detailed simulation for N3 * 2 instructions per program (so
the total number of instructions simulated is (N3 * 2) * Nr_cores) but
dump the stats only for the first N3 instructions of each program? I am
not clear on the internals of the stats dump, though.
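
The idea of recording each program's statistics at its own N3 boundary while every core keeps running can be sketched with the same kind of toy model. This is plain Python, not gem5: the lockstep quantum, the fixed IPCs, and the function name are all assumptions, and a real implementation would need a per-CPU stats snapshot at the crossing point.

```python
# Toy model, not gem5 code: cores advance in lockstep quanta with fixed,
# hypothetical IPCs. A core keeps running after it crosses n3, so the
# contention it generates is still present for the slower programs.

def run_with_snapshots(ipcs, n3, quantum=100):
    """Run until every core has committed at least n3 instructions.
    Returns (cycle at which each core first crossed n3,
             instructions each core had committed when the run ended)."""
    committed = [0.0] * len(ipcs)
    snapshot_cycle = [None] * len(ipcs)
    cycle = 0
    while any(s is None for s in snapshot_cycle):
        cycle += quantum
        for i, ipc in enumerate(ipcs):
            committed[i] += ipc * quantum
            if snapshot_cycle[i] is None and committed[i] >= n3:
                # A per-program stats snapshot would be taken here.
                snapshot_cycle[i] = cycle
    return snapshot_cycle, committed

snapshots, totals = run_with_snapshots([2.0, 1.0, 0.5], n3=1000)
print(snapshots, totals)
```

Under this fixed-IPC assumption, a fixed budget of N3 * 2 per program keeps every core busy only if no program runs more than twice as fast as the slowest one. In the example above the fastest program commits four times N3 before the slowest reaches N3.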
Hanfeng
On 12/13/2012 11:45 PM, Nilay Vaish wrote:
On Wed, 12 Dec 2012, hanfeng QIN wrote:
Hi all,
I have learned of a common multi-programmed simulation methodology
adopted by many architecture researchers, but I am not clear about how
it is implemented. I describe the idea briefly below.
For a multi-programmed workload consisting of M programs, this
methodology first fast-forwards N1 instructions. Before detailed
measurement, it warms up the caches with N2 instructions. Then detailed
simulation is carried out until all programs have executed N3
instructions. Statistics are reported only for the first N3
instructions of each program in detailed simulation.
I want to know how to implement this with gem5 in practice. As far as
I know, gem5 provides the '-s' option to support a mode switch from
TimingSimpleCPU to DetailedCPU (O3). However, I have no idea how to
make each program execute a fixed N3 instructions. Besides, if
some programs finish retiring N3 instructions before others, how can I
dump the stats so that they are correct for all programs, each having
executed N3 instructions?
If your programs run long enough, then you would be able to take
the required measurements before any one of them finishes. Instead of
trying to force each program to execute a fixed number of
instructions, you can force each program to execute a minimum number
of instructions during the fast-forward, cache warmup and detailed
simulation modes. So only when all the programs have executed at least
N1 instructions should you switch the CPU. Similarly, when all
the programs have executed at least N2 instructions, reset the
statistics and start the detailed simulation.
If you look into how the options -F and -W are used in the file
configs/common/Simulation.py, you should be able to make it work for
a multiple-CPU system as well.
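
The "minimum number of instructions per phase" scheme described above can be sketched as a small control loop. This is a toy model in plain Python, not the logic of Simulation.py: the phase names, the fixed IPCs, and the lockstep quantum are assumptions. In an actual gem5 script the phase boundaries are where the CPU switch, m5.stats.reset() and m5.stats.dump() would go.

```python
# Toy model, not gem5 code: a phase ends only when the slowest program
# has committed that phase's minimum instruction count.

PHASES = ["fast-forward", "warmup", "detailed"]

def run_phases(ipcs, minimums, quantum=100):
    """minimums[k] is the least number of instructions every program
    must commit in phase k. Returns a list of
    (phase name, cycle at which the phase ended, per-program counts)."""
    cycle = 0
    log = []
    for phase, need in zip(PHASES, minimums):
        committed = [0.0] * len(ipcs)   # counters restart in each phase
        while min(committed) < need:
            cycle += quantum
            committed = [c + ipc * quantum
                         for c, ipc in zip(committed, ipcs)]
        # Phase boundary: switch CPUs, reset stats, or dump stats here.
        log.append((phase, cycle, committed))
    return log

# Hypothetical N1, N2, N3 and per-program IPCs.
log = run_phases([2.0, 1.0], minimums=(1000, 500, 2000))
for entry in log:
    print(entry)
```

Faster programs simply commit more than the minimum in each phase, so no core goes idle while the others catch up.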
--
Nilay