I know the '-F' and '-W' options. Actually, I use them together with the '-I' option to specify the number of instructions for detailed simulation (denoted N3 in my previous mail). It seems that the default implementation in configs/common/Simulation.py passes N3 to cpu[i].max_insts_any_thread. Thus, as soon as any program finishes N3 instructions, the whole simulation exits. To avoid this, I modified the default implementation to pass N3 to cpu[i].max_insts_all_threads instead, which forces each program to commit at least N3 instructions. The final total number of instructions simulated will then be N3 * Nr_cores. But this approach has a pitfall compared with the methodology I referred to. For a multi-programmed workload, once a program finishes its N3 instructions, the corresponding core has no task left to schedule (I assume the number of programs in the workload is no more than the number of simulated cores). Thus, it may not be reasonable to evaluate the impact on shared resource contention from the final statistics report.
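
To make the two stopping rules and the pitfall concrete, here is a toy model in plain Python. It is not gem5 code and does not reproduce gem5's actual behaviour: the function name simulate, the fixed per-program IPC values and the cycle loop are all assumptions made for illustration. It only mirrors the description above, where a core goes idle once its program has committed N3 instructions.

```python
def simulate(ipcs, n3, stop_rule):
    """Toy model of N programs, one per core (not gem5 code).

    ipcs:      hypothetical constant instructions-per-cycle of each program
               (must be positive).
    n3:        instructions each program should commit in detailed mode.
    stop_rule: "any" stops when the first program reaches n3,
               "all" stops when every program has reached n3.
    Returns (cycles, committed per core, idle cycles per core).
    """
    committed = [0] * len(ipcs)
    idle_cycles = [0] * len(ipcs)
    cycles = 0
    finished = any if stop_rule == "any" else all
    while not finished(c >= n3 for c in committed):
        cycles += 1
        for core, ipc in enumerate(ipcs):
            if committed[core] >= n3:
                # Program is done: the core has no task left to schedule.
                idle_cycles[core] += 1
            else:
                committed[core] = min(n3, committed[core] + ipc)
    return cycles, committed, idle_cycles


if __name__ == "__main__":
    print(simulate([2, 1], 100, "any"))
    print(simulate([2, 1], 100, "all"))
```

With the "all" rule, the faster program's core sits idle for the second half of the run, so the slower program sees less contention on shared resources than it would in a fully loaded system.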

Based on this, I have an idea for reporting statistics more reasonably. Can we run detailed simulation for N3 * 2 instructions per program (so the total number of instructions simulated is (N3 * 2) * Nr_cores) but dump the stats only once the first N3 instructions have completed? However, I am not clear on the internals of the stats dump.
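
The idea can be checked on the same kind of toy model before touching gem5. The sketch below is plain Python, not gem5 code; the name snapshot_at_n3 and the constant IPC values are assumptions for illustration. Each program keeps running up to 2 * N3 instructions, and the cycle at which it crosses N3 is recorded as the point where its stats would be taken.

```python
def snapshot_at_n3(ipcs, n3):
    """Toy model: run each program up to 2 * n3, snapshot at n3.

    ipcs: hypothetical constant instructions-per-cycle of each program
          (must be positive).
    Returns (snapshot cycle per core, idle cycles per core), where idle
    cycles count the time a core had already used up its 2 * n3
    instructions while some other program had not yet reached n3.
    """
    limit = 2 * n3
    committed = [0] * len(ipcs)
    snapshot_cycle = [None] * len(ipcs)
    idle_cycles = [0] * len(ipcs)
    cycle = 0
    # Stop once every program has had its snapshot taken.
    while any(s is None for s in snapshot_cycle):
        cycle += 1
        for core, ipc in enumerate(ipcs):
            if committed[core] >= limit:
                idle_cycles[core] += 1  # ran out of work before the end
                continue
            committed[core] = min(limit, committed[core] + ipc)
            if snapshot_cycle[core] is None and committed[core] >= n3:
                snapshot_cycle[core] = cycle
    return snapshot_cycle, idle_cycles


if __name__ == "__main__":
    print(snapshot_at_n3([2, 1], 100))
    print(snapshot_at_n3([3, 1], 100))
```

In this model the factor of 2 is only enough while the fastest program is at most twice as fast as the slowest one. With IPCs of 3 and 1 the faster core still goes idle before the slower program reaches N3, so the extra budget would have to be chosen with the expected speed difference in mind.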


Hanfeng

On 12/13/2012 11:45 PM, Nilay Vaish wrote:
On Wed, 12 Dec 2012, hanfeng QIN wrote:

Hi all,

I have learned of a common multi-programmed simulation methodology adopted by many architecture researchers, but I am not clear on how it is implemented. I describe the idea briefly below.

For a multi-programmed workload consisting of M programs, this methodology first fast-forwards N1 instructions. Before detailed measurement, it warms up the caches with N2 instructions. Then detailed simulation is carried out until all programs have executed N3 instructions. Statistics are reported only for the first N3 instructions in detailed simulation.

I want to know how to implement this with gem5 in practice. As far as I know, gem5 provides the '-s' option to support switching modes from TimingSimpleCPU to DetailedCPU (O3). However, I have no idea how to make each program execute a fixed N3 instructions. Besides, if some programs finish retiring N3 instructions before others, how should the stats be dumped to ensure they are correct for all programs that have executed N3 instructions?



If your programs run long enough, then you will be able to take the required measurements before any one of them finishes. Instead of trying to force each program to execute a fixed number of instructions, you can force each program to execute a minimum number of instructions during the fast-forward, cache warmup and detailed simulation modes. So switch the CPU only when all the programs have executed at least N1 instructions. Similarly, when all the programs have executed at least N2 instructions, reset the statistics and start the detailed simulation.
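
The "minimum number of instructions per phase" rule can be sketched as follows. This is a toy model in plain Python, not gem5 code: the function name phase_boundaries, the phase labels and the constant per-program IPC are assumptions for illustration, and a real run would not have the same IPC in every mode.

```python
def phase_boundaries(ipcs, n1, n2, n3):
    """Toy model of phases that end when ALL programs reach a minimum.

    ipcs: hypothetical constant instructions-per-cycle of each program
          (must be positive).
    n1, n2, n3: minimum instructions per program for the fast-forward,
                warmup and detailed phases.
    Returns {phase: (cycle at which the phase ends,
                     instructions each program committed in that phase)}.
    """
    boundaries = {}
    cycle = 0
    phases = (("fast_forward", n1), ("warmup", n2), ("detailed", n3))
    for phase, minimum in phases:
        in_phase = [0] * len(ipcs)  # per-phase counters start from zero
        # The phase lasts until the slowest program reaches the minimum.
        while not all(c >= minimum for c in in_phase):
            cycle += 1
            in_phase = [c + ipc for c, ipc in zip(in_phase, ipcs)]
        boundaries[phase] = (cycle, in_phase)
    return boundaries


if __name__ == "__main__":
    for phase, result in phase_boundaries([2, 1], 10, 4, 6).items():
        print(phase, result)
```

Note that faster programs commit more than the minimum in every phase, so no core runs out of work, at the cost of the programs not all executing the same number of instructions.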

If you look into how the options -F and -W are used in the file configs/common/Simulation.py, you should be able to make it work for a multi-CPU system as well.

--
Nilay

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
