On Mon, 23 Jan 2012, Madhavan manivannan wrote:

Hi,

I am simulating an X86 TimingSimpleCPU (16 cores) with the Ruby memory
model (MESI_CMP_Directory protocol). The m5 and Ruby stats are reset at
the beginning of the parallel region and dumped at the end of it.
However, I observe the following differences between the stats
generated by m5 and Ruby.

1. The number of cycles (cpuxx.numcycles) reported in the m5 stats file
differs from core to core, whereas the number of cycles reported by Ruby
for each processor (cache) is the same. Why does it differ in m5 and not
in Ruby? Since they use the same event queue, I was expecting similar
values (number of cycles simulated) in the Ruby and m5 stats. The stats,
however, show that the Ruby cycles differ from the m5 cycles by between
0 and -20%, depending on the application. Please correct me if I have
missed something here.

It might be that a CPU counts only the cycles during which a thread context was available for execution.
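
One way to check this is to compare each core's cycle count against the Ruby cycle count for the same region. Below is a minimal sketch in plain Python. The stat names (system.cpu00.numCycles and so on), the sample values, and the ruby_cycles figure are made-up assumptions for illustration, not output from a real run.

```python
import re

# Hypothetical excerpt in the usual "name value # description" stats layout.
# The names and numbers are illustrative assumptions, not real results.
SAMPLE_STATS = """\
system.cpu00.numCycles 1000000 # number of cpu cycles simulated
system.cpu01.numCycles 900000 # number of cpu cycles simulated
system.cpu02.numCycles 800000 # number of cpu cycles simulated
"""


def parse_num_cycles(text):
    """Return {cpu_name: cycles} for every stat whose name ends in .numCycles."""
    pattern = re.compile(r"^(\S+)\.numCycles\s+(\d+)", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(text)}


def relative_difference(cpu_cycles, ruby_cycles):
    """Difference of each core's cycles from the Ruby count, as a fraction."""
    return {name: (cycles - ruby_cycles) / ruby_cycles
            for name, cycles in cpu_cycles.items()}


ruby_cycles = 1000000  # assumed Ruby cycle count for the same region
cycles = parse_num_cycles(SAMPLE_STATS)
for name, diff in sorted(relative_difference(cycles, ruby_cycles).items()):
    print(f"{name}: {diff:+.1%}")
```

If the cores with the largest negative difference are also the ones that spent the most time idle or suspended, that would be consistent with the explanation above.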


2. I was expecting to see similar values for the total number of
instructions executed by each core (m5 stats) and the total number of
IFetch events (Ruby stats), since instruction fetch requests in
TimingSimpleCPU use the icacheport, which in turn directs the requests
to the Ruby sequencer. However, the number of IFetch events reported by
Ruby is around 30% lower than the instruction count in the m5 stats.

This is possible, since an instruction is broken down into micro-ops and the micro-ops themselves are not fetched from the cache. It might be that the instruction count you refer to is actually the number of micro-ops that were executed.
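
As a rough sanity check, you can compute how many executed operations correspond to one fetch. The sketch below uses made-up counts (fetches 30% below the executed count), and the function name is hypothetical. It also assumes the whole gap comes from micro-ops, which may not be the case.

```python
def ops_per_fetch(executed_count, ifetch_count):
    """Average number of executed operations per IFetch event seen by Ruby."""
    if ifetch_count <= 0:
        raise ValueError("ifetch_count must be positive")
    return executed_count / ifetch_count


# Illustrative numbers only, not taken from a real run.
executed = 1000  # count reported on the CPU side
fetches = 700    # IFetch events reported by Ruby (30% lower)
print(round(ops_per_fetch(executed, fetches), 2))
```

A ratio noticeably above 1 suggests that the CPU-side count includes more than one operation per fetched instruction.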


3. Why is there a considerable difference between the cumulative sum of
the miss latencies measured at each sequencer and the total number of
Ruby cycles simulated? Is it because the Ruby cycle count includes CPU
latencies in addition to cache latencies?

You need to rephrase the question.

--
Nilay
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
