Re: [gem5-users] [m5-users] How does Timing CPU count number of instructions?

Steve Reinhardt Thu, 02 Jun 2011 07:04:54 -0700

Depending on your ISA, instructions that miss in the TLB may be
counted twice (if it's a trap/SW handler/restart mechanism, like
Alpha), and we use TLBs even in SE mode to map the virtual space into
a contiguous physical space.  So if you're using Alpha the first thing
I'd check is whether the instruction count discrepancy matches the
number of TLB misses.
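
A minimal sketch of that check, assuming two stats dumps in the usual "name value # comment" layout. The TLB miss stat names used in the usage note (`system.cpu.dtb.misses`, `system.cpu.itb.misses`) are hypothetical placeholders; substitute whatever your configuration actually reports.

```python
import re

# Matches "name <integer> ..." lines; headers and float-valued stats are skipped.
STAT_LINE = re.compile(r"^\s*(\S+)\s+(-?\d+)(?:\s|$)")


def parse_stats(text):
    """Return {stat_name: int_value} for every integer-valued stat line."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def compare_inst_and_tlb_diff(stats_a, stats_b, miss_keys):
    """Compare the sim_insts difference with the TLB miss difference.

    miss_keys are the names of the TLB miss counters (an assumption:
    the real names depend on the ISA and configuration).
    Returns (inst_diff, miss_diff, diffs_are_equal).
    """
    inst_diff = stats_a["sim_insts"] - stats_b["sim_insts"]
    misses_a = sum(stats_a.get(key, 0) for key in miss_keys)
    misses_b = sum(stats_b.get(key, 0) for key in miss_keys)
    miss_diff = misses_a - misses_b
    return inst_diff, miss_diff, inst_diff == miss_diff
```

For example, `compare_inst_and_tlb_diff(parse_stats(open("a/stats.txt").read()), parse_stats(open("b/stats.txt").read()), ["system.cpu.dtb.misses", "system.cpu.itb.misses"])` returns both differences plus a flag saying whether they match. A match points at replayed instructions after TLB misses; a mismatch means the divergence comes from somewhere else.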


Other than that though, I agree, it's puzzling, but tracediff will
tell you the answer.
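
To illustrate the idea of finding where execution diverges (this is not the tracediff script itself, only a sketch of the concept), the following compares two instruction traces line by line. It assumes each trace line starts with a "<tick>: " prefix, which has to be ignored because ticks differ whenever the cache configuration differs; that line format is an assumption, so adjust the pattern to your actual trace output.

```python
import re
from itertools import zip_longest

# Assumed line format: "<tick>: <rest of trace record>".
TICK_PREFIX = re.compile(r"^\s*\d+:\s*")


def strip_tick(line):
    """Drop the leading tick so only the executed-instruction part is compared."""
    return TICK_PREFIX.sub("", line.rstrip("\n"))


def first_divergence(trace_a, trace_b):
    """Return (index, line_a, line_b) for the first differing record, or None.

    trace_a and trace_b are iterables of lines (open files work).
    If one trace ends early, the missing side is reported as None.
    """
    for index, (line_a, line_b) in enumerate(zip_longest(trace_a, trace_b)):
        if line_a is None or line_b is None:
            return index, line_a, line_b
        if strip_tick(line_a) != strip_tick(line_b):
            return index, line_a, line_b
    return None
```

This only reports the first mismatch and does not try to resynchronize afterwards, which is usually enough to see what kind of event (a trap, a replayed instruction, a syscall return value) caused the two runs to part ways.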

Steve

On Thu, Jun 2, 2011 at 6:31 AM, Ali Saidi <sa...@umich.edu> wrote:
> Yes, it is. The only way to see what is going on is to use tracediff and see
> where the execution diverges.
> Ali
> On Jun 2, 2011, at 6:48 AM, Gustavo Henrique Nihei wrote:
>
> First, sorry for bringing back an old thread.
> But I'm still confused about this matter. I'm not running FS.
> So, when running in SE mode, isn't it weird that for a single application
> and different cache configurations, the number of simulated instructions
> differs between simulations?
> I mean, if there's no underlying OS, just the application, I would expect
> the CPU to execute only the instructions provided by the app binary, or am
> I missing something here?
> Thanks.
> On Tue, Jan 25, 2011 at 12:46 PM, Steve Reinhardt <ste...@gmail.com> wrote:
>>
>> Yes, it's almost impossible to get completely identical behavior without
>> running a completely identical system.  Even making the cache larger will
>> make the program run faster in some phases, which will change where timer
>> interrupts happen with respect to program execution.
>> If you look at larger time windows and/or more samples, the mean behavior
>> should stabilize, but trying to correlate individual small samples like
>> you're doing is going to be extremely challenging.
>> This paper focuses on these issues in multiprocessor systems, but most of
>> what it talks about is relevant to uniprocessor systems running a full OS
>> too:
>> http://pages.cs.wisc.edu/~alaa/papers/ieeemicro03_variability.pdf
>> Steve
>>
>> On Mon, Jan 24, 2011 at 10:34 PM, Stevenson Jian <stevensonj...@gmail.com>
>> wrote:
>>>
>>> Yes, I am running in FS mode. Is it normal for the OS to make that much
>>> difference?
>>> These statistics are taken after the benchmarks have started.
>>> Thanks!
>>> Steve
>>> On Tue, Jan 25, 2011 at 12:00 AM, Steve Reinhardt <ste...@gmail.com>
>>> wrote:
>>>>
>>>> OK, sorry for the confusion; since you were running a Parsec benchmark I
>>>> assumed the numbers were processor IDs.  Are you running in FS mode?  Are
>>>> these statistics taken from the beginning when Linux is booting, or are
>>>> they after the benchmark has started running?
>>>> Steve
>>>>
>>>> On Mon, Jan 24, 2011 at 5:51 PM, Stevenson Jian
>>>> <stevensonj...@gmail.com> wrote:
>>>>>
>>>>> Thanks for replying Steve. I only used a single processor in both
>>>>> simulations. What is shown is not the output from individual processors,
>>>>> but that of the same processor at the end of every 100,000 instructions
>>>>> (see sim_insts increment by 100,000 each time).
>>>>>
>>>>> On Mon, Jan 24, 2011 at 7:14 PM, Steve Reinhardt <ste...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> With a multiprocessor, seemingly small changes in configuration can
>>>>>> have a significant impact if they change the order in which threads grab a
>>>>>> lock, or something like that. So in particular, for the stats you have
>>>>>> below, it seems likely that there's some serialized computation going on
>>>>>> that happened on processor 3 in the first case and on processor 5 in the
>>>>>> second case.
>>>>>> Steve
>>>>>>
>>>>>> On Mon, Jan 24, 2011 at 1:30 PM, Stevenson Jian
>>>>>> <stevensonj...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>> How does the Timing CPU count the number of instructions? If it stalls
>>>>>>> on a cache miss, do the NOPs count as instructions as well? I ask
>>>>>>> because simply changing the size of the cache changes the total number
>>>>>>> of instructions at benchmark completion by about 0.01% to 0.1%.
>>>>>>> Another anomaly I am observing is that, again by simply changing the
>>>>>>> size of the L2, the number of overall L2 accesses per, say, 100,000
>>>>>>> instructions can vary by over 100%.
>>>>>>> The following are two runs that I did on m5 with the Freqmine
>>>>>>> benchmark. The first simulation uses a 1MB 4-way L2 with a latency of
>>>>>>> 6ns, while the second uses a 2MB 8-way L2 with a latency of 4.5ns. The
>>>>>>> overall accesses per 100,000 instructions are shown.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------------------------------
>>>>>>> 1MB 4-way L2:
>>>>>>> 2:
>>>>>>> sim_insts                   100200001  # Number of instructions simulated
>>>>>>> sim_ticks                   196940000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       3231  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           2515  # number of overall hits
>>>>>>> 3:
>>>>>>> sim_insts                   100300001  # Number of instructions simulated
>>>>>>> sim_ticks                   227453000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       4656  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           3434  # number of overall hits
>>>>>>> 4:
>>>>>>> sim_insts                   100400001  # Number of instructions simulated
>>>>>>> sim_ticks                   154064000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       1078  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits            722  # number of overall hits
>>>>>>> 5:
>>>>>>> sim_insts                   100500001  # Number of instructions simulated
>>>>>>> sim_ticks                   155779000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       1575  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           1154  # number of overall hits
>>>>>>> ....
>>>>>>> 2MB 8-way L2:
>>>>>>> 2:
>>>>>>> sim_insts                   100200001  # Number of instructions simulated
>>>>>>> sim_ticks                   234810000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       2936  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           1163  # number of overall hits
>>>>>>> 3:
>>>>>>> sim_insts                   100300000  # Number of instructions simulated
>>>>>>> sim_ticks                   174173000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       1496  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits            803  # number of overall hits
>>>>>>> 4:
>>>>>>> sim_insts                   100400000  # Number of instructions simulated
>>>>>>> sim_ticks                   190135000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       2290  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           1672  # number of overall hits
>>>>>>> 5:
>>>>>>> sim_insts                   100500000  # Number of instructions simulated
>>>>>>> sim_ticks                   213086000  # Number of ticks simulated
>>>>>>> system.l2.overall_accesses       4554  # number of overall (read+write) accesses
>>>>>>> system.l2.overall_hits           3871  # number of overall hits
>>>>>>> .....
>>>>>>>
>>>>>>> ----------------------------------------------------------------------------
>>>>>>> Even if NOPs are counted as instructions, I don't see how that would
>>>>>>> make overall accesses per 100,000 instructions vary by as much as 200%.
>>>>>>> How does M5 count the number of instructions?
>>>>>>> Thanks,
>>>>>>> Steve
>>>>>>> _______________________________________________
>>>>>>> m5-users mailing list
>>>>>>> m5-us...@m5sim.org
>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>
> --
> Gustavo Henrique Nihei
> LAPS - Laboratório de Automação do Projeto de Sistemas
> NIME - Núcleo Interdepartamental de Microeletrônica
> Universidade Federal de Santa Catarina
> Florianópolis - Santa Catarina - Brasil
> _______________________________________________
> gem5-users mailing list
> gem5-users@m5sim.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users