Ah, if you're running a multithreaded program, then yes, it's not at
all surprising that the instruction counts change.  Not many people
run multithreaded programs under SE mode, which is why I was confused.

It's really only single-threaded SE-mode programs that should have
repeatable instruction counts across different configurations.

Steve

On Sat, Jun 4, 2011 at 6:30 AM, Gustavo Henrique Nihei
<ghni...@gmail.com> wrote:
> Thanks! I'll take some time to analyze the execution traces.
> In fact, I'm using the SPARC ISA, but I suppose it works in a similar way to
> Alpha.
> One thing I forgot to mention: I'm using the m5threads library.
> I think the discrepancy might be caused by the thread synchronization
> mechanism.
> As soon as I discover the cause, I'll report back here.
> On Thu, Jun 2, 2011 at 11:04 AM, Steve Reinhardt <ste...@gmail.com> wrote:
>>
>> Depending on your ISA, instructions that miss in the TLB may be
>> counted twice (if it's a trap/SW handler/restart mechanism, like
>> Alpha), and we use TLBs even in SE mode to map the virtual space into
>> a contiguous physical space.  So if you're using Alpha the first thing
>> I'd check is whether the instruction count discrepancy matches the
>> number of TLB misses.
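The check suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stat names `system.cpu.dtb.misses` and `system.cpu.itb.misses` are placeholders (the actual TLB stat names depend on the ISA and simulator version), the helper names are hypothetical, and the numbers in `RUN_A`/`RUN_B` are made up for the example rather than taken from any run in this thread. The parser only assumes the `name value # description` layout of the stats dumps quoted further down.

```python
import re

# Matches integer-valued lines in the stats dump layout quoted in this thread:
#   <name> <integer value> [# <description>]
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+)(?:\s+#.*)?$")


def parse_stats(text):
    """Return {stat_name: int_value} for every integer stat line in text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def compare_with_tlb_misses(stats_a, stats_b, tlb_stat_names):
    """Compare the sim_insts difference with the TLB miss difference.

    Returns (inst_delta, miss_delta, matches). A match is only a hint that
    re-executed instructions after TLB misses explain the discrepancy.
    """
    inst_delta = stats_b["sim_insts"] - stats_a["sim_insts"]
    miss_delta = sum(
        stats_b.get(name, 0) - stats_a.get(name, 0) for name in tlb_stat_names
    )
    return inst_delta, miss_delta, inst_delta == miss_delta


# Assumed stat names; check your own stats file for the real ones.
TLB_STATS = ["system.cpu.dtb.misses", "system.cpu.itb.misses"]

# Made-up example dumps, NOT data from this thread.
RUN_A = """
sim_insts                 1000000  # Number of instructions simulated
system.cpu.dtb.misses         120  # hypothetical data TLB misses
system.cpu.itb.misses          30  # hypothetical instruction TLB misses
"""
RUN_B = """
sim_insts                 1000025  # Number of instructions simulated
system.cpu.dtb.misses         140  # hypothetical data TLB misses
system.cpu.itb.misses          35  # hypothetical instruction TLB misses
"""

result = compare_with_tlb_misses(parse_stats(RUN_A), parse_stats(RUN_B), TLB_STATS)
print(result)
```

If the two deltas agree, TLB-miss restarts are a plausible explanation; if they do not, tracediff is still the way to find where execution diverges.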
>>
>> Other than that though, I agree, it's puzzling, but tracediff will
>> tell you the answer.
>>
>> Steve
>>
>> On Thu, Jun 2, 2011 at 6:31 AM, Ali Saidi <sa...@umich.edu> wrote:
>> > Yes, it is. The only way to see what is going on is to use tracediff and
>> > see where the execution diverges.
>> > Ali
>> > On Jun 2, 2011, at 6:48 AM, Gustavo Henrique Nihei wrote:
>> >
>> > First, sorry for bringing back an old thread.
>> > But I'm still confused by this matter. I'm not running FS.
>> > So, when running in SE mode, isn't it weird that for a single application
>> > and different cache configurations, the number of simulated instructions
>> > differs between simulations?
>> > I mean, if there's no underlying OS, just the application, I would expect
>> > the CPU to execute only the instructions provided by the app binary. Or am
>> > I missing something here?
>> > Thanks.
>> > On Tue, Jan 25, 2011 at 12:46 PM, Steve Reinhardt <ste...@gmail.com>
>> > wrote:
>> >>
>> >> Yes, it's almost impossible to get completely identical behavior without
>> >> running a completely identical system.  Even making the cache larger will
>> >> make the program run faster in some phases, which will change where timer
>> >> interrupts happen with respect to program execution.
>> >> If you look at larger time windows and/or more samples, the mean
>> >> behavior
>> >> should stabilize, but trying to correlate individual small samples like
>> >> you're doing is going to be extremely challenging.
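As a rough illustration of the point about larger windows, the sketch below uses the `system.l2.overall_accesses` values from the two runs quoted further down this thread (windows 2 to 5 of each run). The helper name `percent_diff` is hypothetical. Four windows is far too small a sample to draw conclusions from; the example only shows the arithmetic of comparing individual windows versus an aggregate.

```python
# L2 overall_accesses per 100,000-instruction window, copied from the
# stats dumps quoted in this thread (windows 2-5 of each run).
ACCESSES_1MB_4WAY = [3231, 4656, 1078, 1575]
ACCESSES_2MB_8WAY = [2936, 1496, 2290, 4554]


def percent_diff(a, b):
    """Difference of b relative to a, in percent."""
    return 100.0 * (b - a) / a


# Window-by-window comparison: each entry compares the same window index.
per_window = [
    percent_diff(a, b) for a, b in zip(ACCESSES_1MB_4WAY, ACCESSES_2MB_8WAY)
]

# Aggregate comparison: sum all windows first, then compare the totals.
aggregate = percent_diff(sum(ACCESSES_1MB_4WAY), sum(ACCESSES_2MB_8WAY))

for index, diff in enumerate(per_window, start=2):
    print(f"window {index}: {diff:+.1f}%")
print(f"all four windows combined: {aggregate:+.1f}%")
```

With these numbers the individual windows differ by roughly -68% to +189%, while the totals over all four windows (10540 versus 11276 accesses) differ by about 7%.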
>> >> This paper focuses on these issues in multiprocessor systems, but most of
>> >> what it talks about is relevant to uniprocessor systems running a full OS
>> >> too:
>> >> http://pages.cs.wisc.edu/~alaa/papers/ieeemicro03_variability.pdf
>> >> Steve
>> >>
>> >> On Mon, Jan 24, 2011 at 10:34 PM, Stevenson Jian
>> >> <stevensonj...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Yes, I am running in FS mode. Is it normal for the OS to make that much
>> >>> difference?
>> >>> These statistics are taken after the benchmarks have started.
>> >>> Thanks!
>> >>> Steve
>> >>> On Tue, Jan 25, 2011 at 12:00 AM, Steve Reinhardt <ste...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> OK, sorry for the confusion; since you were running a Parsec benchmark
>> >>>> I assumed the numbers were processor IDs.  Are you running in FS mode?
>> >>>> Are these statistics taken from the beginning when Linux is booting, or
>> >>>> are they after the benchmark has started running?
>> >>>> Steve
>> >>>>
>> >>>> On Mon, Jan 24, 2011 at 5:51 PM, Stevenson Jian
>> >>>> <stevensonj...@gmail.com> wrote:
>> >>>>>
>> >>>>> Thanks for replying, Steve. I only used a single processor in both
>> >>>>> simulations. What is shown is not the output from individual
>> >>>>> processors, but that of the same processor at the end of every 100,000
>> >>>>> instructions (note that sim_insts increments by 100,000 each time).
>> >>>>>
>> >>>>> On Mon, Jan 24, 2011 at 7:14 PM, Steve Reinhardt <ste...@gmail.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> With a multiprocessor, seemingly small changes in configuration can
>> >>>>>> have a significant impact if they change the order in which threads
>> >>>>>> grab a lock, or something like that. So in particular, for the stats
>> >>>>>> you have below, it seems likely that there's some serialized
>> >>>>>> computation going on that happened on processor 3 in the first case
>> >>>>>> and on processor 5 in the second case.
>> >>>>>> Steve
>> >>>>>>
>> >>>>>> On Mon, Jan 24, 2011 at 1:30 PM, Stevenson Jian
>> >>>>>> <stevensonj...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>> How does the Timing CPU count the number of instructions? If it
>> >>>>>>> stalls on a cache miss, do the Nops count as instructions as well?
>> >>>>>>> The reason I ask is that by simply changing the size of the cache,
>> >>>>>>> the total number of instructions when the benchmark completes varies
>> >>>>>>> by about 0.01% to 0.1%.
>> >>>>>>> Another anomaly I am observing is that, again by simply changing the
>> >>>>>>> size of the L2, the number of overall L2 accesses per, say, 100,000
>> >>>>>>> instructions can vary by over 100%.
>> >>>>>>> The following are two runs that I did on M5 with the Freqmine
>> >>>>>>> benchmark. The first simulation uses a 1MB 4-way L2 with a latency
>> >>>>>>> of 6ns, while the second uses a 2MB 8-way L2 with a latency of
>> >>>>>>> 4.5ns. The overall accesses per 100,000 instructions are shown.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> ---------------------------------------------------------------------------------------------
>> >>>>>>> 1MB 4Way L2:
>> >>>>>>> 2:
>> >>>>>>> sim_insts                   100200001  # Number of instructions simulated
>> >>>>>>> sim_ticks                   196940000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       3231  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           2515  # number of overall hits
>> >>>>>>> 3:
>> >>>>>>> sim_insts                   100300001  # Number of instructions simulated
>> >>>>>>> sim_ticks                   227453000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       4656  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           3434  # number of overall hits
>> >>>>>>> 4:
>> >>>>>>> sim_insts                   100400001  # Number of instructions simulated
>> >>>>>>> sim_ticks                   154064000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       1078  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits            722  # number of overall hits
>> >>>>>>> 5:
>> >>>>>>> sim_insts                   100500001  # Number of instructions simulated
>> >>>>>>> sim_ticks                   155779000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       1575  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           1154  # number of overall hits
>> >>>>>>> ....
>> >>>>>>> 2MB 8Way L2:
>> >>>>>>> 2:
>> >>>>>>> sim_insts                   100200001  # Number of instructions simulated
>> >>>>>>> sim_ticks                   234810000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       2936  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           1163  # number of overall hits
>> >>>>>>> 3:
>> >>>>>>> sim_insts                   100300000  # Number of instructions simulated
>> >>>>>>> sim_ticks                   174173000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       1496  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits            803  # number of overall hits
>> >>>>>>> 4:
>> >>>>>>> sim_insts                   100400000  # Number of instructions simulated
>> >>>>>>> sim_ticks                   190135000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       2290  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           1672  # number of overall hits
>> >>>>>>> 5:
>> >>>>>>> sim_insts                   100500000  # Number of instructions simulated
>> >>>>>>> sim_ticks                   213086000  # Number of ticks simulated
>> >>>>>>> system.l2.overall_accesses       4554  # number of overall (read+write) accesses
>> >>>>>>> system.l2.overall_hits           3871  # number of overall hits
>> >>>>>>> .....
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> ----------------------------------------------------------------------------
>> >>>>>>> Even if Nops are counted as instructions, I don't see how that would
>> >>>>>>> make overall accesses per 100,000 instructions vary by as much as
>> >>>>>>> 200%. How does M5 count the number of instructions?
>> >>>>>>> Thanks,
>> >>>>>>> Steve
>> >>>>>>> _______________________________________________
>> >>>>>>> m5-users mailing list
>> >>>>>>> m5-us...@m5sim.org
>> >>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>> >>>>>>
>> >>>>>>
>> >
>> >
>> >
>> > --
>> > Gustavo Henrique Nihei
>> > LAPS - Laboratório de Automação do Projeto de Sistemas
>> > NIME - Núcleo Interdepartamental de Microeletrônica
>> > Universidade Federal de Santa Catarina
>> > Florianópolis - Santa Catarina - Brasil
>> > _______________________________________________
>> > gem5-users mailing list
>> > gem5-users@m5sim.org
>> > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>> >
>
>
>
> --
> Gustavo Henrique Nihei
> LAPS - Laboratório de Automação do Projeto de Sistemas
> NIME - Núcleo Interdepartamental de Microeletrônica
> Universidade Federal de Santa Catarina
> Florianópolis - Santa Catarina - Brasil
>