Thanks! I'll take some time to analyze the execution traces.
In fact, I'm using the SPARC ISA, but I suppose it works in a similar way to
Alpha.
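
For the trace analysis, here is a minimal sketch of how the first point of
divergence between two execution traces could be located. It is not tracediff
itself, just a stand-in for the same idea. It assumes each trace line starts
with a "<tick>:" field (which differs between cache configurations and is
therefore stripped before comparing); that line format and the function names
are assumptions for illustration.

```python
from itertools import zip_longest


def strip_tick(line):
    """Drop a leading '<tick>:' field so runs with different timing compare equal."""
    head, sep, rest = line.partition(":")
    if sep and head.strip().isdigit():
        return rest.strip()
    return line.strip()


def first_divergence(lines_a, lines_b):
    """Return (index, record_a, record_b) for the first differing record.

    Returns None when both traces are identical. A trace that ends early
    shows up as None on its side of the tuple.
    """
    for i, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        rec_a = strip_tick(a) if a is not None else None
        rec_b = strip_tick(b) if b is not None else None
        if rec_a != rec_b:
            return i, rec_a, rec_b
    return None
```

Both arguments can be open file objects, so large traces are read line by
line rather than loaded into memory.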

One thing I forgot to mention: I'm using the m5threads library.
I think the discrepancy might be caused by the thread synchronization mechanism.
As soon as I discover the cause, I'll report back here.
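
In the meantime, the check Steve suggests in the quoted message below
(does the instruction-count discrepancy match the number of TLB misses?) could
be sketched roughly as follows. This reads his suggestion as "subtract the TLB
misses from sim_insts and see whether the two runs then agree". The stat names
in MISS_KEYS are placeholders, not confirmed names; the real ones depend on
the ISA and configuration and have to be looked up in stats.txt.

```python
# Placeholder stat names (assumption): replace with the TLB miss counters
# that actually appear in your stats.txt.
MISS_KEYS = ["system.cpu.dtb.misses", "system.cpu.itb.misses"]


def parse_stats(text):
    """Parse 'name value # comment' lines into {name: float}; skip other lines."""
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # header/separator lines have no numeric value
    return stats


def adjusted_insts(stats, miss_keys):
    """Instruction count with one re-executed instruction removed per TLB miss."""
    return stats["sim_insts"] - sum(stats.get(k, 0.0) for k in miss_keys)
```

If adjusted_insts() gives the same value for both cache configurations, the
discrepancy is explained by TLB-miss restarts; if not, something else is
going on. When a file contains several stat dumps, the last value of each
stat wins.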

On Thu, Jun 2, 2011 at 11:04 AM, Steve Reinhardt <ste...@gmail.com> wrote:

> Depending on your ISA, instructions that miss in the TLB may be
> counted twice (if it's a trap/SW handler/restart mechanism, like
> Alpha), and we use TLBs even in SE mode to map the virtual space into
> a contiguous physical space.  So if you're using Alpha the first thing
> I'd check is whether the instruction count discrepancy matches the
> number of TLB misses.
>
> Other than that though, I agree, it's puzzling, but tracediff will
> tell you the answer.
>
> Steve
>
> On Thu, Jun 2, 2011 at 6:31 AM, Ali Saidi <sa...@umich.edu> wrote:
> > Yes, it is. The only way to see what is going on is to use tracediff and see
> > where the execution diverges.
> > Ali
> > On Jun 2, 2011, at 6:48 AM, Gustavo Henrique Nihei wrote:
> >
> > First, sorry for bringing back an old thread.
> > But I'm still confused about this matter. I'm not running FS.
> > So, when running in SE mode, isn't it odd that for a single application
> > and different cache configurations, the number of simulated instructions
> > differs between simulations?
> > I mean, if there's no underlying OS, just the application, I would expect
> > the CPU to execute only the instructions provided by the app
> > binary, or am I missing something here?
> > Thanks.
> > On Tue, Jan 25, 2011 at 12:46 PM, Steve Reinhardt <ste...@gmail.com> wrote:
> >>
> >> Yes, it's almost impossible to get completely identical behavior without
> >> running a completely identical system.  Even making the cache larger will
> >> make the program run faster in some phases, which will change where timer
> >> interrupts happen with respect to program execution.
> >> If you look at larger time windows and/or more samples, the mean behavior
> >> should stabilize, but trying to correlate individual small samples like
> >> you're doing is going to be extremely challenging.
> >> This paper focuses on these issues in multiprocessor systems, but most of
> >> what it talks about is relevant to uniprocessor systems running a full OS
> >> too:
> >> http://pages.cs.wisc.edu/~alaa/papers/ieeemicro03_variability.pdf
> >> Steve
> >>
> >> On Mon, Jan 24, 2011 at 10:34 PM, Stevenson Jian <stevensonj...@gmail.com> wrote:
> >>>
> >>> Yes, I am running in FS mode. Is it normal for the OS to make that much
> >>> difference?
> >>> These statistics are taken after the benchmarks have started.
> >>> Thanks!
> >>> Steve
> >>> On Tue, Jan 25, 2011 at 12:00 AM, Steve Reinhardt <ste...@gmail.com>
> >>> wrote:
> >>>>
> >>>> OK, sorry for the confusion; since you were running a Parsec benchmark I
> >>>> assumed the numbers were processor IDs.  Are you running in FS mode?  Are
> >>>> these statistics taken from the beginning when Linux is booting, or are they
> >>>> after the benchmark has started running?
> >>>> Steve
> >>>>
> >>>> On Mon, Jan 24, 2011 at 5:51 PM, Stevenson Jian
> >>>> <stevensonj...@gmail.com> wrote:
> >>>>>
> >>>>> Thanks for replying, Steve. I only used a single processor in both
> >>>>> simulations. What is shown is not the output from individual processors, but
> >>>>> that of the same processor at the end of every 100,000 instructions (see
> >>>>> sim_insts increment by 100,000 each time).
> >>>>>
> >>>>> On Mon, Jan 24, 2011 at 7:14 PM, Steve Reinhardt <ste...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> With a multiprocessor, seemingly small changes in configuration can
> >>>>>> have a significant impact if they change the order in which threads grab a
> >>>>>> lock, or something like that. So in particular, for the stats you have
> >>>>>> below, it seems likely that there's some serialized computation going on
> >>>>>> that happened on processor 3 in the first case and on processor 5 in the
> >>>>>> second case.
> >>>>>> Steve
> >>>>>>
> >>>>>> On Mon, Jan 24, 2011 at 1:30 PM, Stevenson Jian
> >>>>>> <stevensonj...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>> How does the Timing CPU count the number of instructions? If it stalls on a
> >>>>>>> cache miss, do the NOPs count as instructions as well? The reason I ask
> >>>>>>> is that by simply changing the size of the cache, the total number of
> >>>>>>> instructions when the benchmark completes varies by about 0.01-0.1%.
> >>>>>>> Another anomaly that I am observing is that, again by simply changing
> >>>>>>> the size of the L2, the number of overall L2 accesses per, let's say, 100,000
> >>>>>>> instructions can vary by over 100%.
> >>>>>>> The following are 2 runs that I did on M5 with the Freqmine
> >>>>>>> benchmark. The first simulation uses a 1MB 4-way L2 with a latency of 6ns,
> >>>>>>> while the second simulation uses a 2MB 8-way L2 with a latency of 4.5ns. The
> >>>>>>> overall accesses per 100,000 instructions are shown.
> >>>>>>>
> >>>>>>> ---------------------------------------------------------------------------------------------
> >>>>>>> 1MB 4Way L2:
> >>>>>>> 2:
> >>>>>>> sim_insts                                   100200001       # Number of instructions simulated
> >>>>>>> sim_ticks                                   196940000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       3231       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           2515       # number of overall hits
> >>>>>>> 3:
> >>>>>>> sim_insts                                   100300001       # Number of instructions simulated
> >>>>>>> sim_ticks                                   227453000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       4656       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           3434       # number of overall hits
> >>>>>>> 4:
> >>>>>>> sim_insts                                   100400001       # Number of instructions simulated
> >>>>>>> sim_ticks                                   154064000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       1078       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                            722       # number of overall hits
> >>>>>>> 5:
> >>>>>>> sim_insts                                   100500001       # Number of instructions simulated
> >>>>>>> sim_ticks                                   155779000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       1575       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           1154       # number of overall hits
> >>>>>>> ....
> >>>>>>> 2MB 8Way L2:
> >>>>>>> 2:
> >>>>>>> sim_insts                                   100200001       # Number of instructions simulated
> >>>>>>> sim_ticks                                   234810000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       2936       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           1163       # number of overall hits
> >>>>>>> 3:
> >>>>>>> sim_insts                                   100300000       # Number of instructions simulated
> >>>>>>> sim_ticks                                   174173000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       1496       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                            803       # number of overall hits
> >>>>>>> 4:
> >>>>>>> sim_insts                                   100400000       # Number of instructions simulated
> >>>>>>> sim_ticks                                   190135000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       2290       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           1672       # number of overall hits
> >>>>>>> 5:
> >>>>>>> sim_insts                                   100500000       # Number of instructions simulated
> >>>>>>> sim_ticks                                   213086000       # Number of ticks simulated
> >>>>>>> system.l2.overall_accesses                       4554       # number of overall (read+write) accesses
> >>>>>>> system.l2.overall_hits                           3871       # number of overall hits
> >>>>>>> .....
> >>>>>>>
> >>>>>>> ----------------------------------------------------------------------------
> >>>>>>> Even if NOPs are counted as instructions, I don't see how that would
> >>>>>>> make overall accesses per 100,000 instructions vary by as much as 200%. How does M5
> >>>>>>> count the number of instructions?
> >>>>>>> Thanks,
> >>>>>>> Steve
> >>>>>>> _______________________________________________
> >>>>>>> m5-users mailing list
> >>>>>>> m5-us...@m5sim.org
> >>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
> >
> >
> >
> > --
> > Gustavo Henrique Nihei
> > LAPS - Laboratório de Automação do Projeto de Sistemas
> > NIME - Núcleo Interdepartamental de Microeletrônica
> > Universidade Federal de Santa Catarina
> > Florianópolis - Santa Catarina - Brasil
> > _______________________________________________
> > gem5-users mailing list
> > gem5-users@m5sim.org
> > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>
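
As a worked check on the L2 numbers posted in the original message quoted
above, the sketch below computes how far apart the two configurations are in
each 100,000-instruction window, and then over the four windows combined. The
access counts are copied from the quoted stats; the variable and function
names are just for illustration, and only the four windows shown in the post
are used.

```python
# system.l2.overall_accesses for windows 2..5, copied from the quoted stats.
WINDOWS = [2, 3, 4, 5]
L2_1MB = [3231, 4656, 1078, 1575]  # 1MB 4-way L2
L2_2MB = [2936, 1496, 2290, 4554]  # 2MB 8-way L2


def pct_spread(a, b):
    """Difference between two counts as a percentage of the smaller one."""
    lo, hi = sorted((a, b))
    return 100.0 * (hi - lo) / lo


per_window = [pct_spread(a, b) for a, b in zip(L2_1MB, L2_2MB)]
total_spread = pct_spread(sum(L2_1MB), sum(L2_2MB))

for window, spread in zip(WINDOWS, per_window):
    print(f"window {window}: {spread:.0f}% apart")
print(f"all four windows combined: {total_spread:.0f}% apart")
```

Individual windows differ by anywhere from roughly 10% to over 200%, while
the four windows summed together differ by only about 7%. That is in line
with Steve's point that behavior over larger windows is more stable than
small individual samples, although four windows are far too few to draw a
firm conclusion.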



-- 
Gustavo Henrique Nihei
LAPS - Laboratório de Automação do Projeto de Sistemas
NIME - Núcleo Interdepartamental de Microeletrônica
Universidade Federal de Santa Catarina
Florianópolis - Santa Catarina - Brasil
