Geoff,

   Okay, but it looks to me like that error is correctable.  I think the
m5.instantiate(checkpoint_dir) call should only happen within the 'if
options.checkpoint_restore != None:' block (so it needs an extra level of
indentation).  As it is in the repository, it happens regardless of whether
or not you are restoring from a checkpoint, so you're essentially calling
m5.instantiate(None).
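
The indentation fix described above can be sketched as follows. The names
(options, checkpoint_dir, m5) mirror the traceback later in this thread, but
the classes here are self-contained stand-ins, not actual gem5 source:

```python
# Illustrative sketch of the suggested change to Simulation.py's run():
# call m5.instantiate(checkpoint_dir) only when a restore was requested.

class FakeM5:
    """Stand-in for the m5 module; records what instantiate() received."""
    def __init__(self):
        self.calls = []

    def instantiate(self, ckpt_dir):
        self.calls.append(ckpt_dir)


class Options:
    """Stand-in for the parsed command-line options."""
    def __init__(self, checkpoint_restore=None):
        self.checkpoint_restore = checkpoint_restore


def run(options, m5, checkpoint_dir):
    # The fix: instantiate with the checkpoint directory only inside the
    # checkpoint-restore branch, instead of unconditionally.
    if options.checkpoint_restore is not None:
        m5.instantiate(checkpoint_dir)


# Without a restore request, instantiate() is never reached with None.
m5 = FakeM5()
run(Options(checkpoint_restore=None), m5, "/ckpt/dir")
print(m5.calls)

# With a restore request, the checkpoint directory is passed through.
m5 = FakeM5()
run(Options(checkpoint_restore=1), m5, "/ckpt/dir")
print(m5.calls)
```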

-Andrew

On Fri, Mar 2, 2012 at 12:23 PM, Geoffrey Blake <[email protected]> wrote:

> Andrew,
>
> You may want to wait until the most recent patches for the checker are
> pushed that will allow you to just specify --checker on the command
> line.  I forgot the checker as it is now in the tree had broken during
> a recent merge with other changes.  Or, if you go to M5's reviewboard
> you can grab the patches for the checker and apply them.
>
> Geoff
>
> On Fri, Mar 2, 2012 at 11:17 AM, Andrew Cebulski <[email protected]> wrote:
> > I'm getting the following error when running this basic command with
> > the CPU Checker enabled:
> >
> > build/ARM/gem5.fast configs/example/fs.py -b ArmUbuntu --cpu-type=detailed --caches
> >
> > Error in unproxying param 'workload' of system.cpu.checker
> > Traceback (most recent call last):
> >   File "<string>", line 1, in ?
> >   File "/gem5/src/python/m5/main.py", line 361, in main
> >     exec filecode in scope
> >   File "configs/example/fs.py", line 215, in ?
> >     Simulation.run(options, root, test_sys, FutureClass)
> >   File "/gem5/configs/common/Simulation.py", line 246, in run
> >     m5.instantiate(checkpoint_dir)
> >   File "/gem5/src/python/m5/simulate.py", line 66, in instantiate
> >     for obj in root.descendants(): obj.unproxyParams()
> >   File "/gem5/src/python/m5/SimObject.py", line 851, in unproxyParams
> >     value = value.unproxy(self)
> >   File "/gem5/src/python/m5/params.py", line 196, in unproxy
> >     return [v.unproxy(base) for v in self]
> >   File "/gem5/src/python/m5/proxy.py", line 89, in unproxy
> >     result, done = self.find(obj)
> >   File "/gem5/src/python/m5/proxy.py", line 162, in find
> >     val = val[m]
> > IndexError: list index out of range
> >
> > Any idea why this is happening?  I'm not even attempting to launch from a
> > checkpoint here (though this exact error does occur when attempting to
> > restore from a checkpoint now).  Some notes on my environment: I'm
> > running Python 2.4.3, SWIG 1.3.40, and GCC 4.5.3.
> >
> > Note that when I run atomic/timing CPUs, I get a segmentation fault.  I'm
> > assuming this is because they don't have checkers set up in the code.
> > Let me know if otherwise.
> >
> > Thanks,
> > Andrew
> >
> >
> > On Thu, Mar 1, 2012 at 5:00 PM, Ali Saidi <[email protected]> wrote:
> >>
> >> Hi Andrew,
> >>
> >>
> >>
> >> You should be able to re-compile gem5 with USE_CHECKER=1 on the command
> >> line and it will include the checker and run it when you restore to the
> >> o3 cpu.
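
The rebuild Ali describes would look roughly like the following; this is a
sketch assuming a standard SCons-based gem5 checkout, and the exact target
name should match your local build:

```shell
# Rebuild the ARM fast binary with the checker CPU compiled in
# (USE_CHECKER is passed as an SCons build variable).
scons build/ARM/gem5.fast USE_CHECKER=1
```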
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Ali
> >>
> >>
> >>
> >> On 01.03.2012 14:02, Andrew Cebulski wrote:
> >>
> >> Hi Ali,
> >>
> >>     Okay, thanks, I'll try out the checker cpu.  Is this the best
> >> resource available on how to use the Checker CPU? -- http://gem5.org/Checker
> >>     Also, my run restoring the O3 CPU from my checkpoint has the same
> >> result:
> >>     Detailed CPU (checkpoint restore):  system.cpu.committedInsts = 646985567
> >>                                         system.cpu.fetch.Insts    = 648951747
> >> Thanks,
> >> Andrew
> >>
> >> On Thu, Mar 1, 2012 at 2:40 PM, Ali Saidi <[email protected]> wrote:
> >>>
> >>> Hi Andrew,
> >>>
> >>>
> >>>
> >>> The first guess is that possibly the cpu results in a different code
> >>> path or different scheduler decisions which lengthen execution.  Another
> >>> possibility is that the O3 cpu as configured by the arm-detailed
> >>> configuration has some issue; while this is possible, it's not
> >>> incredibly likely.  You could try to restore from the checkpoint and run
> >>> with the checker cpu.  This creates a small atomic-like cpu that sits
> >>> next to the o3 core and verifies its execution, which might tell you if
> >>> there is a bug in the o3 model.
> >>>
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Ali
> >>>
> >>>
> >>>
> >>> On 01.03.2012 13:04, Andrew Cebulski wrote:
> >>>
> >>> Hi,
> >>>     I'm experiencing some problems that I currently am attributing to
> >>> restoring from a checkpoint, then switching to an arm_detailed CPU
> >>> (O3_ARM_v7a_3).  I first noticed the problem because my committed
> >>> instruction counts do not line up between different CPUs for a
> >>> benchmark I'm running (they differ by roughly 170M instructions).  The
> >>> stats below are reset right before running the benchmark, then dumped
> >>> afterwards:
> >>>     Atomic CPU (no checkpoint restore):       system.cpu.numInsts = 476085242
> >>>     Detailed CPU (no checkpoint restore):     system.cpu.committedInsts = 476128320
> >>>                                               system.cpu.fetch.Insts    = 478463491
> >>>     Arm_detailed CPU (checkpoint restore):    system.switch_cpus_1.committedInsts = 646468886
> >>>                                               system.switch_cpus_1.fetch.Insts    = 660969371
> >>>     Arm_detailed CPU (no checkpoint restore): system.cpu.committedInsts = 476107801
> >>>                                               system.cpu.fetch.Insts    = 491814681
> >>>     I included both committed and fetched instruction counts to see
> >>> whether the problem is fetched instructions being counted as committed
> >>> even when they are not (i.e. instructions not getting squashed).  That
> >>> does not seem to be the case from the stats above, as the arm_detailed
> >>> run without a checkpoint has roughly the same gap between fetched and
> >>> committed instructions.  I noticed that the switched-in arm_detailed
> >>> cpu, when restoring from a checkpoint, lacks both an icache and a
> >>> dcache as children, but I read in a previous post that they are
> >>> connected to fetch/iew respectively, so this is probably not the
> >>> issue.  I assume they are just not shown explicitly in the config.ini
> >>> file.
> >>>     I'm running a test right now to see if switching to a regular
> >>> DerivO3CPU has the same issue.  Regardless of its results, does anyone
> >>> have any idea why I'm seeing roughly 170M more committed instructions
> >>> in the arm_detailed CPU run when I restore from a checkpoint?  I've
> >>> attached my config file from the arm_detailed run with a checkpoint
> >>> for reference.
> >>>     Here's the run command for when I use a checkpoint:
> >>>     build/ARM/gem5.fast -d [dir] configs/example/fs.py -b [benchmark] -r 1 --checkpoint-dir=[chkpt-dir] --caches -s
> >>>     Lastly, I'm running off of revision 8813 from 2/3/12.  Let me know
> >>> if you need any more info (e.g. stats).
> >>> Thanks,
> >>> Andrew
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> gem5-users mailing list
> >>> [email protected]
> >>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
> >>
> >>
> >>
> >>
> >
> >
> >
>
