It looks like the TLB trace flag prints out the ASID (in tlb.cc and table_walker.cc). Are there other flags I should use instead or in addition?

Thanks,
Andrew

On Sat, Mar 3, 2012 at 3:12 PM, Ali Saidi wrote:

> Hi Andrew,
>
> You could get a trace using the Exec debug flag and see where the extra instructions are coming from. You might want to sleep for 10 or 15 seconds before running your benchmark and see what happens. Since the solution validates, my guess is that the Linux scheduler isn't cooperating with you, but understanding where all these instructions are coming from is the only way to know for certain. You probably also want to use the trace flags that print out the asid so you can identify one app/pid from another.
>
> Ali
>
> On Mar 3, 2012, at 1:38 PM, Andrew Cebulski wrote:
>
> Hi Ali,
>
> The benchmark is libquantum from SPEC CPU2006. The results are printed out to the system.terminal, so I am able to verify. In all cases it passes with the exact same output. Note that when I don't restore from a checkpoint, the committed instructions for the O3 CPU are roughly the same as atomic (within 100,000 instructions).
>
> Yes, I did actually run atomic CPU with a checkpoint restore. It resulted in 476085242 committed instructions, the exact same as without launching from a checkpoint.
>
> I'll work on getting you results from another benchmark. In the meantime, let me know if you have any other ideas.
>
> Thanks,
> Andrew
>
> On Sat, Mar 3, 2012 at 2:14 PM, Ali Saidi wrote:
>
>> Hi Andrew,
>>
>> Are you sure the benchmark isn't timing dependent? Does the benchmark do any kind of self-checking (e.g. the benchmark completes, but does it come to the right answer)?
>>
>> Did you ever run the atomic cpu with a checkpoint restore? What is the instruction count in this case?
>>
>> Thanks,
>> Ali
>>
>> On Mar 2, 2012, at 10:08 PM, Andrew Cebulski wrote:
>>
>> Okay, checker built and ran perfectly as far as I can tell. Thanks!
>>
>> Here are the errors reported by the checker:
>>
>> warn: 3009947097500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
>> warn: 3015415839500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
>> warn: 3077134098000: Instruction results do not match! (Values may not actually be integers) Inst: 0x2, checker: 0
>>
>> A grep shows this coming from src/cpu/checker/cpu_impl.hh
>>
>> My benchmark ran to completion with the following results:
>>
>> Detailed CPU (checkpoint restore):
>>   system.switch_cpus_1.committedInsts = 610834324
>>   system.switch_cpus_1.committedOps (new stat) = 646803879 (this is close to what the committed instructions were before...)
>>   system.switch_cpus_1.fetch.Insts = 632688924
>>
>> What's the next step in finding the source of this error?
>>
>> Thanks,
>> Andrew
>>
>> On Fri, Mar 2, 2012 at 5:04 PM, Andrew Cebulski wrote:
>>
>>> This probably happened because I merged into rev 8877 instead of rev 8861. The patch merged fine with rev 8861, so none of my local changes conflicted. I'm building now. I'll send an update later when I'm blocked again.
>>>
>>> I actually just tried gcc 4.6.2 recently, so I experienced that swig error with ptrdiff_t. Glad to see that was fixed in rev 8861.
>>>
>>> -Andrew
>>>
>>> On Fri, Mar 2, 2012 at 3:36 PM, Andrew Cebulski wrote:
>>>
>>>> Okay, so I'm trying to build after patching this from the review board: http://reviews.m5sim.org/r/1031/
>>>>
>>>> There were a few minor merge issues with the patch, but they all seemed easily resolved. I'm merging this into gem5 revision 8884 (today).
>>>> Unfortunately, I'm getting this error:
>>>>
>>>> [ CXX] ARM/cpu/checker/cpu.cc -> .fo
>>>> build/ARM/cpu/checker/cpu.cc: In member function 'void CheckerCPU::setSystem(System*)':
>>>> build/ARM/cpu/checker/cpu.cc:106:43: error: no matching function for call to 'SimpleThread::SimpleThread(CheckerCPU* const, int, System*&, Process*, ArmISA::TLB*&, ArmISA::TLB*&)'
>>>> build/ARM/cpu/simple_thread.hh:142:5: note: candidates are: SimpleThread::SimpleThread()
>>>> build/ARM/cpu/simple_thread.hh:139:5: note: SimpleThread::SimpleThread(BaseCPU*, int, Process*, ArmISA::TLB*, ArmISA::TLB*)
>>>> build/ARM/cpu/simple_thread.hh:135:5: note: SimpleThread::SimpleThread(BaseCPU*, int, System*, ArmISA::TLB*, ArmISA::TLB*, bool)
>>>> build/ARM/cpu/simple_thread.hh:96:1: note: SimpleThread::SimpleThread(const SimpleThread&)
>>>> build/ARM/cpu/checker/cpu.cc: In member function 'Fault CheckerCPU::readMem(Addr, uint8_t*, unsigned int, unsigned int)':
>>>> build/ARM/cpu/checker/cpu.cc:156:47: error: 'masterId' was not declared in this scope
>>>> scons: *** [build/ARM/cpu/checker/cpu.fo] Error 1
>>>>
>>>> I tried patching to a repo I have with revision 8813 and received the same error. Are there some other patches from the reviewboard that I should be including?
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>> On Fri, Mar 2, 2012 at 12:37 PM, Andrew Cebulski wrote:
>>>>
>>>>> Geoff,
>>>>>
>>>>> Okay, but it looks to me like that error is correctable. I think that the m5.instantiate(checkpoint_dir) should only happen within the 'if options.checkpoint_restore != None:' statement (so it needs an extra tab). As it is in the repository, it happens regardless of whether or not you are restoring from a checkpoint. So you're essentially doing m5.instantiate(None).
>>>>>
>>>>> -Andrew
>>>>>
>>>>> On Fri, Mar 2, 2012 at 12:23 PM, Geoffrey Blake wrote:
>>>>>
>>>>>> Andrew,
>>>>>>
>>>>>> You may want to wait until the most recent patches for the checker are pushed that will allow you to just specify --checker on the command line. I forgot that the checker, as it is now in the tree, broke during a recent merge with other changes. Or, if you go to M5's reviewboard you can grab the patches for the checker and apply them.
>>>>>>
>>>>>> Geoff
>>>>>>
>>>>>> On Fri, Mar 2, 2012 at 11:17 AM, Andrew Cebulski wrote:
>>>>>> > I'm getting the following error when running this basic command with the CPU Checker enabled:
>>>>>> >
>>>>>> > build/ARM/gem5.fast configs/example/fs.py -b ArmUbuntu --cpu-type=detailed --caches
>>>>>> >
>>>>>> > Error in unproxying param 'workload' of system.cpu.checker
>>>>>> > Traceback (most recent call last):
>>>>>> >   File "<string>", line 1, in ?
>>>>>> >   File "/gem5/src/python/m5/main.py", line 361, in main
>>>>>> >     exec filecode in scope
>>>>>> >   File "configs/example/fs.py", line 215, in ?
>>>>>> >     Simulation.run(options, root, test_sys, FutureClass)
>>>>>> >   File "/gem5/configs/common/Simulation.py", line 246, in run
>>>>>> >     m5.instantiate(checkpoint_dir)
>>>>>> >   File "/gem5/src/python/m5/simulate.py", line 66, in instantiate
>>>>>> >     for obj in root.descendants(): obj.unproxyParams()
>>>>>> >   File "/gem5/src/python/m5/SimObject.py", line 851, in unproxyParams
>>>>>> >     value = value.unproxy(self)
>>>>>> >   File "/gem5/src/python/m5/params.py", line 196, in unproxy
>>>>>> >     return [v.unproxy(base) for v in self]
>>>>>> >   File "/gem5/src/python/m5/proxy.py", line 89, in unproxy
>>>>>> >     result, done = self.find(obj)
>>>>>> >   File "/gem5/src/python/m5/proxy.py", line 162, in find
>>>>>> >     val = val[m]
>>>>>> > IndexError: list index out of range
>>>>>> >
>>>>>> > Any idea why this is happening?
>>>>>> > I'm not even attempting to launch from a checkpoint here (though this exact error does occur when attempting to restore from a checkpoint now). Some notes on my environment: I'm running Python 2.4.3, SWIG 1.3.40 and GCC 4.5.3.
>>>>>> >
>>>>>> > Note that when I run atomic/timing CPUs, I get a segmentation fault. I'm assuming this is because they don't have checkers set up in the code. Let me know if otherwise.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Andrew
>>>>>> >
>>>>>> > On Thu, Mar 1, 2012 at 5:00 PM, Ali Saidi wrote:
>>>>>> >>
>>>>>> >> Hi Andrew,
>>>>>> >>
>>>>>> >> You should be able to re-compile gem5 with USE_CHECKER=1 on the command line and it will include the checker and run it when you restore to the o3 cpu.
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> Ali
>>>>>> >>
>>>>>> >> On 01.03.2012 14:02, Andrew Cebulski wrote:
>>>>>> >>
>>>>>> >> Hi Ali,
>>>>>> >>
>>>>>> >> Okay, thanks, I'll try out the checker cpu. Is this the best resource available on how to use the Checker CPU? -- http://gem5.org/Checker
>>>>>> >>
>>>>>> >> Also, my run restoring the O3 CPU from my checkpoint has the same result:
>>>>>> >>
>>>>>> >> Detailed CPU (checkpoint restore):
>>>>>> >>   system.cpu.committedInsts = 646985567
>>>>>> >>   system.cpu.fetch.Insts = 648951747
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> Andrew
>>>>>> >>
>>>>>> >> On Thu, Mar 1, 2012 at 2:40 PM, Ali Saidi wrote:
>>>>>> >>>
>>>>>> >>> Hi Andrew,
>>>>>> >>>
>>>>>> >>> The first guess is that possibly the cpu results in a different code path or different scheduler decisions which lengthen execution. Another possibility is that the O3 cpu as configured by the arm-detailed configuration has some issue.
>>>>>> >>> While this is possible, it's not incredibly likely. You could try to restore from the checkpoint and run with the checker cpu. This creates a little atomic-like cpu that sits next to the o3 core and verifies its execution, which might tell you if there is a bug in the o3 model.
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>> Ali
>>>>>> >>>
>>>>>> >>> On 01.03.2012 13:04, Andrew Cebulski wrote:
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> I'm experiencing some problems that I currently am attributing to restoring from a checkpoint, then switching to an arm_detailed CPU (O3_ARM_v7a_3). I first noticed the problem due to my committed instruction counts not lining up correctly between different CPUs for a benchmark I'm running (by roughly 170M instructions). The stats below are reset right before running the benchmark, then dumped afterwards:
>>>>>> >>>
>>>>>> >>> Atomic CPU (no checkpoint restore):
>>>>>> >>>   system.cpu.numInsts = 476085242
>>>>>> >>> Detailed CPU (no checkpoint restore):
>>>>>> >>>   system.cpu.committedInsts = 476128320
>>>>>> >>>   system.cpu.fetch.Insts = 478463491
>>>>>> >>> Arm_detailed CPU (checkpoint restore):
>>>>>> >>>   system.switch_cpus_1.committedInsts = 646468886
>>>>>> >>>   system.switch_cpus_1.fetch.Insts = 660969371
>>>>>> >>> Arm_detailed CPU (no checkpoint restore):
>>>>>> >>>   system.cpu.committedInsts = 476107801
>>>>>> >>>   system.cpu.fetch.Insts = 491814681
>>>>>> >>>
>>>>>> >>> I included both the committed and fetched instructions, to see if the problem is with fetches getting counted as committed even if they are not (i.e. insts not getting squashed).
>>>>>> >>> It does not seem like that is the case from the stats above, as the arm_detailed run without a checkpoint has roughly the same difference between fetched/committed instructions. I noticed that the switch arm_detailed cpu when restoring from a checkpoint lacks both an icache and a dcache as children, but I read in a previous post that they are connected to fetch/iew respectively, so this is probably not the issue. I assume it's just not shown explicitly in the config.ini file...
>>>>>> >>>
>>>>>> >>> I'm running a test right now to see if switching to a regular DerivO3CPU has the same issue. Regardless of its results, does anyone have any idea why I'm seeing roughly 170M more committed instructions in the arm_detailed CPU run when I restore from a checkpoint? I've attached my config file from the arm_detailed with checkpoint run for reference.
>>>>>> >>>
>>>>>> >>> Here's the run command for when I use a checkpoint:
>>>>>> >>>
>>>>>> >>> build/ARM/gem5.fast -d [dir] configs/example/fs.py -b [benchmark] -r 1 --checkpoint-dir=[chkpt-dir] --caches -s
>>>>>> >>>
>>>>>> >>> Lastly, I'm running off of revision 8813 from 2/3/12. Let me know if you need any more info (i.e. stats).
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>> Andrew
>>>>>> >>>
>>>>>> >>> _______________________________________________
>>>>>> >>> gem5-users mailing list
>>>>>> >>> [email protected]
>>>>>> >>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
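
As a quick sanity check on the counts quoted earlier in this thread, the gaps can be recomputed with a few lines of Python. This is only a sketch: the numbers are copied from the stats in the original message, while the dictionary layout and the helper names (`extra_committed`, `fetch_commit_gap`) are made up for illustration and are not part of gem5.

```python
# Instruction counts copied from the stats quoted in the thread.
# The keys and helper names below are illustrative only, not gem5 names.
runs = {
    "atomic, no restore": {"committed": 476085242},
    "detailed, no restore": {"committed": 476128320, "fetched": 478463491},
    "arm_detailed, restore": {"committed": 646468886, "fetched": 660969371},
    "arm_detailed, no restore": {"committed": 476107801, "fetched": 491814681},
}


def extra_committed(run, baseline):
    """Committed instructions in `run` beyond those in `baseline`."""
    return runs[run]["committed"] - runs[baseline]["committed"]


def fetch_commit_gap(run):
    """Fetched minus committed instructions for a run that reports both."""
    return runs[run]["fetched"] - runs[run]["committed"]


if __name__ == "__main__":
    print("extra committed after restore:",
          extra_committed("arm_detailed, restore", "arm_detailed, no restore"))
    for name, stats in runs.items():
        if "fetched" in stats:
            print(name, "fetch/commit gap:", fetch_commit_gap(name))
```

With these numbers, the restored arm_detailed run commits about 170M more instructions than the same CPU without a restore, while the fetched-minus-committed gap is similar in both runs (about 14.5M versus 15.7M). That matches the observation in the original message that fetches being miscounted as commits are unlikely to explain the difference.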
