Hi Ali,

The benchmark is libquantum from SPEC CPU2006. The results are printed to system.terminal, so I am able to verify them. In all cases it passes with the exact same output. Note that when I don't restore from a checkpoint, the committed instructions for the O3 CPU are roughly the same as for atomic (within 100,000 instructions).
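A minimal sketch of that comparison, using the committed-instruction counts quoted further down in this thread. The labels and the helper name `deltas_vs_baseline` are made up for illustration and are not part of gem5.

```python
# Committed-instruction counts copied from the stats quoted in this thread.
# The dictionary keys are labels chosen for this sketch, not gem5 stat names.
COMMITTED = {
    "atomic, no restore": 476085242,
    "detailed (O3), no restore": 476128320,
    "arm_detailed, no restore": 476107801,
    "arm_detailed, checkpoint restore": 646468886,
}


def deltas_vs_baseline(counts, baseline_key):
    """Return each run's committed-instruction difference from the baseline run."""
    base = counts[baseline_key]
    return {name: value - base for name, value in counts.items()}


DELTAS = deltas_vs_baseline(COMMITTED, "atomic, no restore")

for name, delta in DELTAS.items():
    print(f"{name:35s} {delta:+12,d}")
```

With these numbers, the runs without a checkpoint restore stay within 100,000 instructions of the atomic run, while the arm_detailed run restored from a checkpoint is about 170M instructions higher.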

Yes, I did actually run the atomic CPU with a checkpoint restore. It resulted in 476085242 committed instructions, exactly the same as without launching from a checkpoint.

I'll work on getting you results from another benchmark. In the meantime, let me know if you have any other ideas.

Thanks,
Andrew

On Sat, Mar 3, 2012 at 2:14 PM, Ali Saidi <[email protected]> wrote:

> Hi Andrew,
>
> Are you sure the benchmark isn't timing dependent? Does the benchmark do any kind of self-checking (e.g. the benchmark completes, but does it come to the right answer)?
>
> Did you ever run the atomic cpu with a checkpoint restore? What is the instruction count in this case?
>
> Thanks,
> Ali
>
> On Mar 2, 2012, at 10:08 PM, Andrew Cebulski wrote:
>
> Okay, checker built and ran perfectly as far as I can tell. Thanks!
>
> Here are the errors reported by the checker:
>
> warn: 3009947097500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
> warn: 3015415839500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
> warn: 3077134098000: Instruction results do not match! (Values may not actually be integers) Inst: 0x2, checker: 0
>
> A grep shows this coming from src/cpu/checker/cpu_impl.hh
>
> My benchmark ran to completion with the following results:
>
> Detailed CPU (checkpoint restore) : system.switch_cpus_1.committedInsts = 610834324
>
> system.switch_cpus_1.committedOps (new stat) = 646803879 (this is close to what the committed instructions were before...)
>
> system.switch_cpus_1.fetch.Insts = 632688924
>
> What's the next step in finding the source of this error?
>
> Thanks,
> Andrew
>
> On Fri, Mar 2, 2012 at 5:04 PM, Andrew Cebulski <[email protected]> wrote:
>
>> This probably happened because I merged into rev 8877 instead of rev 8861. The patch merged fine with rev 8861, so none of my local changes conflicted. I'm building now.
>> I'll send an update later when I'm blocked again.
>>
>> I actually just tried gcc 4.6.2 recently, so I experienced that swig error with ptrdiff_t. Glad to see that was fixed in rev 8861.
>>
>> -Andrew
>>
>> On Fri, Mar 2, 2012 at 3:36 PM, Andrew Cebulski <[email protected]> wrote:
>>
>>> Okay, so I'm trying to build after patching this from the review board:
>>> http://reviews.m5sim.org/r/1031/
>>>
>>> There were a few minor merge issues with the patch, but they all seemed easily resolved. I'm merging this into gem5 revision 8884 (today). Unfortunately, I'm getting this error:
>>>
>>> [ CXX] ARM/cpu/checker/cpu.cc -> .fo
>>> build/ARM/cpu/checker/cpu.cc: In member function 'void CheckerCPU::setSystem(System*)':
>>> build/ARM/cpu/checker/cpu.cc:106:43: error: no matching function for call to 'SimpleThread::SimpleThread(CheckerCPU* const, int, System*&, Process*, ArmISA::TLB*&, ArmISA::TLB*&)'
>>> build/ARM/cpu/simple_thread.hh:142:5: note: candidates are: SimpleThread::SimpleThread()
>>> build/ARM/cpu/simple_thread.hh:139:5: note: SimpleThread::SimpleThread(BaseCPU*, int, Process*, ArmISA::TLB*, ArmISA::TLB*)
>>> build/ARM/cpu/simple_thread.hh:135:5: note: SimpleThread::SimpleThread(BaseCPU*, int, System*, ArmISA::TLB*, ArmISA::TLB*, bool)
>>> build/ARM/cpu/simple_thread.hh:96:1: note: SimpleThread::SimpleThread(const SimpleThread&)
>>> build/ARM/cpu/checker/cpu.cc: In member function 'Fault CheckerCPU::readMem(Addr, uint8_t*, unsigned int, unsigned int)':
>>> build/ARM/cpu/checker/cpu.cc:156:47: error: 'masterId' was not declared in this scope
>>> scons: *** [build/ARM/cpu/checker/cpu.fo] Error 1
>>>
>>> I tried patching to a repo I have with revision 8813 and received the same error. Are there some other patches from the reviewboard that I should be including?
>>>
>>> Thanks,
>>> Andrew
>>>
>>> On Fri, Mar 2, 2012 at 12:37 PM, Andrew Cebulski <[email protected]> wrote:
>>>
>>>> Geoff,
>>>>
>>>> Okay, but it looks to me like that error is correctable. I think that the m5.instantiate(checkpoint_dir) should only happen within the 'if options.checkpoint_restore != None:' statement (so it needs an extra tab). As it is in the repository, it happens regardless of whether or not you are restoring from a checkpoint. So you're essentially doing m5.instantiate(None).
>>>>
>>>> -Andrew
>>>>
>>>> On Fri, Mar 2, 2012 at 12:23 PM, Geoffrey Blake <[email protected]> wrote:
>>>>
>>>>> Andrew,
>>>>>
>>>>> You may want to wait until the most recent patches for the checker are pushed that will allow you to just specify --checker on the command line. I forgot that the checker as it is now in the tree had broken during a recent merge with other changes. Or, if you go to M5's reviewboard you can grab the patches for the checker and apply them.
>>>>>
>>>>> Geoff
>>>>>
>>>>> On Fri, Mar 2, 2012 at 11:17 AM, Andrew Cebulski <[email protected]> wrote:
>>>>> > I'm getting the following error when running this basic command with the CPU Checker enabled:
>>>>> >
>>>>> > build/ARM/gem5.fast configs/example/fs.py -b ArmUbuntu --cpu-type=detailed --caches
>>>>> >
>>>>> > Error in unproxying param 'workload' of system.cpu.checker
>>>>> > Traceback (most recent call last):
>>>>> >   File "<string>", line 1, in ?
>>>>> >   File "/gem5/src/python/m5/main.py", line 361, in main
>>>>> >     exec filecode in scope
>>>>> >   File "configs/example/fs.py", line 215, in ?
>>>>> >     Simulation.run(options, root, test_sys, FutureClass)
>>>>> >   File "/gem5/configs/common/Simulation.py", line 246, in run
>>>>> >     m5.instantiate(checkpoint_dir)
>>>>> >   File "/gem5/src/python/m5/simulate.py", line 66, in instantiate
>>>>> >     for obj in root.descendants(): obj.unproxyParams()
>>>>> >   File "/gem5/src/python/m5/SimObject.py", line 851, in unproxyParams
>>>>> >     value = value.unproxy(self)
>>>>> >   File "/gem5/src/python/m5/params.py", line 196, in unproxy
>>>>> >     return [v.unproxy(base) for v in self]
>>>>> >   File "/gem5/src/python/m5/proxy.py", line 89, in unproxy
>>>>> >     result, done = self.find(obj)
>>>>> >   File "/gem5/src/python/m5/proxy.py", line 162, in find
>>>>> >     val = val[m]
>>>>> > IndexError: list index out of range
>>>>> >
>>>>> > Any idea why this is happening? I'm not even attempting to launch from a checkpoint here (though this exact error does occur when attempting to restore from a checkpoint now). Some notes on my environment: I'm running Python 2.4.3, SWIG 1.3.40 and GCC 4.5.3.
>>>>> >
>>>>> > Note that when I run atomic/timing CPUs, I get a segmentation fault. I'm assuming this is because they don't have checkers set up in the code. Let me know if otherwise.
>>>>> >
>>>>> > Thanks,
>>>>> > Andrew
>>>>> >
>>>>> > On Thu, Mar 1, 2012 at 5:00 PM, Ali Saidi <[email protected]> wrote:
>>>>> >>
>>>>> >> Hi Andrew,
>>>>> >>
>>>>> >> You should be able to re-compile gem5 with USE_CHECKER=1 on the command line and it will include the checker and run it when you restore to the o3 cpu.
>>>>> >>
>>>>> >> Thanks,
>>>>> >>
>>>>> >> Ali
>>>>> >>
>>>>> >> On 01.03.2012 14:02, Andrew Cebulski wrote:
>>>>> >>
>>>>> >> Hi Ali,
>>>>> >>
>>>>> >> Okay, thanks, I'll try out the checker cpu. Is this the best resource available on how to use the Checker CPU?
>>>>> >> -- http://gem5.org/Checker
>>>>> >>
>>>>> >> Also, my run restoring the O3 CPU from my checkpoint has the same result:
>>>>> >>
>>>>> >> Detailed CPU (checkpoint restore) : system.cpu.committedInsts = 646985567
>>>>> >>
>>>>> >> system.cpu.fetch.Insts = 648951747
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Andrew
>>>>> >>
>>>>> >> On Thu, Mar 1, 2012 at 2:40 PM, Ali Saidi <[email protected]> wrote:
>>>>> >>>
>>>>> >>> Hi Andrew,
>>>>> >>>
>>>>> >>> The first guess is that possibly the cpu results in a different code path or different scheduler decisions which lengthen execution. Another possibility is that the O3 cpu as configured by the arm-detailed configuration has some issue. While this is possible, it's not incredibly likely. You could try to restore from the checkpoint and run with the checker cpu. This creates a little atomic-like cpu that sits next to the o3 core and verifies its execution, which might tell you if there is a bug in the o3 model.
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>>
>>>>> >>> Ali
>>>>> >>>
>>>>> >>> On 01.03.2012 13:04, Andrew Cebulski wrote:
>>>>> >>>
>>>>> >>> Hi,
>>>>> >>>
>>>>> >>> I'm experiencing some problems that I currently am attributing to restoring from a checkpoint, then switching to an arm_detailed CPU (O3_ARM_v7a_3). I first noticed the problem due to my committed instruction counts not lining up correctly between different CPUs for a benchmark I'm running (by roughly 170M instructions).
>>>>> >>> The stats below are reset right before running the benchmark, then dumped afterwards:
>>>>> >>>
>>>>> >>> Atomic CPU (no checkpoint restore): system.cpu.numInsts = 476085242
>>>>> >>>
>>>>> >>> Detailed CPU (no checkpoint restore): system.cpu.committedInsts = 476128320
>>>>> >>> system.cpu.fetch.Insts = 478463491
>>>>> >>>
>>>>> >>> Arm_detailed CPU (checkpoint restore): system.switch_cpus_1.committedInsts = 646468886
>>>>> >>> system.switch_cpus_1.fetch.Insts = 660969371
>>>>> >>>
>>>>> >>> Arm_detailed CPU (no checkpoint restore): system.cpu.committedInsts = 476107801
>>>>> >>> system.cpu.fetch.Insts = 491814681
>>>>> >>>
>>>>> >>> I included both the committed and fetched instructions, to see if the problem is with fetches getting counted as committed even if they are not (i.e. insts not getting squashed). It does not seem like that is the case from the stats above, as the arm_detailed run without a checkpoint has roughly the same difference between fetched and committed instructions. I noticed that the switch arm_detailed cpu when restoring from a checkpoint lacks both an icache and a dcache as children, but I read in a previous post that they are connected to fetch/iew respectively, so this is probably not the issue. I assume it's just not shown explicitly in the config.ini file...
>>>>> >>>
>>>>> >>> I'm running a test right now to see if switching to a regular DerivO3CPU has the same issue. Regardless of its results, does anyone have any idea why I'm seeing roughly 170M more committed instructions in the arm_detailed CPU run when I restore from a checkpoint? I've attached my config file from the arm_detailed with checkpoint run for reference.
>>>>> >>>
>>>>> >>> Here's the run command for when I use a checkpoint:
>>>>> >>>
>>>>> >>> build/ARM/gem5.fast -d [dir] configs/example/fs.py -b [benchmark] -r 1 --checkpoint-dir=[chkpt-dir] --caches -s
>>>>> >>>
>>>>> >>> Lastly, I'm running off of revision 8813 from 2/3/12. Let me know if you need any more info (i.e. stats).
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> Andrew
>>>>> >>>
>>>>> >>> _______________________________________________
>>>>> >>> gem5-users mailing list
>>>>> >>> [email protected]
>>>>> >>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
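One way to look at the checker warnings quoted in this thread is to pull out the tick and the two reported values, then XOR them to see which bits differ. The following is a minimal sketch: the regular expression is inferred only from the three warning lines above, and `parse_mismatches` is a hypothetical helper, not a gem5 tool.

```python
import re

# Checker warnings copied from this thread. The regular expression below is an
# assumption based on these three lines only, not on the gem5 source.
LOG = """\
warn: 3009947097500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
warn: 3015415839500: Instruction results do not match! (Values may not actually be integers) Inst: 0x2281c, checker: 0x281c
warn: 3077134098000: Instruction results do not match! (Values may not actually be integers) Inst: 0x2, checker: 0
"""

PATTERN = re.compile(
    r"warn: (?P<tick>\d+): Instruction results do not match!.*"
    r"Inst: (?P<inst>\w+), checker: (?P<checker>\w+)"
)


def parse_mismatches(text):
    """Return (tick, inst_value, checker_value, xor) for each mismatch warning."""
    rows = []
    for match in PATTERN.finditer(text):
        # int(..., 0) accepts both "0x2281c" and a plain "0".
        inst = int(match["inst"], 0)
        checker = int(match["checker"], 0)
        rows.append((int(match["tick"]), inst, checker, inst ^ checker))
    return rows


MISMATCHES = parse_mismatches(LOG)

for tick, inst, checker, diff in MISMATCHES:
    print(f"tick {tick}: inst={inst:#x} checker={checker:#x} differing bits={diff:#x}")
```

For the first two warnings the two values differ only in bit 17 (0x20000), and for the third only in bit 1 (0x2). The warning text itself notes that the values may not actually be integers, so this is only a starting point for narrowing down the mismatch.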
