Hi Andrew,

Could you add some code to the table walker to see how big the following are getting:

stateQueueL1.size()
stateQueueL2.size()
pendingQueue.size()
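A minimal sketch of what that instrumentation might look like, assuming the three queues are ordinary standard-library containers. Only the queue names come from this thread; `WalkerState`, `TableWalkerSketch`, and `queueSizes()` are hypothetical stand-ins, not the real walker code. In the simulator itself the resulting string would presumably go through an existing debug print rather than be returned.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>

// Hypothetical stand-in for the walker's per-request state.
struct WalkerState {
    unsigned long vaddr;
};

// Hypothetical stand-in for the table walker. Only the three queue names
// are taken from the thread; everything else is illustrative.
struct TableWalkerSketch {
    std::list<WalkerState> stateQueueL1;
    std::list<WalkerState> stateQueueL2;
    std::list<WalkerState> pendingQueue;

    // Format the current occupancy of all three queues on one line so it
    // can be emitted once per walk and plotted against the tick count.
    std::string queueSizes() const {
        std::ostringstream os;
        os << "stateQueueL1=" << stateQueueL1.size()
           << " stateQueueL2=" << stateQueueL2.size()
           << " pendingQueue=" << pendingQueue.size();
        return os.str();
    }
};
```

Printing this line whenever a walk starts or completes should show whether any of the queues grows without bound while the in-flight instruction count rises.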
Perhaps we're somehow getting into a loop where there are a lot of translations to invalid addresses that get squashed and they pile up in the table walker?

Thanks,
Ali

On May 4, 2012, at 7:53 AM, Gabriel Michael Black wrote:

> I haven't had a chance to study what's going on here, but could the problem
> be that we don't have bandwidth limits/back pressure implemented for the TLB
> and delayed translation? It could be that the CPU is pumping instructions
> into translation which eventually drain out/are squashed, and if too many
> accumulate they trip that assert.
>
> That may not actually make any sense as far as what the code is actually
> doing, but it occurred to me as a possibility and I thought I'd throw it out
> there.
>
> Gabe
>
> Quoting Andrew Cebulski <af...@drexel.edu>:
>
>> I double-checked by looking at the config.ini file. It turns out I did
>> actually create the checkpoint with an Atomic CPU without caches. Sorry
>> for the confusion.
>>
>> -Andrew
>>
>> On Wed, May 2, 2012 at 10:12 PM, Andrew Cebulski <af...@drexel.edu> wrote:
>>
>>> I started hitting this assertion (that the number of insts in flight
>>> exceeded 1500) before I started using a checkpoint. I created the checkpoint
>>> afterwards to decrease the time needed to run simulations to debug this
>>> problem. I'll create a new checkpoint, then send the new trace output.
>>>
>>> -Andrew
>>>
>>> On Wed, May 2, 2012 at 9:53 PM, Ali Saidi <sa...@umich.edu> wrote:
>>>
>>>> It's likely the cause for all of your problems. Dirty data in the caches
>>>> doesn't get restored either. You should always create checkpoints with an
>>>> atomic cpu and without caches.
>>>>
>>>> Ali
>>>>
>>>> On 02.05.2012 21:23, Andrew Cebulski wrote:
>>>>
>>>> Sorry, I created the checkpoint I referred to with an O3 CPU with caches.
>>>> From what I recall reading, caches don't get restored from checkpoints.
>>>> Since the checkpoint wasn't during the benchmark run, I assumed that was
>>>> okay.
>>>> -Andrew
>>>>
>>>> On Wed, May 2, 2012 at 9:07 PM, Ali Saidi <sa...@umich.edu> wrote:
>>>>
>>>>> You haven't answered the question about whether you created the
>>>>> checkpoints with an atomic cpu without caches.
>>>>>
>>>>> Ali
>>>>>
>>>>> On 02.05.2012 19:58, Andrew Cebulski wrote:
>>>>>
>>>>> I have not run with the checker CPU recently. Here's the stderr output
>>>>> from a run I did a while back:
>>>>> http://dl.dropbox.com/u/2953302/gem5/err.0
>>>>> Note that the instruction match error is before my benchmark actually
>>>>> starts running. The start of my boot script checks to see if my files
>>>>> image is mounted (which it is), then continues on to run the benchmark. I
>>>>> booted the system, mounted my files image, then took a checkpoint. I've
>>>>> been running all my tests from that checkpoint. I found where my
>>>>> benchmark started based on the ASID (from ExecAsid debug flag).
>>>>> I delayed the start of gathering trace data until the second-to-last
>>>>> linear increase in dynamic instructions in-flight. I'm running a new
>>>>> trace now.
>>>>> -Andrew
>>>>>
>>>>> On Wed, May 2, 2012 at 5:28 PM, Ali Saidi <sa...@umich.edu> wrote:
>>>>>
>>>>>> Something is wrong well before this point. There is no reason that
>>>>>> address 0x0 or 0x4 should be translated.
>>>>>>
>>>>>> Did you happen to create a checkpoint when caches were in the system?
>>>>>>
>>>>>> Have you tried to run with the checker cpu and see if it detects any
>>>>>> errors?
>>>>>>
>>>>>> Ali
>>>>>>
>>>>>> On 02.05.2012 17:22, Andrew Cebulski wrote:
>>>>>>
>>>>>> They are data TLB misses that occur as the in-flight instruction count
>>>>>> rises (at 0x0 and 0x4). The last TLB miss before the in-flight
>>>>>> instruction count finally linearly decreases is to 0x200.
>>>>>> Also, at the start of the rising slope, I see a miss to 0x8 and 0x2508c.
>>>>>> Here's a trace file:
>>>>>> http://dl.dropbox.com/u/2953302/gem5/tlb.out
>>>>>> To reduce size, I just have lines that have either TLB or walker in
>>>>>> them.
>>>>>> I do see only a handful of instruction TLB misses.
>>>>>> -Andrew
>>>>>>
>>>>>> On Wed, May 2, 2012 at 11:10 AM, Ali Saidi <sa...@umich.edu> wrote:
>>>>>>
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> Thanks for digging into this. I think there is an issue somewhere, but
>>>>>>> I'm still not sure where.
>>>>>>>
>>>>>>> Ali
>>>>>>>
>>>>>>> On 01.05.2012 23:34, Andrew Cebulski wrote:
>>>>>>>
>>>>>>> Okay, I'm positive now that the issue lies with delayed translations
>>>>>>> that are squashed before finishing.
>>>>>>>
>>>>>>> On the data or instruction side? You seem to allude to data in the
>>>>>>> paragraph below, but then instructions in the latter text.
>>>>>>>
>>>>>>> It seems to me like speculative load/stores are being executed,
>>>>>>> rather than waiting for the instructions to commit. Once the
>>>>>>> instructions begin getting (speculatively) executed in the TLB, a
>>>>>>> reference is left there, which seems hard to root out and dereference
>>>>>>> after the instruction ends up being squashed. At least, I have not been
>>>>>>> able to find that out in the source code as of yet. Can anyone clarify
>>>>>>> this?
>>>>>>>
>>>>>>> There should only be one translation outstanding from each
>>>>>>> instruction and data side walker. Any nested transactions should be
>>>>>>> queued in the walker. Until one finishes, I'm not sure how multiple
>>>>>>> would ever be outstanding.
>>>>>>>
>>>>>>> Recall the following image that shows how the number of dynamic
>>>>>>> instruction (DynInst) objects in-flight increases linearly for varying
>>>>>>> periods of time:
>>>>>>> http://dl.dropbox.com/u/2953302/gem5/dyninst_vs_dyninstptr_dramsim2.png
>>>>>>> After enabling the TLB debug flag, I see that the linear increase in
>>>>>>> instructions in flight is proportional to the number of TLB misses.
>>>>>>> These TLB misses have a much larger delay (resulting in translation
>>>>>>> delays) due to the fact that DramSim2 models the memory system more
>>>>>>> accurately. It seems that with the classic memory system, TLB misses
>>>>>>> often do not have translation delays. For whatever reason, it would
>>>>>>> also seem that every instruction that has a TLB miss also is eventually
>>>>>>> squashed...
>>>>>>>
>>>>>>> From a data side perspective this is reasonable. While a miss is
>>>>>>> outstanding, at some point instructions will stop committing and thus
>>>>>>> the instructions in flight will begin to rise until the miss is
>>>>>>> satisfied.
>>>>>>>
>>>>>>> Here's a summary of outputs from my trace. These two DPRINTF
>>>>>>> messages appear on the rising slopes (repeated up until the peak):
>>>>>>> TLB Miss: Starting hardware table walker for 0(656)
>>>>>>> TLB Miss: Starting hardware table walker for 0x4(656)
>>>>>>>
>>>>>>> This is interesting/odd. I don't know a good reason why a miss
>>>>>>> would be outstanding to both address 0 and address 4 at the same time.
>>>>>>> In almost all cases these pages are marked as no-access to detect
>>>>>>> segfaults. Perhaps there is an issue where the cpu is getting into a
>>>>>>> loop faulting on a bad access and then faulting again on the fault
>>>>>>> handler. I could imagine this would happen if there was some corruption
>>>>>>> in the memory system (for example the timings in dramsim exposing a bug
>>>>>>> in the cache models or something).
>>>>>>>
>>>>>>> At the peak, the following message appears (from fetch) almost every
>>>>>>> tick for (what I believe to be) every single one of the table walkers
>>>>>>> that were squashed.
>>>>>>> Fetch is waiting ITLB walk to finish!
>>>>>>>
>>>>>>> There must be another walk in flight? The instruction side will only
>>>>>>> have one fault outstanding at once. Successive branch mispredicts will
>>>>>>> re-direct fetch but there is code that catches the fact that a different
>>>>>>> walk completed than expected and "does the right thing."
>>>>>>>
>>>>>>> The problem is that these ITLB table walks are for instructions that
>>>>>>> were squashed as much as 0.3 billion cycles earlier, and have since been
>>>>>>> removed from the CPU's instruction list.
>>>>>>>
>>>>>>> I'm not following here.
>>>>>>>
>>>>>>> Any help will be greatly appreciated in solving this problem. I've
>>>>>>> hit a roadblock with getting Ruby working with ARM, most likely due to
>>>>>>> the fact that ARM has disjoint memory (x86 and Alpha do not). There's
>>>>>>> the 256 MB for physical memory, then the 64 MB for the boot loader. I
>>>>>>> brought this up in my last email about trying to get Ruby working.
>>>>>>> Therefore, I'm trying to get this DramSim2 integration fixed so I can
>>>>>>> start modeling FS with DRAM memory.
>>>>>>>
>>>>>>> Brad/Steve/Nilay, anyone have a suggestion on how to make this work?
>>>>>>>
>>>>>>> Note that these problems also occur in Soplex from the SPEC CPU2006
>>>>>>> benchmark suite (also hits 1500 in-flight instructions assertion). Due
>>>>>>> to time constraints, I haven't tested on other benchmarks.
>>>>>>> Thanks,
>>>>>>> Andrew
>>>>>>> On Tue, May 1, 2012 at 4:27 AM, Andrew Cebulski <af...@drexel.edu> wrote:
>>>>>>>
>>>>>>>> Hey Gabe,
>>>>>>>> Thanks for this...very helpful. I just recently got back into
>>>>>>>> debugging this problem.
>>>>>>>> I made a small change in src/base/refcnt.hh to
>>>>>>>> allow me to return the current count of references to a DynInst object.
>>>>>>>> I then modified existing DPRINTFs to also print out reference
>>>>>>>> counts, then added some of my own when I needed extra visibility.
>>>>>>>> I've found one memory store instruction that seems to be getting
>>>>>>>> lost. What's happening is that it progresses as far as getting
>>>>>>>> executed in the IEW once, but a delayed translation occurs, deferring
>>>>>>>> the store. By the time it reenters the IEW, the IQ has marked the
>>>>>>>> instruction as squashed. Everything progresses as usual from here on
>>>>>>>> out, with one exception. When the instruction is removed from the CPU's
>>>>>>>> instruction list, there is one reference count hanging.
>>>>>>>> I've added in some additional debugging for my traces to help
>>>>>>>> narrow down where this reference is coming from. As far as I can tell,
>>>>>>>> it's because of a call to initiateAcc() within the executeStore
>>>>>>>> function in the lsq unit. Please see the following two traces. The
>>>>>>>> first trace shows what I just discussed. The second trace is another
>>>>>>>> memory store instruction that got squashed; however, it was squashed
>>>>>>>> upon its first entry into the IEW, therefore it never started execution.
>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/lostinstruction.out
>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/similarinstruction.out
>>>>>>>> Let me know if you have any ideas based on these two instruction
>>>>>>>> traces. I do not understand how the initiateAcc function results in
>>>>>>>> another reference, but maybe someone else does.... Since I don't see
>>>>>>>> how it makes a reference, it's hard to find out how to make sure it
>>>>>>>> gets dereferenced...
>>>>>>>> Unfortunately, I haven't been able to add a DPRINTF in
>>>>>>>> src/base/refcnt.hh ...this would make things more clear (i.e.
>>>>>>>> exactly when references/dereferences occur). Let me know if you have
>>>>>>>> any advice on this...if it's possible. I can't seem to get the right
>>>>>>>> include files, and likely the right SConscript compile order...
>>>>>>>> Thanks,
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>> On Sat, Apr 7, 2012 at 9:48 PM, Gabe Black <gbl...@eecs.umich.edu> wrote:
>>>>>>>>
>>>>>>>>> Without digging into things too deeply, it looks like you may be
>>>>>>>>> leaking references to dynamic instructions. The CPU may think it's
>>>>>>>>> done with one, but until that final reference is removed, the object
>>>>>>>>> will hang around forever. I think I've had problems before where the
>>>>>>>>> reference count ended up off by one somehow and instructions would
>>>>>>>>> start piling up. It's also possible that a clog develops in O3's
>>>>>>>>> pipeline and some internal structure stops letting instructions
>>>>>>>>> through and starts accumulating them. Either of these problems will be
>>>>>>>>> annoying to track down, but with enough digging I've been able to fix
>>>>>>>>> these sorts of things.
>>>>>>>>>
>>>>>>>>> This may have more to do with O3 not handling the benchmark you're
>>>>>>>>> running well rather than a problem with your new DRAM model. There
>>>>>>>>> may be some interaction between the two, though, where the new memory
>>>>>>>>> makes the timing line up to cause O3 to behave poorly. What you can do
>>>>>>>>> is instrument dynamic instruction creation and destruction and
>>>>>>>>> reference counting (try printing "this" for both the reference
>>>>>>>>> counting wrapper and the dyn inst itself) and turn it on as close as
>>>>>>>>> you can to where things go bad tick wise. Then look for an instruction
>>>>>>>>> which gets lost, and look for where its reference count is
>>>>>>>>> incremented and decremented.
>>>>>>>>> It should be relatively easy to pair up where references are created
>>>>>>>>> and destroyed, and you should be able to identify the reference which
>>>>>>>>> never goes away. Then you need to figure out where that reference is
>>>>>>>>> being created. After that, you should have enough information to
>>>>>>>>> identify why the reference counting isn't being done correctly. It's
>>>>>>>>> arduous, but that's the only way.
>>>>>>>>>
>>>>>>>>> It's important to also make sure reference counts aren't decremented
>>>>>>>>> to zero prematurely. I had a problem once where that happened and the
>>>>>>>>> memory behind the object was updated by something that didn't know it
>>>>>>>>> was dead. The memory had since been reallocated to another object of
>>>>>>>>> the same type, so that other object reflected what happened to the
>>>>>>>>> phantom one. If I remember correctly, that manifested as something
>>>>>>>>> weird like an add causing a page fault or something.
>>>>>>>>>
>>>>>>>>> Gabe
>>>>>>>>>
>>>>>>>>> On 04/07/12 18:21, Andrew Cebulski wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> I've looked into this problem some more, and have put together a
>>>>>>>>> couple traces. I've been becoming more familiar with how gem5 handles
>>>>>>>>> dynamic instructions, in particular how it destroys them. I have two
>>>>>>>>> traces to compare, one with the physical memory, and the other with
>>>>>>>>> the integrated dramsim2 dram memory. I also have two plots showing
>>>>>>>>> instruction counts over time (sim ticks). All of these are linked at
>>>>>>>>> the end of the email.
>>>>>>>>> First, I'm going to go into what I've been able to interpret
>>>>>>>>> regarding how instructions are destroyed. In particular, comparing
>>>>>>>>> when DynInst's vs. DynInstPtr's are deconstructed/removed from the cpu.
>>>>>>>>> I separate these because I've seen a difference, as I discuss later.
>>>>>>>>> These explanations are fairly non-existent on the wiki. There is a
>>>>>>>>> section header waiting to be filled...
>>>>>>>>> From what I have been able to gather from the code, there is a list
>>>>>>>>> of all the instructions in flight in cpu/o3/cpu.cc called instList,
>>>>>>>>> with the type DynInstPtr. There are three conditions to instructions
>>>>>>>>> being cleaned from this list:
>>>>>>>>> 1.) The ROB retires its head instruction
>>>>>>>>> 2.) Fetch receives a rob squashing signal from the commit,
>>>>>>>>> resulting in removing any instruction not in the ROB
>>>>>>>>> 3.) Decode detects an incorrect branch prediction, resulting in
>>>>>>>>> removal of all instructions back to the bad seq num.
>>>>>>>>> Once all five stages have completed, the CPU cleans up all the
>>>>>>>>> removed in-flight instructions. This line in particular
>>>>>>>>> in cleanUpRemovedInsts() in cpu/o3/cpu.cc deconstructs a DynInstPtr:
>>>>>>>>> instList.erase(removeList.front());
>>>>>>>>> When I turn on the debug flag O3CPU, I see the message "Removing
>>>>>>>>> instruction, ..." (from o3/cpu.cc) with the threadNum, seqNum and
>>>>>>>>> pcState after all 5 cpu stages have completed, and one of the
>>>>>>>>> conditions above is met. I also see what tick it occurs on.
>>>>>>>>> When I turn on the DynInst debug flag, I see when instructions are
>>>>>>>>> created and destroyed (cpu/base_dyn_inst_impl.hh) and what tick. From
>>>>>>>>> analyzing the trace files, I've gathered that this takes into account
>>>>>>>>> that instructions have different execution lengths. So if at one tick
>>>>>>>>> a memory instruction in the instList (DynInstPtr) is removed, the
>>>>>>>>> destruction of the DynInst for that memory instruction will occur much
>>>>>>>>> later (i.e. 1M ticks later). I have yet to determine how this is
>>>>>>>>> implemented.
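The gap described above, where an entry leaves instList long before the underlying object is destroyed, is what reference counting would produce if something else still holds a pointer to the instruction. The following is only a sketch of that mechanism under stated assumptions: `std::shared_ptr` stands in for the reference-counting wrapper in src/base/refcnt.hh, and `InstSketch`, `liveCount`, `pendingTranslation`, and `runLeakDemo()` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <list>
#include <memory>

// Hypothetical stand-in for DynInst. liveCount plays the role of the
// cpu->instcount counter mentioned in the thread.
struct InstSketch {
    static int liveCount;
    InstSketch() { ++liveCount; }
    ~InstSketch() { --liveCount; }
    InstSketch(const InstSketch &) = delete;
    InstSketch &operator=(const InstSketch &) = delete;
};
int InstSketch::liveCount = 0;

// Assumption: std::shared_ptr models the reference-counted pointer type.
using InstPtrSketch = std::shared_ptr<InstSketch>;

struct LeakDemoResult {
    int afterCreate;   // live objects once the instruction exists
    int afterErase;    // live objects after erasing it from the list
    int afterRelease;  // live objects after the last holder lets go
};

LeakDemoResult runLeakDemo() {
    std::list<InstPtrSketch> instList;
    instList.push_back(std::make_shared<InstSketch>());

    // A second holder, e.g. a delayed translation that has not finished.
    InstPtrSketch pendingTranslation = instList.front();

    LeakDemoResult r;
    r.afterCreate = InstSketch::liveCount;

    // Erasing the list entry drops one reference but does not destroy
    // the object, because pendingTranslation still refers to it.
    instList.erase(instList.begin());
    r.afterErase = InstSketch::liveCount;

    // Only when the last reference goes away is the object destroyed.
    pendingTranslation.reset();
    r.afterRelease = InstSketch::liveCount;
    return r;
}
```

If the second holder never releases its reference, the live count keeps growing even though the list stays bounded, which is one possible reading of the diverging curves in the plots.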
>>>>>>>>> Now for the problem.
>>>>>>>>> What I'm seeing when I run dramsim2 dram memory is a significant
>>>>>>>>> difference between the size of the instList (of DynInstPtr
>>>>>>>>> objects), and the size of dynamic instruction count (of DynInst
>>>>>>>>> objects). The benchmark I'm running is libquantum from SPEC 2006.
>>>>>>>>> For the first roughly 130B ticks, the dynamic instruction count kept
>>>>>>>>> in cpu/base_dyn_inst_impl.hh shadows the instList size in o3/cpu.cc
>>>>>>>>> (figure linked below) very closely. Around tick 130B after libquantum
>>>>>>>>> started, it starts hitting what I'm assuming are loops (therefore
>>>>>>>>> branch prediction), resulting in some behavior that seems to imply
>>>>>>>>> improper instruction handling (i.e. more instructions in flight than
>>>>>>>>> allowed by ROB).
>>>>>>>>> I wasn't able to sync up the physical and dramsim2 traces exactly by
>>>>>>>>> trace, but they should represent roughly the same area of execution.
>>>>>>>>> They don't execute the same due to dramsim2 modeling the memory
>>>>>>>>> differently (i.e. latency and other delays).
>>>>>>>>> I've shared both traces on my public Dropbox here:
>>>>>>>>>
>>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/physical-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU.out.gz
>>>>>>>>>
>>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz
>>>>>>>>> Here are a couple plots of tick versus instruction count, with
>>>>>>>>> respect to cpu->instcount in cpu/base_dyn_inst_impl.hh and
>>>>>>>>> instList.size() in cpu/o3/cpu.cc:
>>>>>>>>>
>>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/dyninst_vs_dyninstptr_physical.png
>>>>>>>>>
>>>>>>>>> http://dl.dropbox.com/u/2953302/gem5/dyninst_vs_dyninstptr_dramsim2.png
>>>>>>>>> Note that I added the printout of the instList size to an existing
>>>>>>>>> O3CPU DPRINTF in cleanUpRemovedInsts() in cpu/o3/cpu.cc.
>>>>>>>>> Here are the commands I ran to parse the traces into data files to
>>>>>>>>> analyze in MATLAB and create the plots:
>>>>>>>>> zgrep DynInst dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz | grep destroyed | awk '{print $1,$11}' > cpuinstcount.out
>>>>>>>>> zgrep instList dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz | awk '{print $1,$11}' > instlistsize.out
>>>>>>>>> It seems to me like the problem might lie in gem5, but has just been
>>>>>>>>> exposed by integrating this more detailed memory model, dramsim2, into
>>>>>>>>> gem5. Either that, or there are some timing errors in how dramsim2
>>>>>>>>> was integrated. I doubt this, however, since those first 190B ticks
>>>>>>>>> were also executed with the dramsim2 memory. I believe the problem is
>>>>>>>>> a combination of memory instructions + complex loops (branch
>>>>>>>>> prediction), resulting in improper destroying of instructions.
>>>>>>>>> I've included the ROB, Commit, Fetch, DynInst and O3CPU debug flags.
>>>>>>>>> There are 192 ROB entries, which is why the instList size generally
>>>>>>>>> has a max of about 192 instructions. The dynamic instruction counts
>>>>>>>>> (seen in the dramsim2 plot) seem to also imply that instructions are
>>>>>>>>> incorrectly being removed from the ROB, and then from the cpu's
>>>>>>>>> instruction list in cpu.cc, which allows more and more instructions
>>>>>>>>> to be added to the system (possibly from a bad branch).
>>>>>>>>> I appreciate any help in debugging this and further figuring out the
>>>>>>>>> root problem; just let me know if you need anything else from me. I
>>>>>>>>> don't have much more time at the moment to debug, but I can take any
>>>>>>>>> advice for quick changes and/or additional traces, then send the
>>>>>>>>> results back to the list for discussion.
>>>>>>>>> Thanks,
>>>>>>>>> Andrew
>>>>>>>>> P.S. Paul - I did try decreasing the size of the dramsim2
>>>>>>>>> transaction (and even command) queue from 512 to 32. The same
>>>>>>>>> instructions problem occurred. It basically just decreased the
>>>>>>>>> execution time.
>>>>>>>>>
>>>>>>>>> On Wed, Mar 14, 2012 at 2:10 PM, Ali Saidi <sa...@umich.edu> wrote:
>>>>>>>>>
>>>>>>>>>> The error is that there are more than 1500 instructions currently
>>>>>>>>>> in flight in the system. It could mean several things:
>>>>>>>>>>
>>>>>>>>>> 1. The value is somewhat arbitrarily defined and maybe there are
>>>>>>>>>> more than 1500 in your system at one time?
>>>>>>>>>>
>>>>>>>>>> 2. Instructions aren't being destroyed correctly
>>>>>>>>>>
>>>>>>>>>> You could try to run a debug binary so you'll get a list of
>>>>>>>>>> instructions when it happens or increase the number which may
>>>>>>>>>> be appropriate for certain situations (but 1500 is quite a few
>>>>>>>>>> inflight instructions).
>>>>>>>>>>
>>>>>>>>>> Ali
>>>>>>>>>>
>>>>>>>>>> On 13.03.2012 10:56, Andrew Cebulski wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Xiangyu,
>>>>>>>>>> I just started looking into this some more. So at first I
>>>>>>>>>> thought it was due to updating to a more recent revision, but then I
>>>>>>>>>> went back to revision 8643, added your patch, built and ran....and
>>>>>>>>>> now get the error with it too (when running ARM_FS/gem5.opt). I'm
>>>>>>>>>> testing now to see if an update to SWIG might have resulted in this
>>>>>>>>>> error; maybe someone on the mailing list would know if that's
>>>>>>>>>> possible. The difference is 1.3.40 vs. 2.0.3, both of which are
>>>>>>>>>> supported according to the dependencies wiki page.
>>>>>>>>>> Just for completeness, here's the error from revision 8643:
>>>>>>>>>> build/ARM_FS/cpu/base_dyn_inst_impl.hh:149: void
>>>>>>>>>> BaseDynInst::initVars() [with Impl = O3CPUImpl]: Assertion
>>>>>>>>>> `cpu->instcount
>>>>>>>>>> I have not tried running with gem5.debug, so I will be doing
>>>>>>>>>> that today. Maybe this is an assertion that is occurring due to an
>>>>>>>>>> optimization. That would mean it wouldn't be triggered in
>>>>>>>>>> gem5.debug since it runs without optimizations. Have you tested all
>>>>>>>>>> of debug, opt and fast with your tests?
>>>>>>>>>> Thanks,
>>>>>>>>>> Andrew
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 13, 2012 at 1:37 PM, Rio Xiangyu Dong <riosher...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>>
>>>>>>>>>>> I didn't see this error in my simulations. May I ask which gem5
>>>>>>>>>>> version you are using? I find some of the latest code updates do
>>>>>>>>>>> not comply with my changes. I am still using the DRAMsim2 patch on
>>>>>>>>>>> Gem5 repo8643, and have run all the runnable benchmarks in SPEC2006,
>>>>>>>>>>> SPEC2000, EEMBC2, and PARSEC2 on ARM_SE.
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Xiangyu
>>>>>>>>>>>
>>>>>>>>>>> *From:* Andrew Cebulski [mailto:af...@drexel.edu]
>>>>>>>>>>> *Sent:* Thursday, March 08, 2012 6:52 PM
>>>>>>>>>>> *To:* gem5 users mailing list
>>>>>>>>>>> *Cc:* riosher...@gmail.com; sa...@umich.edu
>>>>>>>>>>> *Subject:* Re: [gem5-users] A Patch for DRAMsim2 Integration
>>>>>>>>>>>
>>>>>>>>>>> Xiangyu,
>>>>>>>>>>>
>>>>>>>>>>> I've been having an issue recently with the number of
>>>>>>>>>>> instructions I've been seeing committed to the CPU (I have a
>>>>>>>>>>> separate thread on this).
>>>>>>>>>>> It turns out the issue seems to be coming from this patch
>>>>>>>>>>> you created to integrate DramSim2 with Gem5. Unfortunately, I've
>>>>>>>>>>> been running with gem5.fast, not gem5.opt. So up until now, I
>>>>>>>>>>> haven't been seeing assertions. I thought I'd run it with gem5.opt
>>>>>>>>>>> or debug back in December, but I must not have. My runs on the ARM
>>>>>>>>>>> O3 cpu fail with this assertion:
>>>>>>>>>>>
>>>>>>>>>>> build/ARM/cpu/base_dyn_inst_impl.hh:149: void
>>>>>>>>>>> BaseDynInst::initVars() [with Impl = O3CPUImpl]: Assertion
>>>>>>>>>>> `cpu->instcount
>>>>>>>>>>>
>>>>>>>>>>> -Andrew
>>>>>>>>>>>
>>>>>>>>>>> Date: Sun, 18 Dec 2011 01:48:58 -0800
>>>>>>>>>>> From: "Dong, Xiangyu" <riosher...@gmail.com>
>>>>>>>>>>> To: "gem5 users mailing list" <gem5-users@gem5.org>
>>>>>>>>>>> Subject: [gem5-users] A Patch for DRAMsim2 Integration
>>>>>>>>>>> Message-ID: gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> Content-Type: text/plain; charset="us-ascii"
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I have a Gem5+DRAMsim2 patch. I've tested it under both SE and FS
>>>>>>>>>>> modes. I'm willing to share it here.
>>>>>>>>>>>
>>>>>>>>>>> For those who have such needs, please go to my website
>>>>>>>>>>> www.cse.psu.edu/~xydong <http://www.cse.psu.edu/%7Exydong> to
>>>>>>>>>>> download the patch and test it. To enable DRAMSim2, use the
>>>>>>>>>>> se_dramsim2.py script instead of se.py (for FS, you can create one
>>>>>>>>>>> yourself). The basic idea to enable the DRAMsim2 module is to
>>>>>>>>>>> use the derived DRAMMemory class instead of the PhysicalMemory
>>>>>>>>>>> class.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if there are bugs.
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Xiangyu Dong
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users