Hi Tao,

That would be great, thanks!  Have you tested the integration with ARM?
For a while now, the status matrix on the gem5 wiki has shown that Ruby
"might work" with ARM (last updated at the beginning of March).

-Andrew

On Sat, Apr 7, 2012 at 9:35 PM, Tao Zhang <tao.zhang.0...@gmail.com> wrote:

> Hi Andrew,
>
> I just finished the integration of DRAMSim2 with Ruby. I think Xiangyu's
> patch only works with classic memory, so as a workaround I can share the
> code with you if you'd like to use Ruby.
>
> Tao Zhang
> Department of CSE
> Penn State University
> (from my iphone4)
>
> On Apr 7, 2012, at 9:21 PM, Andrew Cebulski <af...@drexel.edu> wrote:
>
> Hi all,
>
> I've looked into this problem some more, and have put together a couple
> traces.  I've been becoming more familiar with how gem5 handles dynamic
> instructions, in particular how it destroys them.  I have two traces to
> compare, one with the physical memory, and the other with the integrated
> dramsim2 dram memory.  I also have two plots showing instruction counts
> over time (sim ticks).  All of these are linked at the end of the email.
>
> First, I'm going to go into what I've been able to interpret regarding how
> instructions are destroyed, in particular when DynInsts vs. DynInstPtrs are
> destroyed/removed from the CPU.  I separate these because I've seen a
> difference, as I discuss later.  Explanations of this are fairly
> non-existent on the wiki.  There is a section header waiting to be
> filled...
>
> From what I have been able to gather from the code, there is a list of all
> the instructions in flight in cpu/o3/cpu.cc called instList, with element
> type DynInstPtr.  There are three conditions under which instructions are
> cleaned from this list:
>
> 1.)  The ROB retires its head instruction.
> 2.)  Fetch receives a ROB squashing signal from commit, resulting in the
> removal of any instruction not in the ROB.
> 3.)  Decode detects an incorrect branch prediction, resulting in the removal
> of all instructions back to the bad seq num.
>
> Once all five stages have completed, the CPU cleans up all the removed
> in-flight instructions.  This line in particular, in cleanUpRemovedInsts()
> in cpu/o3/cpu.cc, destroys a DynInstPtr:
>
> instList.erase(removeList.front());
>
> When I turn on the debug flag O3CPU, I see the message "Removing
> instruction, ..." (from o3/cpu.cc) with the threadNum, seqNum and pcState
> after all 5 cpu stages have completed, and one of the conditions above is
> met.  I also see what tick it occurs on.
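
The three removal conditions and the once-per-cycle cleanup described above
can be sketched as a toy model.  This is a minimal illustration in Python,
not gem5's C++ code: the class and method names (ToyCPU, retire_head,
squash_not_in_rob, squash_younger_than) are hypothetical, and "back to the
bad seq num" is assumed to mean "every instruction younger than it".

```python
from collections import deque


class ToyCPU:
    """Toy in-flight instruction list with deferred removal (hypothetical names)."""

    def __init__(self):
        self.inst_list = []         # sequence numbers, oldest first
        self.remove_list = deque()  # sequence numbers marked for removal

    def fetch(self, seq_num):
        self.inst_list.append(seq_num)

    def _mark(self, seq_num):
        # Mark once; the entry stays in inst_list until the cleanup step.
        if seq_num not in self.remove_list:
            self.remove_list.append(seq_num)

    def retire_head(self):
        # Condition 1: the ROB retires its head (oldest unmarked) instruction.
        for seq_num in self.inst_list:
            if seq_num not in self.remove_list:
                self._mark(seq_num)
                break

    def squash_not_in_rob(self, rob_seq_nums):
        # Condition 2: drop every in-flight instruction that is not in the ROB.
        for seq_num in self.inst_list:
            if seq_num not in rob_seq_nums:
                self._mark(seq_num)

    def squash_younger_than(self, bad_seq_num):
        # Condition 3 (assumed meaning): drop everything younger than the
        # mispredicted branch's sequence number.
        for seq_num in self.inst_list:
            if seq_num > bad_seq_num:
                self._mark(seq_num)

    def clean_up_removed_insts(self):
        # Called once all stages have run: erase the marked entries.
        while self.remove_list:
            self.inst_list.remove(self.remove_list.popleft())
```

The point of the two-step design is that a stage can mark an instruction
without invalidating the list while other stages are still walking it; the
list only shrinks in the cleanup step.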
>
> When I turn on the DynInst debug flag, I see when instructions are created
> and destroyed (cpu/base_dyn_inst_impl.hh) and at what tick.  From analyzing
> the trace files, I've gathered that this takes into account that
> instructions have different execution lengths.  So if a memory instruction's
> entry in the instList (a DynInstPtr) is removed at one tick, the DynInst for
> that memory instruction may be destroyed much later (e.g. 1M ticks later).
> I have yet to determine how this is implemented.
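
One plausible mechanism for the gap described above, offered as an
assumption rather than something confirmed in this thread, is reference
counting: if DynInstPtr is a reference-counting smart pointer, erasing the
instList entry only drops one reference, and the DynInst itself is destroyed
when the last holder lets go.  A minimal sketch with hypothetical names
(ToyDynInst, ToyDynInstPtr):

```python
class ToyDynInst:
    """Stand-in for a dynamic instruction; live_count plays the role of instcount."""

    live_count = 0

    def __init__(self, seq_num):
        self.seq_num = seq_num
        self.ref_count = 0
        ToyDynInst.live_count += 1

    def destroy(self):
        ToyDynInst.live_count -= 1


class ToyDynInstPtr:
    """Stand-in for a reference-counting pointer to a ToyDynInst."""

    def __init__(self, inst):
        self.inst = inst
        inst.ref_count += 1

    def release(self):
        # Dropping a reference destroys the instruction only if it was the last.
        if self.inst is None:
            return
        self.inst.ref_count -= 1
        if self.inst.ref_count == 0:
            self.inst.destroy()
        self.inst = None


# Usage: a load referenced by the in-flight list and by one other holder
# (for example, an outstanding memory request).
load = ToyDynInst(42)
in_list = ToyDynInstPtr(load)
in_memory_request = ToyDynInstPtr(load)
in_list.release()            # removed from the list; the object is still alive
in_memory_request.release()  # last reference dropped; now it is destroyed
```

Under this assumption, a slow memory system keeps references alive longer,
so the live-object count can run ahead of the list size.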
>
> Now for the problem.
>
> What I'm seeing when I run with the dramsim2 DRAM memory is a significant
> difference between the size of the instList (of DynInstPtr objects) and the
> dynamic instruction count (of DynInst objects).  The benchmark I'm running
> is libquantum from SPEC 2006.  For roughly the first 130B ticks, the dynamic
> instruction count kept in cpu/base_dyn_inst_impl.hh shadows the instList
> size in o3/cpu.cc (figure linked below) very closely.  Around 130B ticks
> after libquantum starts, it begins hitting what I'm assuming are loops
> (and therefore branch prediction), resulting in behavior that seems to
> imply improper instruction handling (i.e. more instructions in flight than
> the ROB allows).
>
> I wasn't able to sync up the physical and dramsim2 traces exactly, but
> they should represent roughly the same area of execution.  They don't
> execute identically because dramsim2 models the memory differently (i.e.
> latency and other delays).
>
> I've shared both traces on my public Dropbox here --
>
> http://dl.dropbox.com/u/2953302/gem5/physical-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU.out.gz
>
> http://dl.dropbox.com/u/2953302/gem5/dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz
>
> Here are a couple of plots of tick versus instruction count, with respect
> to cpu->instcount in cpu/base_dyn_inst_impl.hh and instList.size() in
> cpu/o3/cpu.cc --
> http://dl.dropbox.com/u/2953302/gem5/dyninst_vs_dyninstptr_physical.png
> http://dl.dropbox.com/u/2953302/gem5/dyninst_vs_dyninstptr_dramsim2.png
>
> Note that I added the printout of the instList size to an existing O3CPU
> DPRINTF in cleanUpRemovedInsts() in cpu/o3/cpu.cc.
>
> Here are the commands I ran to parse the traces into data files to analyze
> in MATLAB and create the plots:
> zgrep DynInst dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz \
>   | grep destroyed | awk '{print $1,$11}' > cpuinstcount.out
> zgrep instList dramsim2-fs-040612-ROB-Commit-DynInst-Fetch-O3CPU-2.out.gz \
>   | awk '{print $1,$11}' > instlistsize.out
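
For anyone who would rather stay out of the shell, the same extraction can
be sketched in Python.  This is a rough equivalent of the zgrep/grep/awk
pipelines above, not a tested replacement: the function names are
hypothetical, plain substring matching stands in for grep, and lines with
fewer than 11 fields are skipped rather than printed with an empty field.

```python
import gzip


def extract_fields(trace_path, required, columns=(1, 11)):
    """Yield the chosen 1-based whitespace-separated columns (awk's $N) from
    each line of a gzipped trace that contains every substring in `required`."""
    with gzip.open(trace_path, "rt", errors="replace") as trace:
        for line in trace:
            if not all(word in line for word in required):
                continue
            fields = line.split()
            if len(fields) < max(columns):
                continue  # too short to have the requested column
            yield tuple(fields[c - 1] for c in columns)


def write_pairs(rows, out_path):
    """Write one space-separated row per line, like awk's print $1,$11."""
    with open(out_path, "w") as out:
        for row in rows:
            out.write(" ".join(row) + "\n")
```

With these helpers, something like write_pairs(extract_fields(trace,
("DynInst", "destroyed")), "cpuinstcount.out") would correspond to the first
pipeline, and required=("instList",) to the second.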
>
> It seems to me like the problem might lie in gem5, but has just been
> exposed by integrating this more detailed memory model, dramsim2, into
> gem5.  Either that, or there are some timing errors in how dramsim2 was
> integrated.  I doubt this, however, since those first 190B ticks executed
> used the dramsim2 memory.  I believe the problem is a combination of memory
> instructions + complex loops (branch prediction), resulting in instructions
> being destroyed improperly.
>
> I've included the ROB, Commit, Fetch, DynInst and O3CPU debug flags.
> There are 192 ROB entries, which is why the instList size generally has a
> max of about 192 instructions.  The dynamic instruction counts (seen in the
> dramsim2 plot) also seem to imply that instructions are incorrectly being
> removed from the ROB, and then from the CPU's instruction list in cpu.cc,
> which allows more and more instructions to be added to the system (possibly
> from a bad branch).
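
A quick way to locate where the two counts part ways, without plotting, is
to scan each extracted series for the first sample above the ROB size.  The
sketch below assumes the data files hold "tick value" pairs in which the
tick may carry a trailing colon and the value trailing punctuation; that
format, the function names, and the margin parameter are all assumptions.
The limit of 192 is the ROB size quoted above.

```python
def parse_series(lines):
    """Turn 'tick value' lines into (tick, value) integer pairs.

    Lines that do not parse cleanly are skipped.
    """
    series = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            tick = int(parts[0].rstrip(":"))
            value = int(parts[1].rstrip(",.)"))
        except ValueError:
            continue
        series.append((tick, value))
    return series


def first_excess(series, limit=192, margin=0):
    """Return the first (tick, value) whose value exceeds limit + margin, or None."""
    for tick, value in series:
        if value > limit + margin:
            return tick, value
    return None
```

For example, first_excess(parse_series(open("cpuinstcount.out"))) would give
the first tick at which the live DynInst count exceeds the ROB size, which
can then be looked up in the full trace.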
>
> I appreciate any help in debugging this and further figuring out the root
> problem.  Just let me know if you need anything else from me.  I don't have
> much more time at the moment to debug, but I can take any advice for quick
> changes and/or additional traces, then send the results back to the list
> for discussion.
>
> Thanks,
> Andrew
>
> P.S. Paul - I did try decreasing the size of the dramsim2 transaction (and
> even command) queue from 512 to 32.  The same instruction problem
> occurred.  It basically just decreased the execution time.
>
> On Wed, Mar 14, 2012 at 2:10 PM, Ali Saidi <sa...@umich.edu> wrote:
>
>> The error is that there are more than 1500 instructions currently in
>> flight in the system. It could mean several things:
>>
>> 1. The value is somewhat arbitrarily defined and maybe there are more
>> than 1500 in your system at one time?
>>
>> 2. Instructions aren't being destroyed correctly
>>
>>
>>
>> You could try to run a debug binary so you'll get a list of
>> instructions when it happens, or increase the number, which may
>> be appropriate for certain situations (but 1500 is quite a few in-flight
>> instructions).
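
The kind of check described above can be sketched as a counter that is
bumped when an instruction is created, decremented when it is destroyed, and
compared against a fixed ceiling.  This is a toy model with hypothetical
names, not the gem5 source; the checks_enabled switch is an assumption meant
to mimic a build in which assertions are compiled out, which would match the
failure not showing up under gem5.fast.

```python
MAX_IN_FLIGHT = 1500  # ceiling quoted in the message above


class ToyInstCounter:
    """Counts live instructions and optionally enforces a ceiling."""

    def __init__(self, checks_enabled=True):
        self.instcount = 0
        self.checks_enabled = checks_enabled  # False models assertions compiled out

    def on_create(self):
        self.instcount += 1
        if self.checks_enabled and self.instcount > MAX_IN_FLIGHT:
            raise AssertionError(
                "instcount %d exceeds %d" % (self.instcount, MAX_IN_FLIGHT)
            )

    def on_destroy(self):
        self.instcount -= 1
```

If instructions are created faster than they are destroyed, the checked
variant fails on instruction 1501, while the unchecked variant keeps running
with a growing count.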
>>
>>
>>
>> Ali
>>
>> On 13.03.2012 10:56, Andrew Cebulski wrote:
>>
>> Hi Xiangyu,
>>     I just started looking into this some more.  So at first I thought it
>> was due to updating to a more recent revision, but then I went back to
>> revision 8643, added your patch, built and ran....and now get the error
>> with it too (when running ARM_FS/gem5.opt).  I'm testing now to see if an
>> update to SWIG might have resulted in this error; maybe someone on the
>> mailing list would know if that's possible.  The difference is 1.3.40 vs.
>> 2.0.3, both of which are supported according to the dependencies wiki page.
>> Just for completeness, here's the error from revision 8643:
>> build/ARM_FS/cpu/base_dyn_inst_impl.hh:149: void BaseDynInst::initVars()
>> [with Impl = O3CPUImpl]: Assertion `cpu->instcount
>>    I have not tried running with gem5.debug, so I will be doing that
>> today.  Maybe this is an assertion that is occurring due to an
>> optimization.  That would mean it wouldn't be triggered in gem5.debug since
>> it runs without optimizations.  Have you tested all debug, opt and fast
>> with your tests?
>> Thanks,
>> Andrew
>>
>> On Tue, Mar 13, 2012 at 1:37 PM, Rio Xiangyu Dong
>> <riosher...@gmail.com> wrote:
>>
>>>  Hi Andrew,
>>>
>>>
>>>
>>> I didn’t see this error in my simulations. May I ask which gem5 version
>>> you are using? I find that some of the latest code updates are not
>>> compatible with my changes. I am still using the DRAMsim2 patch on gem5
>>> revision 8643, and have run all the runnable benchmarks in SPEC2006,
>>> SPEC2000, EEMBC2, and PARSEC2 on ARM_SE.
>>>
>>>
>>>
>>> Thank you!
>>>
>>>
>>>
>>> Best,
>>>
>>> Xiangyu
>>>
>>>
>>>
>>> *From:* Andrew Cebulski [mailto:af...@drexel.edu]
>>> *Sent:* Thursday, March 08, 2012 6:52 PM
>>>
>>> *To:* gem5 users mailing list
>>> *Cc:* riosher...@gmail.com; sa...@umich.edu
>>>
>>> *Subject:* Re: [gem5-users] A Patch for DRAMsim2 Integration
>>>
>>>
>>>
>>>
>>>
>>> Xiangyu,
>>>
>>>
>>>
>>>    I've been having an issue recently with the number of instructions
>>> I've been seeing committed to the CPU (I have a separate thread on this).
>>> It turns out the issue seems to be coming from this patch you created to
>>> integrate DramSim2 with Gem5.  Unfortunately, I've been running with
>>> gem5.fast, not gem5.opt, so up until now I haven't been seeing
>>> assertions.  I thought I'd run it with gem5.opt or debug back in December,
>>> but I must not have.  My runs on the ARM O3 CPU fail with this assertion:
>>>
>>>
>>>
>>> build/ARM/cpu/base_dyn_inst_impl.hh:149: void BaseDynInst::initVars()
>>> [with Impl = O3CPUImpl]: Assertion `cpu->instcount
>>>
>>>
>>>
>>> -Andrew
>>>
>>>
>>>
>>>
>>>
>>> Date: Sun, 18 Dec 2011 01:48:58 -0800
>>> From: "Dong, Xiangyu" <riosher...@gmail.com>
>>> To: "gem5 users mailing list" <gem5-users@gem5.org>
>>> Subject: [gem5-users] A Patch for DRAMsim2 Integration
>>> Message-ID: gmail.com>
>>>
>>> Content-Type: text/plain; charset="us-ascii"
>>>
>>> Hi all,
>>>
>>>
>>>
>>> I have a Gem5+DRAMsim2 patch.  I've tested it under both SE and FS modes.
>>> I'm willing to share it here.
>>>
>>>
>>>
>>> For those who have such needs, please go to my website
>>> www.cse.psu.edu/~xydong to download the patch and test it.  To enable
>>> DRAMSim2, use the se_dramsim2.py script instead of se.py (for FS, you can
>>> create one yourself).  The basic idea for enabling the DRAMsim2 module is
>>> to use the derived DRAMMemory class instead of the PhysicalMemory class.
>>>
>>>
>>>
>>> Please let me know if there are bugs.
>>>
>>>
>>>
>>> Thank you!
>>>
>>>
>>>
>>> Best,
>>>
>>> Xiangyu Dong
>>>
>>>
>>>
>> _______________________________________________
>> gem5-users mailing list
>> gem5-users@gem5.org
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
