Thanks, everyone, for your replies. I have a better understanding now, but I still have some questions.

Let's consider a time buffer B between two consecutive pipeline stages X and Y. When computing Y's output at cycle t, do we need the signal passed from X at t or at t-1 (i.e., the struct in B with index t or t-1)? Similarly, when computing X's output at cycle t, do we need to look at the status of Y at cycle t or t-1 (e.g., whether some hardware resource is available for this cycle)?

If both answers are t-1, which means the output of any stage depends only on other stages' outputs from the previous cycle, then I can understand why the time buffer gets rid of the dependencies. However, if a stage requires a result from another stage in the same cycle, I cannot see how this works. Maybe hardware never does that -- as it is not actually "parallel" between stages.

I am not an expert on hardware or simulators. I would really appreciate it if someone could help me understand this.

Chen

On Sat, Jan 26, 2013 at 5:56 AM, Andreas Hansson <[email protected]> wrote:

> Hi everyone,
>
> A bit of Saturday philosophising...you have been warned.
>
> All Discrete Event simulators that are used for
> SystemVerilog/SystemC/Verilog/VHDL etc (all describing inherently parallel
> behaviours) solve the problem "properly" by having delta cycles similar to
> the timebuffers in gem5. Hence, as Nilay points out, the order of
> processing events in the current tick should not matter as the update and
> the notification are separated in time.
>
> There are definitely places in gem5 where the aforementioned methodology
> is not employed at the moment (and the event scheduling order does matter),
> but the o3 CPU should largely do it.
>
> The problem with the method that e.g. SimpleScalar uses is that once
> there are loops and converging points in the system due to e.g. flow
> control it breaks down. What happened first, the request or the grant (or
> the arbitration, another request etc)? Unless you can build what looks like
> a spanning tree of dependencies the ordering breaks down.
>
> Andreas
>
> From: Mitch Hayenga <[email protected]>
> Reply-To: gem5 users mailing list <[email protected]>
> Date: Saturday, 26 January 2013 03:07
> To: "[email protected]" <[email protected]>, gem5 users mailing list <[email protected]>
> Subject: Re: [gem5-users] documents on O3 cpu implementation?
>
> Nilay,
>
> Ticking pipestages in reverse (and allowing values to propagate in that
> order) is a *very* common way to implement processor simulators. I'd almost
> call it the standard method. Though gem5 gets around this via the
> timebuffer, other simulators do not use a timebuffer/pipe method. For
> example, the processor simulator Yale Patt uses in his undergrad class,
> SimpleScalar, and PTLSim (if I remember right) call pipestages in reverse
> order.
>
>
> Chen,
>
> I think you are thinking of how other simulators work. Specifically, look
> into how the timebuffer works. Slides 125 and 126 of the ISCA 38 tutorial
> (http://gem5.org/Tutorials#ISCA_38) detail how timebuffers let gem5 get
> around a specific order of ticking pipestages. Basically, each stage
> indexes by a cycle offset into a structure of passed signals. This lets
> each stage write to its output without worrying about modifying the
> inputs being used, on the current cycle, by other pipeline stages. At the
> end of the cycle, the timebuffer is "advanced".
>
>
> On Fri, Jan 25, 2013 at 8:06 PM, Nilay <[email protected]> wrote:
>
>> On Fri, January 25, 2013 6:54 pm, Chen Tian wrote:
>> > Hi,
>> >
>> > Is there any document on O3 implementation? I cannot get my head around
>> > the logic where, in each cpu tick, the fetch stage is ticked first, then
>> > decode, rename, iew and finally commit. I always thought it should be in
>> > reverse order because an earlier stage in the same cycle does not know
>> > what resource will be available if processed first. Can somebody please
>> > explain to me? Thanks.
>> >
>>
>> I am slightly astounded by your comment that the stages should be ticked
>> in reverse. We are trying to implement something that happens in parallel
>> in actual hardware. So, ideally, all orders in which the different stages
>> can be processed should result in the same final state. There might be
>> things in the code that are specific to gem5's implementation of an O3 cpu
>> and hence require stages to be processed in a certain order.
>>
>>
>> The documentation on this page is probably the best you can find --
>> http://gem5.org/O3CPU
>>
>> --
>> Nilay
>>
>> _______________________________________________
>> gem5-users mailing list
>> [email protected]
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
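
To make the "t or t-1" question concrete, here is a minimal Python sketch of the double-buffering idea discussed in this thread. It is a toy model and not gem5's TimeBuffer code: the class ToyTimeBuffer, the stage functions, and the stall rule (Y stalls X after receiving an odd-numbered item) are all hypothetical and exist only for illustration. It assumes every cross-stage signal, forward or backward, is read with a one-cycle delay, i.e. both answers to the question are t-1.

```python
class ToyTimeBuffer:
    """Hypothetical two-slot buffer: readers see last cycle's value,
    writers fill a separate slot that becomes visible after advance()."""

    def __init__(self):
        self.current = None   # written during cycle t-1, readable at t
        self.pending = None   # being written during cycle t

    def write(self, value):
        self.pending = value

    def read(self):
        return self.current

    def advance(self):
        # End of cycle: this cycle's writes become next cycle's reads.
        self.current, self.pending = self.pending, None


def simulate(order, cycles=8):
    """Tick stages X and Y in the given order; return what Y saw each cycle."""
    fwd = ToyTimeBuffer()   # X -> Y: instruction ids (None is a bubble)
    bwd = ToyTimeBuffer()   # Y -> X: stall flag
    state = {"next_id": 0}
    received = []

    def tick_x():
        stalled = bwd.read()            # Y's status from cycle t-1
        if stalled:
            fwd.write(None)             # send a bubble
        else:
            fwd.write(state["next_id"])
            state["next_id"] += 1

    def tick_y():
        item = fwd.read()               # X's output from cycle t-1
        received.append(item)
        # Made-up rule: ask X to stall after every odd-numbered item.
        bwd.write(item is not None and item % 2 == 1)

    stages = {"X": tick_x, "Y": tick_y}
    for _ in range(cycles):
        for name in order:
            stages[name]()
        fwd.advance()
        bwd.advance()
    return received


forward = simulate(["X", "Y"])
reverse = simulate(["Y", "X"])
assert forward == reverse   # tick order does not change the result
print(forward)
```

In this sketch each stage reads only the `current` slot and writes only the `pending` slot, so nothing a stage writes in cycle t can be observed by the other stage until the buffers are advanced. That is why ticking X before Y or Y before X gives the same trace. A value that must be produced and consumed in the same cycle cannot pass through such a buffer; in this toy model it would have to be computed inside a single stage or ordered explicitly.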
