Thanks, everyone, for your replies. I have a better understanding now, but I
still have questions.

Let's consider a time buffer B between two consecutive pipeline stages X
and Y. When computing Y's output at cycle t, do we need the signal passed
from X at t or at t-1 (i.e., the struct in B with index t or t-1)? Similarly,
when computing X's output at cycle t, do we need to look at the status of Y
at cycle t or at t-1 (e.g., whether some hardware resource is available for
this cycle)? If both answers are t-1, which means the output of any stage
depends only on other stages' outputs from the previous cycle, then I can
understand why the time buffer can get rid of the ordering dependencies.
However, if a stage requires a result from another stage in the same cycle,
I cannot see how this works. Maybe hardware never does that, as it is not
actually "parallel" between stages. I am not an expert on hardware or
simulators, and I would really appreciate it if someone could help me
understand this.
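
To make the t-1 case concrete, here is a minimal Python sketch of how I
picture it. Everything in it is made up for illustration: ToyTimeBuffer,
simulate, and the toy rule that Y reports "ready" only when it received
nothing this cycle. It is not gem5's TimeBuffer class, only a model of the
idea that each stage reads what was written at t-1 and writes a slot that
becomes visible at t+1.

```python
class ToyTimeBuffer:
    """Hypothetical one-cycle-delay wire; not gem5's actual TimeBuffer."""

    def __init__(self, initial=None):
        self.prev = initial  # written during cycle t-1, readable in cycle t
        self.curr = None     # being written during cycle t, hidden until advance()

    def write(self, value):
        self.curr = value

    def read(self):
        return self.prev

    def advance(self):
        # End of cycle: what was written this cycle becomes readable next cycle.
        self.prev, self.curr = self.curr, None


def simulate(order, cycles=4):
    """Tick stages X and Y in the given order; return Y's output per cycle."""
    fwd = ToyTimeBuffer()              # X -> Y: data items
    bwd = ToyTimeBuffer(initial=True)  # Y -> X: "Y is ready" status
    state = {"next_item": 0, "y_out": None}
    trace = []

    def tick_x():
        # X only sees Y's status from the previous cycle.
        if bwd.read():
            fwd.write(state["next_item"])
            state["next_item"] += 1

    def tick_y():
        # Y only sees what X sent in the previous cycle.
        item = fwd.read()
        state["y_out"] = item
        bwd.write(item is None)  # toy rule: ready only if nothing arrived

    stages = {"X": tick_x, "Y": tick_y}
    for _ in range(cycles):
        for name in order:
            stages[name]()
        trace.append(state["y_out"])
        fwd.advance()
        bwd.advance()
    return trace


if __name__ == "__main__":
    print(simulate(("X", "Y")))
    print(simulate(("Y", "X")))
```

Because neither stage ever reads a slot that the other stage writes in the
same cycle, ticking X before Y or Y before X gives the same trace in this
sketch. My question is about the case where a stage would need to read the
other stage's current-cycle slot instead.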

Chen


On Sat, Jan 26, 2013 at 5:56 AM, Andreas Hansson <[email protected]> wrote:

>  Hi everyone,
>
>  A bit of Saturday philosophising...you have been warned.
>
>  All Discrete Event simulators that are used for
> SystemVerilog/SystemC/Verilog/VHDL etc (all describing inherently parallel
> behaviours) solve the problem "properly" by having delta cycles similar to
> the timebuffers in gem5. Hence, as Nilay points out, the order of
> processing events in the current tick should not matter as the update and
> the notification are separated in time.
>
>  There are definitely places in gem5 where the aforementioned methodology
> is not employed at the moment (and the event scheduling order does matter),
> but the o3 CPU should largely do it.
>
>  The problem with the method that e.g. SimpleScalar uses is that once
> there are loops and converging points in the system due to e.g. flow
> control it breaks down. What happened first, the request or the grant (or
> the arbitration, another request etc)? Unless you can build what looks like
> a spanning tree of dependencies the ordering breaks down.
>
>  Andreas
>
>   From: Mitch Hayenga <[email protected]>
> Reply-To: gem5 users mailing list <[email protected]>
> Date: Saturday, 26 January 2013 03:07
> To: "[email protected]" <[email protected]>, gem5 users mailing list <
> [email protected]>
> Subject: Re: [gem5-users] documents on O3 cpu implementation?
>
>  Nilay,
>
>  Ticking pipestages in reverse (and allowing values to propagate in that
> order) is a *very* common way to implement processor simulators. I'd almost
> call it the standard method.  Though gem5 gets around this via the
> timebuffer, other simulators do not use a timebuffer/pipe method.  For
> example, the processor simulator Yale Patt uses in his undergrad class,
> SimpleScalar, and PTLSim (if I remember right) call pipestages in reverse
> order.
>
>
>  Chen,
>
>  I think you are thinking of how other simulators work.  Specifically, look
> into how the timebuffer works.  Slides 125 and 126 of the ISCA 38 tutorial (
> http://gem5.org/Tutorials#ISCA_38) detail how timebuffers let gem5 get
> around a specific order of ticking pipestages.  Basically, each stage
> indexes by a cycle offset into a structure of passed signals.  This lets
> each stage write to its output without worrying about modifying the
> inputs being used, on the current cycle, by other pipeline stages.  At the
> end of the cycle, the timebuffer is "advanced".
>
>
> On Fri, Jan 25, 2013 at 8:06 PM, Nilay <[email protected]> wrote:
>
>>  On Fri, January 25, 2013 6:54 pm, Chen Tian wrote:
>> > Hi,
>> >
>> > Is there any document on O3 implementation? I cannot get my head around
>> > the logic where in each cpu tick, fetch stage is first ticked, then
>> > decode, rename, iew and finally commit. I always thought it should be in
>> > reverse order because an earlier stage in the same cycle does not know
>> > what resource will be available if processed first. Can somebody please
>> > explain to me? Thanks.
>> >
>>
>>  I am slightly astounded by your comment that the stages should be ticked
>> in reverse. We are trying to implement something that happens in parallel
>> in actual hardware. So, ideally all orders in which the different stages
>> can be processed should result in the same final state. There might be
>> things in the code that are specific to gem5's implementation of an O3 cpu
>> and hence require stages to be processed in a certain order.
>>
>>
>> The documentation on this page is probably the best you can find --
>> http://gem5.org/O3CPU
>>
>> --
>> Nilay
>>
>> _______________________________________________
>> gem5-users mailing list
>> [email protected]
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>
>
>
>
