Dear all,
Having chewed on this for a week, I'd like to invite some discussion. It
seems to me that the most general problem outlined by Brad is massaging Ruby
into accepting various flavors of non-timing accesses.

In particular, a working Ruby functional access could:
- Aid in cache warm-up
- Help deal with devices
- Maybe other useful things too (I'm no M5 expert)

The question is: How do we go about implementing these accesses? I get the
impression that functional and timing accesses will occasionally intermix in
the memory system, and that 'the right thing' should happen when
functional/timing accesses mix. This presents a problem for an arbitrary
protocol, as a functional access can/will occur at exactly the wrong time --
e.g., when a block is in a blocking transient state.

The simplest solution seems to be the most obvious as well: quiesce the
memory system for each non-timing access, then handle the functional access
in isolation. This means that each functional access becomes a timing
transient instead of a correctness problem.
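To make the idea concrete, here is a minimal toy sketch of that drain-then-access flow. None of these names (ToyRubySystem, outstanding, tick) are real gem5/Ruby API; they just model "run events until no transient state remains, then do the access":

```python
# Hypothetical sketch: quiesce Ruby before servicing a functional access.
# ToyRubySystem and its methods are illustrative stand-ins, not gem5 code.

class ToyRubySystem:
    def __init__(self):
        self.outstanding = []   # in-flight timing requests (transient state)
        self.memory = {}        # addr -> data, valid once drained

    def tick(self):
        """Retire one in-flight request (stand-in for running one event)."""
        if self.outstanding:
            addr, data = self.outstanding.pop(0)
            self.memory[addr] = data

    def drain(self):
        """Quiesce: run events until no block is in a transient state."""
        while self.outstanding:
            self.tick()

    def functional_read(self, addr):
        # Drain first so every block is in a stable (base) state; the
        # access is then trivially correct, at the cost of perturbing
        # the timing of whatever was in flight.
        self.drain()
        return self.memory.get(addr)

rs = ToyRubySystem()
rs.outstanding.append((0x40, b"new"))
assert rs.functional_read(0x40) == b"new"   # sees the drained write
```

The cost is exactly the timing transient described above: every functional access forces the in-flight work to complete early.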

Unfortunately, the architecture of Ruby requires events to 'run' in order to
service requests. If functional accesses are to be useful for warming
caches, they have to affect coherence and permissions state in an
approximately correct manner. One way or another, this means fast-forwarding
Ruby.

This is the area around which I'd like to invite discussion.
- Is quiescing Ruby the right way to implement functional access?
- What is the best way to go about fast-forwarding Ruby events?

Regards,
Dan

On Wed, Aug 18, 2010 at 1:18 PM, Beckmann, Brad <[email protected]> wrote:

>  Hi All,
>
>
>
> Yesterday a few of us from AMD and Wisconsin met and discussed the next
> tasks for GEM5.  Specifically, there are a couple (possibly more?) new
> graduate students at Wisconsin who are starting to ramp up on the
> simulator.  While we spent some time discussing short-term projects for the
> new graduate students, the majority of the time was focused on the remaining
> steps necessary before Ruby can supply data to M5 cpus while using warmed-up
> cache traces.  Below is a summary of our meeting as well as a discussion of
> the possible directions we can take.  I’m sending this summary out so that
> others can comment and provide feedback.
>
>
>
> Please let me know if you have any questions,
>
>
>
> Brad
>
>
>
>
>
> Short-term projects for the new Wisconsin graduate students
>
> -          Incorporate work completed/transaction completed metrics to
> Simulate.py
>
> -          Include randomization support into the memory system to
> simulate multiple execution paths
>
>
>
> Tasks required before Ruby can supply data to M5 cpus while using a
> warmed-up cache trace.
>
> -          Add support for cache flushes within the protocols
>
> o   This mechanism is required by certain x86 instructions and memory
> types
>
> o   Furthermore it could be leveraged to create checkpoints that include
> both valid main memory data as well as a cache warmup trace with valid
> data.  (More on this topic below)
>
> -          Provide support for allowing certain simobjects to be scheduled
> on the event queue without advancing sim_ticks
>
> o   In order to run a cache warmup trace through Ruby, Ruby requests need
> to be executed and Ruby simobjects need to be scheduled.  However, at the
> end of warmup, the simticks and the rest of the simulator state need to be
> consistent with the loaded checkpoint.
>
> §  Currently, we (at AMD) have an internal patch that achieves this
> functionality by leveraging the fact that Ruby objects still use the Ruby
> eventqueue API.  During this warmup phase, the Ruby eventqueue detaches from
> the M5 event queue and instead uses the old ruby event queue implementation
> to schedule events.  Once the warmup is complete, the Ruby eventqueue
> reattaches to the M5 event queue.  This obviously is not the real way we
> want to do this because eventually we want all Ruby events to directly use
> the M5 event queue API.  That is why I don’t have plans to check in this
> current patch to the public tree.
>
> o   One possible solution would be to identify which events can be
> scheduled during cache warmup and assume all other events can only be
> scheduled during actual execution.
>
> §  I’d be interested to know how complicated others believe a solution like
> that would be.
>
> -          Once the two above tasks are complete, we should be able to
> create cache warmup traces with data and also provide valid data in main
> memory.
>
> o   The motivation for providing valid data in both the cache trace and
> main memory is that we maintain the flexibility of allowing each protocol to
> create their own policies for handling dirty data.
>
> o   The specific mechanism would be to record the cache trace using Ruby’s
> current CacheRecorder (I’ve already revitalized this code in GEM5) and then
> use the cache flush mechanism to flush dirty data to main memory before
> checkpointing memory.
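
That record-then-flush recipe can be sketched in a few lines. The structures here (a caches dict, a Line with a dirty bit) are illustrative stand-ins; Ruby's actual CacheRecorder and flush machinery are more involved:

```python
# Hypothetical sketch of the checkpoint recipe: record the cache trace,
# flush dirty blocks so main memory is valid, then snapshot both.

class Line:
    def __init__(self, data, dirty, state="M"):
        self.data, self.dirty, self.state = data, dirty, state

def checkpoint(caches, main_mem):
    # 1. Record the warmup trace (address + coherence state per line).
    trace = [(addr, line.state) for addr, line in caches.items()]
    # 2. Flush: write dirty data back so main memory is globally valid.
    for addr, line in caches.items():
        if line.dirty:
            main_mem[addr] = line.data
            line.dirty = False
    # 3. Both the trace and the memory image are now safe to checkpoint.
    return trace, dict(main_mem)

mem = {0x0: b"old"}
trace, mem_img = checkpoint({0x0: Line(b"new", dirty=True)}, mem)
assert mem_img[0x0] == b"new"   # dirty data reached memory
```

The point of the ordering is the flexibility noted above: because memory is made valid at checkpoint time, each protocol remains free to choose its own dirty-data policy during replay.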
>
> -          Add functional access support to Ruby.
>
> o   One possible way to add functional access support to Ruby is to
> quiesce or drain outstanding Ruby requests when a dynamic functional access
> is initiated and actually perform the access after Ruby has been drained of
> any outstanding requests.
>
> §  Therefore, if all cache blocks are in a base state, then Ruby can use
> the protocol-independent AccessPermissions on the cache and data blocks to
> determine which blocks should be read and written.  This would only require
> additional set state operations in the Directory sm files to set the
> AccessPermissions for a block.  The cache sm files already do this, so the
> impact to the protocols would be minimal.
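
A tiny sketch of that permission check follows. The value names loosely mirror Ruby's AccessPermission enum, but the table itself is illustrative, not the real implementation:

```python
# Hypothetical mapping from per-block access permissions to what a
# functional access may do, once all blocks are in a base state.

READABLE  = {"Read_Only", "Read_Write"}
WRITEABLE = {"Read_Write"}

def functional_ok(perm, is_write):
    if perm == "Busy":        # transient state: cannot decide safely
        return False
    return perm in (WRITEABLE if is_write else READABLE)

assert functional_ok("Read_Only", is_write=False)
assert not functional_ok("Read_Only", is_write=True)
assert not functional_ok("Busy", is_write=False)
```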
>
> §  However, the obvious disadvantage to this approach is that every
> functional access will perturb the timing of the system.
>
> o   Another possible approach is to add atomic access support to Ruby and
> utilize this for functional accesses.
>
> §  Basically, instead of using events resulting from receiving timed
> messages to transition between base states, use function calls.
>
> §  I’m not sure how to make this work, and I fear that it will restrict
> how protocols are defined.  Furthermore, it seems it would be extremely
> complicated to get these atomic accesses to work while timing accesses are
> active in the system and cache blocks are already in some sort of transient
> state.  Overall, I suspect this is not easily feasible, but I could be
> wrong.
>
> o   A third option is to restrict functional writes to only initialization
> and allow dynamic functional reads to be only a best effort and not
> guaranteed to be correct.
>
> §  I believe Steve has already discussed such a possibility with Nate and
> Ali.
>
> §  Functional writes at initialization are trivial to support in Ruby
> since all blocks are in a steady state.  I actually already have an internal
> patch that provides some of this support.  If functional reads are relaxed
> to be a best effort, then Ruby will almost always succeed in reading valid
> data using AccessPermissions without quiescing the system and only rarely
> fail because a block is in a transient state.
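
A best-effort read under that relaxation might look like the sketch below: scan the copies of the block across controllers, return data from any readable copy, and report failure (instead of quiescing) when every copy is transient. The function and the (permission, data) tuples are hypothetical:

```python
# Hypothetical best-effort functional read: no quiescing; the rare
# all-transient case is reported as a failure the caller can handle.

def best_effort_read(copies):
    """copies: list of (permission, data) pairs across controllers."""
    for perm, data in copies:
        if perm in ("Read_Only", "Read_Write"):
            return True, data
    return False, None        # block only in transient states: give up

ok, data = best_effort_read([("Busy", None), ("Read_Only", b"v")])
assert ok and data == b"v"
ok, _ = best_effort_read([("Busy", None)])
assert not ok                 # rare failure instead of a drain
```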
>
>
>
> Other outstanding tasks to keep in mind
>
> -          Merging stat files
>
> o   We’ve discussed this in several previous emails, but I just wanted to
> reiterate it here.
>
> -          Allow I/O requests/responses and interrupts to flow through the
> Ruby network.
>
> o   Currently these are simply routed to a classic M5 bus, but these
> should be included in the Ruby network.
>
> o   It will take some effort to make this work, but it shouldn’t be too
> hard.
>
>
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
>
>


-- 
http://www.cs.wisc.edu/~gibson [esc]:wq!