This looks pretty good to me.  There are a few things where I don't
disagree with what you're doing but would probably describe it
differently, and some other comments below.  Some of the comments are
details we already discussed that you're not explicitly mentioning; I
want to get them down for completeness, and to see whether you're
rejecting those ideas or just neglecting them.

On Tue, May 1, 2012 at 1:11 AM, Gabe Black <[email protected]> wrote:

> Building on that, I want to make the decode function a member of the
> ISA's Decoder class. Then it can use information stored in the object as
> context when decoding, preventing that information from having to be
> copied into every ExtMachInst. To handle the loss of context when using
> the ExtMachInst as a hash key or a tag in the page cache, the caches
> themselves would be selected based on the particular current
> configuration of the contextualizing state.
>

I think it should be a per-ISA decision whether to use per-context caches
or to include the context in the cache lookup.  This distinction isn't
visible outside the Decoder object (right?) so it shouldn't matter which
way a particular ISA decides to go.  Also the ISA can use different
policies for different caches (e.g., page cache vs hash cache).

In particular, there's no need to change Alpha to use per-context caches,
IMO.
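To make the distinction concrete, here's a rough sketch of the two options as
they might look inside a Decoder. All of the names here are illustrative
stand-ins, not gem5's actual types or API:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using MachInst = uint32_t;

// Option A: fold the context into the cache key, keeping a single cache
// (the natural choice when context rarely matters, e.g. Alpha).
struct KeyedCache {
    std::unordered_map<uint64_t, int> cache;  // int is a stand-in value type
    int *lookup(MachInst inst, uint8_t ctx) {
        uint64_t key = (uint64_t(ctx) << 32) | inst;
        auto it = cache.find(key);
        return it == cache.end() ? nullptr : &it->second;
    }
    void insert(MachInst inst, uint8_t ctx, int v) {
        cache[(uint64_t(ctx) << 32) | inst] = v;
    }
};

// Option B: select a whole cache instance per context (closer to the
// x86 proposal, where mode state picks which cache to consult).
struct PerContextCaches {
    std::unordered_map<MachInst, int> caches[4];  // one per mode, say
    int *lookup(MachInst inst, uint8_t ctx) {
        auto &c = caches[ctx & 3];
        auto it = c.find(inst);
        return it == c.end() ? nullptr : &it->second;
    }
};
```

Either way the choice stays hidden behind the Decoder interface, which is the
point: callers never see which policy a given ISA picked.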


> Because the decoder can now contain state, it no longer has to read the
> global FS/SE bool when decoding instructions; it can use its own local
> copy. That means different decoders in the same simulation can operate
> in different modes at the same time.
>

I can't think of a use for this off the top of my head, but more
flexibility is always good.  It does make me wonder though: does this help
at all in moving us toward multi-ISA simulations?


> To address the problem of the predecoder running all the time, I want
> to put a cache in front of it which knows three things about the
> instructions it's already seen: where they started, how long they
> were, and what bytes went into them. As the MachInsts come in, if the
> current PC matches one that was already seen, and the bytes that were
> seen last time are the same as the incoming bytes, then the whole
> decode process can be short-circuited and the StaticInstPtr that was
> generated last time can be returned again. Steve suggests this may
> make one of the other caches redundant, like the page cache. I think
> this is quite possibly true.
>
> The pipeline would then look like this:
> (MachInsts) -> MachInst cache[ctx] -> predecoder -> (ExtMachInsts) ->
> ?page cache? -> ExtMachInst hash[ctx] -> decode function.
>

My argument was that we should just move the page cache in front of the
predecoder and make it work on MachInsts rather than ExtMachInsts.
Assuming that the page cache in your diagram actually is redundant, and
that "MachInst cache[ctx]" is a page-based cache (which seems to match
your description), then I don't think we're disagreeing.  I think my
description is clearer though ;-).
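For reference, the PC-indexed front cache being discussed could look
something like this. This is only a sketch under assumed names; StaticInst
and StaticInstPtr here are simplified stand-ins for gem5's real types, and a
fixed-width MachInst is assumed:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>

using Addr = uint64_t;
using MachInst = uint32_t;
struct StaticInst {};                       // stand-in for the real class
using StaticInstPtr = std::shared_ptr<StaticInst>;

class MachInstCache {
    struct Entry { MachInst bytes; StaticInstPtr inst; };
    std::unordered_map<Addr, Entry> entries;
  public:
    // Return the cached StaticInstPtr only if both the PC and the raw
    // bytes match what was seen last time; otherwise return nullptr and
    // the caller falls through to the predecoder.
    StaticInstPtr lookup(Addr pc, MachInst bytes) const {
        auto it = entries.find(pc);
        if (it != entries.end() && it->second.bytes == bytes)
            return it->second.inst;
        return nullptr;
    }
    void update(Addr pc, MachInst bytes, StaticInstPtr inst) {
        entries[pc] = {bytes, inst};
    }
};
```

Whether this is backed by a hash table or organized per page is exactly the
kind of per-ISA detail the Decoder can keep to itself.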


> Here [ctx] is a reminder that the specific instance of the cache/hash
> has been selected based on the current contextual state.
>

Again, I would make that ISA-dependent and not baked into the structure.


> This new layout makes good sense for x86, but I don't want to make
> things less efficient for ISAs that have fixed size instructions or
> trivial predecoders. On the Alpha side, it may well be more efficient to
> leave out the MachInst cache and continue to use the page cache. If we
> do keep the MachInst cache, then it would be a waste to have machinery
> in there that stored/checked the size of the incoming instruction and
> moved bytes around to compare only the relevant parts. To me this all
> sounds like these pieces should be interchangeable and replaceable so
> that we can have a standard toolbox of steps and apply them
> intelligently per ISA.
>

Sounds good, though I would say "On the Alpha side, it may well be more
efficient to leave the page cache where it is instead of moving it in front
of the predecoder".  At some level that's equivalent, but I think that
describing it that way might cause you to view the organization a little
differently.  In particular, your toolbox might provide a page cache
component that can be used with either fixed or variable-length data, and
x86 would instantiate N of them per-context before the predecoder and Alpha
would have one of them post-predecoder.  I think that perspective might
lead to a different design than thinking that your toolbox needs to support
a pipeline that has one kind of cache before the predecoder and a slightly
different kind after the predecoder with both of them being optional, and
typical ISAs choosing one or the other... at least that's the picture I'm
getting from your continued discussion below.
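A reusable page-cache component along those lines might be parameterized on
what it stores, so x86 could instantiate it over raw bytes in front of the
predecoder while Alpha keeps one over decoded results behind it. A minimal
single-page sketch, with hypothetical names and a fixed instruction width
assumed:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Caches entries for one page at a time; touching a new page flushes it.
template <typename Value, unsigned PageBytes = 4096, unsigned InstBytes = 4>
class PageCache {
    static constexpr unsigned Slots = PageBytes / InstBytes;
    uint64_t pageAddr = ~0ULL;                     // no page cached yet
    std::array<std::optional<Value>, Slots> slots;
  public:
    std::optional<Value> lookup(uint64_t addr) const {
        if ((addr & ~uint64_t(PageBytes - 1)) != pageAddr)
            return std::nullopt;                   // different page: miss
        return slots[(addr & (PageBytes - 1)) / InstBytes];
    }
    void update(uint64_t addr, const Value &v) {
        uint64_t page = addr & ~uint64_t(PageBytes - 1);
        if (page != pageAddr) {                    // new page: flush
            pageAddr = page;
            slots.fill(std::nullopt);
        }
        slots[(addr & (PageBytes - 1)) / InstBytes] = v;
    }
};
```

The Value parameter is what makes it one tool rather than two: instantiated
over MachInsts it sits before the predecoder, over StaticInstPtrs it sits
after.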


> Along those lines, it would be nice to organize the decode flow as a
> call stack. [...] The tricky part is
> that A and B and C, etc., need to be plugged into each other at compile
> time, probably with templates, and because each step has somewhat
> different function signatures it gets a little more complicated. This
> may not be as complicated as I imagine it is, but a clean implementation
> hasn't dropped out of this yet.
>

My gut reaction (without thinking through in a lot of detail) is that
you're overcomplicating this.  I think having a library of components
like the page cache and the hash-based cache is useful, but to me the
big advantage of the Decoder object being a black box is that a
particular ISA can do whatever it wants inside.  For example, maybe
there's no need for the ExtMachInst struct at all in some ISAs, if the
output of the predecoder is never used without also knowing the
external context it depends on.  Thus trying to formalize what a decode
pipeline has to look like and trying to support that via some template
scheme strikes me as overkill.  I think in the latter part there you've
identified a useful pattern, but I don't see a big gain from
formalizing it.

Like I said, this is just a gut reaction, maybe there's more to it than I'm
seeing.

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
