Quoting Steve Reinhardt <[email protected]>:

On Fri, Oct 22, 2010 at 6:06 PM, Steve Reinhardt <[email protected]> wrote:
On Fri, Oct 22, 2010 at 3:59 PM, Steve Reinhardt <[email protected]> wrote:
I'd still really encourage you to work on cutting out the middleman
and find a way to go straight from raw bytes to StaticInsts via a
cache.

Just to be clear: what I mean is that we need a way to do the "tag
check" on the PC-indexed decoded page cache using raw bytes, so we can
determine hits there without invoking the predecoder.  If the decode
page cache misses and we have to repopulate it, then how we manage the
"backing" decode cache is probably not that big of an issue, and
probably does require going through the predecoder since otherwise we
won't know necessarily how long the undecoded instruction byte
sequence is.

In fact the main reason we need a cache there at all is so that we can
re-use the same StaticInst in multiple places; I'm not sure it really
saves that much time relative to doing the full decode.  (Probably
some, but I don't know how much.)

Just to reiterate this message: Nate's radix tree may or may not be a
good idea for directly looking up StaticInsts based on raw byte
sequences, but that's not what I was talking about; sorry for being
unclear.

I think what we really need is to replace or augment the ExtMachInst
that's currently stored in each StaticInst with the raw machine
instruction plus context info.  Then when we get a StaticInst from the
PC-based decode page cache, we can validate it by comparing the raw
machine instruction with the byte(s) we fetch and the current context,
without invoking the predecoder.  Once we get the predecoder out of
the path of decode page cache hits, I'm guessing the performance of
the predecoder itself won't matter so much anymore.
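
A minimal sketch of that hit check, purely to illustrate the idea. Every name here (StaticInstSketch, DecodePageCacheSketch, slowDecode, the context word) is hypothetical and not M5's actual code. In particular, the length rule in slowDecode is a made-up placeholder for the predecoder plus decoder, and the caller is assumed to pass at least one fetched byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a decoded instruction; not M5's StaticInst.
struct StaticInstSketch {
    std::vector<uint8_t> rawBytes;  // raw machine instruction bytes
    uint32_t context;               // assumed: mode bits that affect decoding
};

using InstPtr = std::shared_ptr<StaticInstSketch>;

class DecodePageCacheSketch {
    std::unordered_map<uint64_t, InstPtr> byPC;  // PC-indexed cache

  public:
    int slowDecodes = 0;  // counts trips through the slow path

    // Returns the instruction at pc. A hit is validated with a plain byte
    // compare plus a context compare, so the slow path is skipped entirely.
    InstPtr lookup(uint64_t pc, const uint8_t *fetched, size_t avail,
                   uint32_t context) {
        auto it = byPC.find(pc);
        if (it != byPC.end()) {
            const StaticInstSketch &si = *it->second;
            if (si.context == context && si.rawBytes.size() <= avail &&
                std::memcmp(si.rawBytes.data(), fetched,
                            si.rawBytes.size()) == 0) {
                return it->second;  // "tag check" passed
            }
        }
        // Miss or stale entry: repopulate through the slow path.
        InstPtr inst = slowDecode(fetched, avail, context);
        byPC[pc] = inst;
        return inst;
    }

  private:
    // Placeholder for predecode + decode. The length rule below is invented
    // for this sketch only; a real ISA needs the predecoder to find it.
    InstPtr slowDecode(const uint8_t *fetched, size_t avail,
                       uint32_t context) {
        ++slowDecodes;
        size_t len = (fetched[0] & 0x3) + 1;
        if (len > avail)
            len = avail;
        auto inst = std::make_shared<StaticInstSketch>();
        inst->rawBytes.assign(fetched, fetched + len);
        inst->context = context;
        return inst;
    }
};
```

Looking up the same PC twice with the same bytes and context should take the slow path only once; changing either the bytes or the context should force a re-decode.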

Steve
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev


Yeah, I think I get what you're saying. If we're gathering up an ExtMachInst at some point (and we will still have to, pretty much no matter what) then we might as well stick it in the StaticInst. It can be handy for debugging if nothing else.

Also, I wanted to mention that while Nate's radix tree might not fit with x86 universally, it would work better if there were per-mode (pre)decoders with per-mode caches. You'd then use the cache that matched the current circumstances, which eliminates the ambiguity. That could really pump up the memory overhead for the cache, since there could be a lot of redundancy between modes, but it still might not be that bad: the overhead wouldn't scale with the size of the simulation, just with the diversity of instructions being executed.
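
A rough sketch of what per-mode caches could look like, under some loud assumptions: the mode list is invented, all the names are hypothetical, and a std::map keyed on the byte sequence stands in for Nate's radix tree. The point is only that the same bytes get a separate entry per mode, so a lookup is never ambiguous.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Assumed set of decode-relevant modes; a real x86 model would have more.
enum class ModeSketch { Real = 0, Protected, Long, NumModes };

// Hypothetical decoded instruction, tagged with the mode it was decoded in.
struct DecodedSketch {
    std::vector<uint8_t> bytes;
    ModeSketch mode;
};

using DecodedPtr = std::shared_ptr<const DecodedSketch>;

class PerModeCachesSketch {
    // std::map stands in for a radix tree over the raw bytes.
    using Cache = std::map<std::vector<uint8_t>, DecodedPtr>;
    std::array<Cache, static_cast<size_t>(ModeSketch::NumModes)> caches;

  public:
    // Look up (or create) the entry for these bytes in the given mode.
    DecodedPtr get(ModeSketch mode, const std::vector<uint8_t> &bytes) {
        Cache &cache = caches[static_cast<size_t>(mode)];
        auto it = cache.find(bytes);
        if (it != cache.end())
            return it->second;
        auto inst = std::make_shared<const DecodedSketch>(
            DecodedSketch{bytes, mode});
        cache.emplace(bytes, inst);
        return inst;
    }

    // Total entries across all modes; redundancy between modes shows up here.
    size_t totalEntries() const {
        size_t n = 0;
        for (const Cache &c : caches)
            n += c.size();
        return n;
    }
};
```

The same byte sequence requested in two modes yields two distinct entries, which is the redundancy mentioned above. The total still grows only with the number of distinct instructions seen per mode.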


On a semi-tangent, I'm mulling over the benefits of turning the ISA object into the new ISA namespace. The ISA namespace would still exist, but it would be for the actual implementation behind the scenes. All the bits that would be exposed to the outside world (i.e. everything that isn't the ISA) would be brought into the class definition with typedefs, using directives, etc. That would make it a little clearer which things are there just because they're handy for that particular ISA, and which are part of the established interface that every ISA needs to implement. It would also allow templating classes on the ISA, which we can't do with the current namespace. The ability to define things in more than one file would still be preserved, because things would be defined as part of the namespace like they are now and just brought into the ISA object to export.
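
To make that a bit more concrete, here's a small sketch with two made-up ISAs. All of the names are hypothetical and none of this is existing M5 code. The namespaces hold the implementation, the ISA classes re-export only the shared interface, and a class template takes the ISA as a parameter, which a namespace can't be.

```cpp
#include <cstdint>

// Implementation namespaces, as today; they can span multiple files.
namespace X86ImplSketch {
    using MachInst = uint64_t;
    inline int maxInstBytes() { return 15; }  // x86 instruction length limit
    inline int handyX86OnlyHelper() { return 0; }  // deliberately not exported
}

namespace FixedImplSketch {
    using MachInst = uint32_t;
    inline int maxInstBytes() { return 4; }   // assumed fixed-width ISA
}

// ISA classes: only the agreed interface is brought in and exported.
struct X86IsaSketch {
    using MachInst = X86ImplSketch::MachInst;
    static int maxInstBytes() { return X86ImplSketch::maxInstBytes(); }
};

struct FixedIsaSketch {
    using MachInst = FixedImplSketch::MachInst;
    static int maxInstBytes() { return FixedImplSketch::maxInstBytes(); }
};

// Because the ISA is now a type, other classes can be templated on it.
template <class ISA>
struct FetchBufferSketch {
    typename ISA::MachInst inst{};
    static int capacity() { return ISA::maxInstBytes(); }
};
```

With this, FetchBufferSketch<X86IsaSketch> and FetchBufferSketch<FixedIsaSketch> are distinct types built from the same source, and anything not re-exported (like handyX86OnlyHelper) stays visibly ISA-private.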

The semi of semi-tangent comes in because the decode functionality would then be local to the ISA state so it could be more easily switched in and out, virtual or not virtual, as appropriate. It would also allow the predecoder and the decoder to coordinate with each other to share StaticInst cache information.

I'm sure there are downsides to all this, one obvious one being that we'd lose some of the current isolation between different types of ISA header files. These tend to include each other pretty freely, though, so it might not be that different.

I haven't even really decided if I like this idea for sure, but it sounded interesting enough that I thought I'd mention it.

Gabe