Re: [gem5-dev] Review Request: CPU: Merge the predecoder and decoder.

Gabe Black Thu, 24 May 2012 00:43:25 -0700

On 05/21/12 15:14, Steve Reinhardt wrote:
> On Mon, May 21, 2012 at 3:48 AM, Gabe Black <[email protected]> wrote:
>
>>  I'll move over to email since this is turning into a discussion more than
>> review feedback. It's not the first thing on my agenda, but fairly high up
>> is going to be simplifying this interface at least a little bit. I did some
>> grepping, and I think the only place getExtMachInst is used is in the
>> checker right now.
>>
> Can that usage be easily eliminated?  That's one of my main comments on
> this patch: you eliminated it in a couple of other places (like
> simple/base.cc, inorder/resources/fetch_unit.cc, and legiontrace.cc) and it
> wasn't clear to me why it wasn't equally easy to eliminate it in the
> checker.  If you did that, then we could get rid of getExtMachInst()
> entirely (or at least make it provate to the decoder).


Sort of. What the checker is doing is somewhat broken. It's trying to
compare the underlying binary version of the two instructions, I presume
to make sure they're the same thing. It does that by reading out the
ExtMachInst with getExtMachInst, but it then assigns that to a MachInst.
Later on it gets the machInst member out of the StaticInst, casts it to
a MachInst as well, and compares that against the first one. There are
at least two problems with this. First, it assumes you *can* cast an
ExtMachInst into a MachInst. This is why there are #ifdefs which exclude
that code for x86. Second, it assumes that by casting down to a
MachInst, you still get relevant information about whether two
instructions are the same. There's a good chance that's true, but
there's no guarantee, and it also throws away the contextualizing state.

One thing we could do to "fix" the problem, which would itself be a bit
broken, would be to compare the StaticInstPtrs from the two decoders.
Because we presumably have a cache that will return the same
StaticInstPtr for the same input, if they're the same instruction they
*should* end up as the same object in the end, but that's a pretty
fragile assumption.

Another option would be to eliminate that check entirely.


>
>
>> extMachInstReady can be renamed to instReady.
>>
> That sounds good.
>
>
>> We may be able to get rid of it entirely and rely on decode() returning
>> NULL if nothing is ready, but I don't want to go quite that far without
>> looking at current uses carefully and considering what that might preclude
>> doing in the future. That would primarily simplify things for everything
>> except ARM and x86 which have to keep track of that anyway.
>>
> I'm fine with leaving that as a future possibility.
>
>
>> The interface I'd most like to get rid of, though, is the one where you
>> can pass an ExtMachInst to the decoder and have it decode it for you. In
>> x86, this means having to check whether the thing you're decoding is the
>> thing you just predecoded in case somebody has some other ExtMachInst lying
>> around they want to toss in. That isn't possible if the decode(pc) style
>> interface is used. This is especially painful in x86 because ExtMachInsts
>> are big and relatively complicated to compare. I have some changes I'm
>> hammering the bugs out of which seem to improve x86 performance in atomic
>> mode on twolf by about 10%, but I lose about 3-4% of that with the
>> ExtMachInst comparison.
>>
>> It looks like the only consumer of the old interface that won't be easy to
>> move over is O3 and its noops. Any ideas how to build a working StaticInst
>> outside of the parser and its execute function duplicating magic? The magic
>> I'm referring to is how it makes a copy of execute for each possible
>> ExecContext based on what CPUs are build it.
>>
> Agreed, this would be great, but it can wait for a later patch.
>
> Would it be possible for each ISA to internally generate a no-op StaticInst
> during initilization (e.g., during the Decoder constructor or init(), by
> passing in a no-op ExtMachInst that only that ISA's Decoder knows about),
> then just have a function that returns a pointer to it?  I.e., if the goal
> is just to get a no-op StaticInst, why not just have the Decoder provide
> that directly instead of tricking it into providing one by passing in a
> special ExtMachInst?

Yes, this is definitely something for a future patch. Each ISA could
define a Nop instruction class which could be exported through the ISA
namespace, and that would work out pretty decently. I'd like for those
sorts of things to be more standalone and not have to have all the ISA
goop in there, especially since they'll be virtually identical between
ISAs, but it's not immediately clear how that would work.

>
> Steve
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] Review Request: CPU: Merge the predecoder and decoder.

Reply via email to