Hi Gabe,

On 21/01/2018 06:34, Gabe Black wrote:
Hi folks. ExtMachInst is a type which is defined per ISA, and is *almost*
not used outside of ISA specific code. The three uses for it that I see
right now are as a type in the decode cache when decoding instructions, the
protobuf based instruction tracer, and the MinorCPU.

The decode cache is tightly bound to the decode function which is already
highly ISA dependent, so I don't forsee a big problem pushing those
together and handling them at the same time.

Sounds reasonable. Except for detailed arch-specific timing models and
debugging, there should be little need to expose the encoding in the
pipeline.

A simple solution would be to just use a 64-bit ExtMachInst
representation in the StaticInst interface and just stub it for x86.
That would retain existing functionality and wouldn't be particularly
intrusive.

The protobuf tracer unfortunately stores a raw 32 bit integer value to
record what the instruction was, and this field is required. I'd like to
make this field optional, and to also add an optional byte array which can
be used to store instructions which don't fit neatly into a 32 bit integer.
This shouldn't affect the efficiency of the protobuf format as it's used
today, it would just enable its use in more places. Also, the protobuf
tracer would need to call, say, a virtual function to encode the
instruction in whatever way is appropriate. Alternatively, all instructions
could be encoded as an array of bytes which would make that interface more
generic. It would be, I think, one byte less efficient storage wise when
storing instructions that are consistently 32 bits, but would be more
efficient for 16 bit instructions and equivalent for ones that would be
forced to use an array of bytes anyway.

Another option would be to completely remove the encoding from the Inst
message and add a add architecture-specific Encoding messages (possibly
using PB's extensions mechanism to extend Inst). Unfortunately, this has
the potentially make the code ugly if we want to maintain PB as an
optional dependency.

The MinorCPU really should not be reaching into instructions and looking at
their bytes. The idea that there's necessarily some sort of mask and
compare that can be done to determine what type of operation an instruction
is is very fragile, not only because it assumes those operations make
sense, but also that the instructions actually have such a pattern. This is
evidenced by the fact that there's already an #ifdef around this bit of
code.

We actually make extensive use of this functionality. Take the HPI core
model in configs/common/cores/arm as an example. In order to get good
correlation against existing in-order Arm cores, we use the bit matching
functionality to route instructions to FUs and assign latencies to
individual instructions. This has allowed us to achieve much tighter
correlation against existing in-order cores than what we could achieve

Latencies and FU routing has traditionally been done using instruction
classes in gem5. However, because different cores classify instructions
in slightly different ways, this would make the ISA code timing model
specific. This is clearly undesirable.

Is the MinorCPU used? The easiest thing to do would be to nuke it, but if
anyone is using it or is particularly attached to it we should try to keep
it. Alternatively, the interesting bits could be returned by a virtual
function on the StaticInst object. This still assumes that there's some
sort of magical pattern to mask and compare against, but at least it will
compile for everything and hides the ExtMachInst type within the ISA.
Really, since we're already determining whether an instruction is a branch,
integer operation, etc., we should just use those flags and not try to
redecode the instruction in the CPU. That also leaks instruction encoding
into the configuration which makes that more fragile and ISA specialized as
well. Dealing with the MinorCPU is the part of this I most want feedback on.

The minor CPU is used a lot. It's the model you need to simulate a small
core in a bitLITTLE configuration. Unfortunately, we simply can't get
good enough correlation against existing designs if we use just the
instruction flags, so keeping the bit matching magic is a hard
requirement from our sides.

Cheers,
Andreas


IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to