Ok... I removed the NoEffect versions from the exec context and
changed the *Operand versions to just call their non-operand
versions, which cleans up the code some. Alpha has no problem with
these changes, but they do uncover a bug in SPARC. The old code
somehow worked, but it's amazing that it did. The ASI register changes
the type of instruction that is executing (e.g. depending on the ASI,
an instruction is a prefetch, a load, a load with a different
addressing mode, etc.).
The types are decoded from the ASI bits in the ExtMachInst which are
inserted by the pre-decoder. The trouble is that these bits are
inserted in "fetch", which is before rename, so the IsSerializing
flags don't prevent it from happening. Thus, while the wrasi
instruction can be marked as Serializing, fetch has already read the
old value and created the following instructions based on the old
ASI. When wrasi commits, those stale instructions still flow through
the rest of the pipeline, decoded as the wrong type. It's amazing that
all the code
sequences that Gabe used to run on the sparc/o3 model never ran into
this issue. With the current code, if you executed alternating
wrasi and load w/asi instructions you would get some very odd
results. Anyway, the question becomes: how do we fix this? My
current solution is to add an instruction flag that flushes the CPU
after the flagged instruction commits, although I don't know if this
is enough or what other surprises I'll run into.
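To make the hazard concrete, here is a minimal toy sketch in Python.
It is not M5 code: run(), PROGRAM, the window depth, and the ASI
values are all made up for illustration. It assumes only what is
described above, namely that fetch snapshots the ASI when it
pre-decodes an instruction and that wrasi updates the register at
commit. The squash_after_wrasi switch stands in for the proposed
"flush after commit" flag.

```python
from collections import deque

# Hypothetical program: alternating wrasi / load-with-ASI instructions.
# The ASI values are arbitrary placeholders.
PROGRAM = [("wrasi", 0x80), ("ldasi", None),
           ("wrasi", 0x82), ("ldasi", None)]


def run(program, squash_after_wrasi, depth=3):
    """Toy pipeline: returns (asi_used_at_predecode, asi_at_commit) per load."""
    asi = 0                 # architectural ASI, only updated at commit
    pc = 0
    inflight = deque()      # entries: (pc, op, arg, asi_seen_by_predecoder)
    committed_loads = []

    while pc < len(program) or inflight:
        # Fetch: fill the window; the pre-decoder snapshots the current ASI.
        while pc < len(program) and len(inflight) < depth:
            op, arg = program[pc]
            inflight.append((pc, op, arg, asi))
            pc += 1

        # Commit the oldest in-flight instruction.
        ipc, op, arg, decoded_asi = inflight.popleft()
        if op == "wrasi":
            asi = arg
            if squash_after_wrasi:
                # Proposed fix: throw away everything younger and refetch
                # it, so it gets pre-decoded with the new ASI.
                inflight.clear()
                pc = ipc + 1
        else:
            committed_loads.append((decoded_asi, asi))
    return committed_loads


if __name__ == "__main__":
    print("no squash:", run(PROGRAM, squash_after_wrasi=False))
    print("squash:   ", run(PROGRAM, squash_after_wrasi=True))
```

In this toy model, without the squash both loads are decoded under a
stale ASI; with it, the ASI each load was decoded with matches the
architectural value at commit.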
Thoughts?
Ali
It seems like our notion of serializing came from the definition Alpha
uses, and sometimes we need a stronger one. x86 is going to have
similar issues in some cases, although I couldn't necessarily list
them for you offhand. The "nuke everything" flag you're proposing might
be the best solution because I doubt we'd ever need anything stronger
than that. Maybe you could make the CPU stop fetching too, but I don't
see how that would be useful and it's probably very hard to do.
This also highlights the usefulness of targeted tests of particular
features, like changing the ASI and then immediately using it, as
opposed to just getting specific workloads to work. The compiler, code
author, etc., only want to achieve a functional result, and they'll
probably use the same structures and features over and over again since
those work well, are equivalently good to the other options, etc.
There are swathes of x86, which granted is very large, that aren't
implemented at all and Linux boots just fine, but depending on how
picky you are you could say those areas are severely broken. The same
thing could be happening less intentionally elsewhere.
Gabe