Re: [m5-dev] stores that update their base registers

Gabriel Michael Black Mon, 18 Oct 2010 20:16:05 -0700

Quoting Gabe Black <gbl...@eecs.umich.edu>:

Gabe Black wrote:

This has come up in ARM and also in X86 with its STUPD (store with
update) microop. The problem has been updating the base register when,
one, the instruction may fault after initiateAcc and the initial value
is lost, and two, completeAcc isn't called by O3. The problem is
compounded by the fact that O3 can speculatively update the register and
recover the old value if there's a fault, and the simple CPUs can't.


What if we changed the instructions that update the base to update the
base in initiateAcc and store the old value in an architecturally
invisible register? Then, if the instruction faults for whatever reason,
the fault object can know it needs to restore the old value of the base
before vectoring into the fault handler. If the instruction completes
normally the value of the base will be updated for consumption by later
instructions, and the value of the backup register can be ignored. I
don't -think- there would be performance distortions from this since the
actual number of sources/destinations doesn't matter, and this would be
at least a little more realistic and simulator level performant than
splitting things into microops.
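
A minimal sketch of the backup-register idea described above, using a toy
register file rather than M5's real classes. ToyRegs, ToyFault and
toyInitiateAcc are hypothetical names made up for illustration; only the
overall flow (update the base early, keep the old value in an invisible
register, let the fault object restore it) comes from the proposal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model only; none of these types are M5's real classes.
struct ToyRegs {
    std::array<std::uint64_t, 4> arch{};  // architecturally visible registers
    std::uint64_t backup = 0;             // hypothetical invisible backup register
};

// Hypothetical fault object that knows which base register to restore.
struct ToyFault {
    std::size_t baseIdx;
    void invoke(ToyRegs &regs) const {
        // Undo the early base update before vectoring into the handler.
        regs.arch[baseIdx] = regs.backup;
    }
};

// Stand-in for initiateAcc of a store with update: save the old base,
// write the new base immediately, and return the effective address.
inline std::uint64_t toyInitiateAcc(ToyRegs &regs, std::size_t baseIdx,
                                    std::int64_t disp) {
    assert(baseIdx < regs.arch.size());
    regs.backup = regs.arch[baseIdx];
    regs.arch[baseIdx] += static_cast<std::uint64_t>(disp);
    return regs.arch[baseIdx];
}
```

If the store completes normally, the backup value is simply ignored; if it
faults, calling invoke() on the fault object puts the old base back.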

This would be pretty easy to implement, I think, and would be entirely
contained in existing mechanisms in the ISA, so there isn't really any
question there. What I'd like to know is whether people think this is a
reasonable approach to this problem in the first place.

Gabe
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Hmm. This probably won't work. O3 would revert to the old value of the
backup register, I think, and the fault object would clobber the
correctly restored base register with that old value.
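
The suspected failure can be sketched with a toy state pair. This assumes,
as the message above guesses, that an O3-style squash rolls back every
destination register of the faulting instruction, including the backup
register. ToyState, toySquash and toyFaultRestore are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: one base register and one invisible backup register.
struct ToyState {
    std::uint64_t base;
    std::uint64_t backup;
};

// Assumed O3-like behaviour: a squash restores all destination registers
// of the faulting instruction to their values from before it executed.
inline ToyState toySquash(const ToyState &beforeInst) {
    return beforeInst;
}

// The fault object from the first proposal: copy backup into base.
inline void toyFaultRestore(ToyState &s) {
    s.base = s.backup;
}
```

Starting from base = 0x1000 with a stale backup of 0xdead, the squash
correctly returns base to 0x1000, but it also returns backup to 0xdead, so
the fault object's restore then overwrites the base with 0xdead.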

Gabe

OK, so, ideally we'd want to put any register updates for a store in
initiateAcc since they're not dependent on memory; that way other
instructions can use them sooner, O3 doesn't run completeAcc, etc. But
that doesn't work because SimpleCPU and O3 are inconsistent as far as
the commit points an instruction goes through as it runs. In SimpleCPU,
state is updated live, so every setIntReg is a commit point. In O3, the
instructions are updating a dyninst, so the commit point is the actual
commit stage. For regular instructions this is masked by writing back
results at the end of the execute function and only if there's no
fault, but for memory ops with initiateAcc and completeAcc, not all
possible faults have happened by the point the instruction loses
control. The actual commit points of the instruction then introduce
functional differences and break the consistency of the instruction
model.
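
The difference in commit points can be sketched with two toy CPU models.
ToyLiveCpu, ToyBufferedCpu and toyStoreWithUpdate are hypothetical names,
and the buffering here is a deliberately crude stand-in for a dyninst, not
how O3 is actually implemented.

```cpp
#include <cassert>
#include <cstdint>

// SimpleCPU-like toy: every setIntReg is immediately visible.
struct ToyLiveCpu {
    std::uint64_t base = 0;
    void setIntReg(std::uint64_t v) { base = v; }
    void finish(bool /*faulted*/) {}  // nothing left to commit or undo
};

// O3-like toy: writes go to a per-instruction buffer and only reach
// architectural state at commit, and only if there was no fault.
struct ToyBufferedCpu {
    std::uint64_t base = 0;
    std::uint64_t pending = 0;
    bool hasPending = false;
    void setIntReg(std::uint64_t v) { pending = v; hasPending = true; }
    void finish(bool faulted) {
        if (!faulted && hasPending) base = pending;
        hasPending = false;
    }
};

// Same "instruction" on either model: update the base in the
// initiateAcc step, then learn later whether the access faulted.
template <typename Cpu>
std::uint64_t toyStoreWithUpdate(Cpu &cpu, std::uint64_t newBase,
                                 bool accessFaults) {
    cpu.setIntReg(newBase);    // initiateAcc step
    cpu.finish(accessFaults);  // fault is only known after this point
    return cpu.base;           // architectural value seen afterwards
}
```

With a faulting access the live model is left holding the new base while
the buffered model still holds the old one, which is the functional
difference described above.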

The problem seems to be that O3 is smarter than SimpleCPU, or really
that O3 is more capable at undoing things that shouldn't have happened.
One solution might be to make SimpleCPU smarter, but why don't we make
O3 selectively dumber?

We might be able to solve this problem if we change the semantics of
initiateAcc, the access, and completeAcc for stores. We could do the
same for loads for symmetry, but I won't push for it because of the
arguments Steve made about base updating loads and the fact that it
might not work as well there. Anyway, instead of trying
(unsuccessfully) to string initiateAcc, the access, and completeAcc
together as one large atomic operation, let's make them all separate.
Once initiateAcc finishes, if it doesn't return a fault it commits. If
the access faults later that's handled, but the state written to in
initiateAcc is already permanent. If something needs to be rolled back,
initiateAcc needs to set up backup state like I talked about in my
earlier email. completeAcc would then never be called for stores.
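
The proposed sequencing for stores might look roughly like the following
toy model. ToyStoreFault, ToyCtx, toyStoreInitiateAcc and toyRunStore are
hypothetical names, and the 8-byte alignment check is an invented example
of a fault that initiateAcc itself could return; none of this is the real
M5 interface.

```cpp
#include <cassert>
#include <cstdint>

enum class ToyStoreFault { None, Alignment, PageFault };

// Toy context: the base register plus the backup state set up by
// initiateAcc.
struct ToyCtx {
    std::uint64_t base;
    std::uint64_t backup;
};

// initiateAcc stand-in: if it returns None, its writes are committed.
inline ToyStoreFault toyStoreInitiateAcc(ToyCtx &ctx, std::int64_t disp) {
    const std::uint64_t ea = ctx.base + static_cast<std::uint64_t>(disp);
    if (ea % 8 != 0)
        return ToyStoreFault::Alignment;  // faulted before writing state
    ctx.backup = ctx.base;  // backup state for a possible later rollback
    ctx.base = ea;          // permanent once we return None
    return ToyStoreFault::None;
}

// Whole store: initiateAcc, then the access as a separate step.
// There is deliberately no completeAcc call for stores.
inline ToyStoreFault toyRunStore(ToyCtx &ctx, std::int64_t disp,
                                 ToyStoreFault accessFault) {
    const ToyStoreFault f = toyStoreInitiateAcc(ctx, disp);
    if (f != ToyStoreFault::None)
        return f;
    if (accessFault != ToyStoreFault::None) {
        ctx.base = ctx.backup;  // fault handling uses the backup state
        return accessFault;
    }
    return ToyStoreFault::None;
}
```

Because the same three outcomes (no fault, fault in initiateAcc, fault in
the access) are handled identically regardless of the CPU model, every CPU
would see the same register state afterwards.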

This is nice because it means all CPUs can behave the same, we get all
the benefits of writing back state in initiateAcc, there's no simulated
performance overhead as far as I can see, the impact on existing ISA
code is minimal, and (I hope) it shouldn't be that hard to implement or
carry that much baggage for later.

So what do people think of this second version? Hopefully we don't need
a third :-).


Gabe
