On Fri, Sep 10, 2010 at 10:48 AM, Ali Saidi <[email protected]> wrote:
>
> On Fri, 10 Sep 2010 08:55:47 -0500, Ali Saidi <[email protected]> wrote:
>> We just changed the micro-ops in ARM so that any of the register
>> updating loads/stores were micro-coded, calculated the new address,
>> placed it in a temp register, loaded from the temp, and then moved the
>> temp into the real register. This solution worked fine for both O3 and
>> the simple cpus. Since ARM has some interesting options for PC
>> relative loads and loading the PC via a load, I think you can pretty
>> much accomplish anything with three carefully crafted micro-ops.

I think I mentioned this before, but it seems reasonable to me to do
store-updates in a single uop but require two uops for load-updates,
since a real pipeline would likely support one but not two register
writes per uop.  I realize the asymmetry is a little weird, but that's
life.  (Similarly, a store-update with a reg+reg addressing mode might
require two uops since you need to read three regs and not just two...
I vaguely recall that PowerPC might even have the restriction that
store updates can't use reg+reg addressing, even though it's available
for other memory accesses.)
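
To make the port-counting argument concrete, here is a rough sketch. It
assumes a hypothetical pipeline where each uop may read at most two
registers and write at most one; those limits, and every name below, are
made up for illustration and are not taken from M5 or any real ISA.

```python
from dataclasses import dataclass

# Assumed per-uop register port limits (illustrative only).
MAX_REG_READS = 2
MAX_REG_WRITES = 1


@dataclass(frozen=True)
class MemOp:
    is_load: bool     # True for a load, False for a store
    update: bool      # writes the effective address back to the base reg
    reg_offset: bool  # reg+reg addressing instead of reg+imm


def reg_reads(op: MemOp) -> int:
    reads = 1                 # base register
    if op.reg_offset:
        reads += 1            # index register
    if not op.is_load:
        reads += 1            # store data register
    return reads


def reg_writes(op: MemOp) -> int:
    writes = 0
    if op.is_load:
        writes += 1           # load destination register
    if op.update:
        writes += 1           # updated base register
    return writes


def uops_needed(op: MemOp) -> int:
    # Ceiling division: how many uops it takes to fit within each limit.
    by_reads = -(-reg_reads(op) // MAX_REG_READS)
    by_writes = -(-reg_writes(op) // MAX_REG_WRITES)
    return max(1, by_reads, by_writes)
```

Under these assumed limits, a reg+imm store-update fits in one uop, a
load-update needs two (two register writes), and a reg+reg store-update
needs two (three register reads).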

That would still require some fix for Gabe's
completeAcc-not-getting-called-in-O3 problem.  I'm a little confused
because I believe that the completeAcc callback is used for
store-conditionals to write back the success flag, which means that
(1) his solution of calling it right away won't work and (2) it must
be getting called somewhere in O3 since Alpha store-conditionals do
work there.
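
A toy model of why the store-conditional case depends on a deferred
callback. This is only a sketch: the classes are hypothetical, and the
method names merely echo initiateAcc/completeAcc rather than reproduce
the actual M5 interface.

```python
class ToyStoreConditional:
    """Toy split-phase store-conditional (not the M5 interface)."""

    def __init__(self):
        self.success_reg = None   # flag register, unwritten until completion
        self.pending = False

    def initiate_acc(self, memory, addr, value):
        # Phase 1: send the request. The outcome is not known yet,
        # so the success flag cannot be written here.
        self.pending = True
        memory.enqueue(addr, value, self)

    def complete_acc(self, succeeded):
        # Phase 2: runs when the response arrives and writes the flag.
        self.success_reg = 1 if succeeded else 0
        self.pending = False


class ToyMemory:
    """Toy memory that answers requests only when drained."""

    def __init__(self, reservation_valid):
        self.reservation_valid = reservation_valid
        self.queue = []
        self.data = {}

    def enqueue(self, addr, value, inst):
        self.queue.append((addr, value, inst))

    def drain(self):
        # Deliver responses in order; the store only takes effect
        # if the (assumed) reservation is still valid.
        while self.queue:
            addr, value, inst = self.queue.pop(0)
            ok = self.reservation_valid
            if ok:
                self.data[addr] = value
            inst.complete_acc(ok)
```

In this toy, the flag stays unset between initiate_acc() and drain(),
which is the window in which an immediate completion call would have
nothing valid to write back.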

Is there a reason not to update the register in initiateAcc?

All that said, I'm not necessarily opposed to piggybacking on the ARM
solution and just using multiple uops anyway.  The patent is a great
guideline, but we shouldn't feel overly constrained by it.

Steve
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
