On Fri, Sep 10, 2010 at 1:07 PM, Gabriel Michael Black
<[email protected]> wrote:
>> That would still require some fix for Gabe's
>> completeAcc-not-getting-called-in-O3 problem.  I'm a little confused
>> because I believe that the completeAcc callback is used for
>> store-conditionals to write back the success flag, which means that
>> (1) his solution of calling it right away won't work and (2) it must
>> be getting called somewhere in O3 since Alpha store-conditionals do
>> work there.
>
> I don't know about store conditional, but I didn't see anywhere completeAcc
> was called on the path stores take in the load/store queue. There may be
> some other mechanism to handle that, but again I don't really know how those
> work in O3. If they -do- work using some other mechanism, then that would
> take care of (1), right?

Here's the answer, from line ~1244 of iew_impl.hh:

            } else if (inst->isStore()) {
                fault = ldstQueue.executeStore(inst);

                // If the store had a fault then it may not have a mem req
                if (!inst->isStoreConditional() && fault == NoFault) {
                    inst->setExecuted();

Not sure what this means about the right fix, but it does explain
how/why things work today.  I will go out on a limb and say that this
does seem like a really crufty workaround; it would be nice to treat
all stores the same way.

>> Is there a reason not to update the register in initiateAcc?
>
> Faults. If the access faults for some reason, you have to undo the register
> updates in initiateAcc for the instruction to appear atomic.

Of course, duh, I was stuck thinking about O3 and forgot about other models.
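
To make the fault issue concrete, here is a toy sketch of a store with update that defers the base-register write until the access is known to have succeeded. Everything in it (ToyContext, ToyStoreWithUpdate, the nextAccessFaults knob, the execute helper) is hypothetical and only borrows the initiateAcc/completeAcc vocabulary from this thread; it is not M5's actual interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model only: names echo the thread's vocabulary, not M5's real classes.
enum class Fault { None, PageFault };

struct ToyContext {
    std::array<std::uint64_t, 4> regs{};  // tiny architectural register file
    bool nextAccessFaults = false;        // hypothetical knob to force a fault
};

struct ToyStoreWithUpdate {
    std::size_t baseReg = 0;      // register holding the base address
    std::int64_t offset = 0;      // displacement added to the base
    std::uint64_t pendingBase = 0; // updated base, held until the access completes

    // Compute the new base and start the access, but leave registers alone.
    Fault initiateAcc(const ToyContext &ctx) {
        assert(baseReg < ctx.regs.size());
        pendingBase = ctx.regs[baseReg] + static_cast<std::uint64_t>(offset);
        return ctx.nextAccessFaults ? Fault::PageFault : Fault::None;
    }

    // Only called after the access succeeded, so nothing needs undoing.
    void completeAcc(ToyContext &ctx) const {
        ctx.regs[baseReg] = pendingBase;
    }
};

// Drive both phases; the register update is committed only when there is no fault.
Fault execute(ToyStoreWithUpdate &inst, ToyContext &ctx) {
    Fault fault = inst.initiateAcc(ctx);
    if (fault == Fault::None)
        inst.completeAcc(ctx);
    return fault;
}
```

With the update held in pendingBase, a faulting access leaves the base register as it was, which is the atomicity property Gabe describes. Writing the register in initiateAcc instead would require an explicit undo path.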

>> All that said, I'm not necessarily opposed to piggybacking on the ARM
>> solution and just using multiple uops anyway.  The patent is a great
>> guideline, but we shouldn't feel overly constrained by it.
>
> This is true, and I'm sure there are other factors that will push us out of
> sync with real implementations (like exactly how instructions are
> microcoded, for instance), but I'd still feel warmer and fuzzier if we could
> get the store with update to work as specified.
>
> Also a less defensible reason I'd like to make it work is that it'd be
> easier to get O3 to cooperate (I estimate, perhaps incorrectly) than to go
> through all the microcode and update everything to use two microops. I
> realize that's just me being lazy, but if we end up avoiding that it'd be
> nice.

This is getting off topic, but I think I've mentioned before that I
would really like to see the microcode at least semi-auto-generated
from templates... basically if you have templates that support
different addressing modes, plus a way to plug in the uop opcode(s)
that represent the computation, you could support most of the common
instructions much more simply and compactly, and you could handle
situations like this just by updating the appropriate template.  I
realize that's a pretty big change, but I suspect this won't be the
last time this kind of thing comes up.
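
As a very rough illustration of the template idea, the sketch below expands a macro-op into a uop sequence from an addressing-mode template plus a plugged-in compute opcode. The AddrMode names, the textual uop format, and the expand function are all made up for the example; the real microcode is written quite differently, so treat this as a shape rather than a proposal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical addressing-mode templates; not taken from the existing microcode.
enum class AddrMode { RegReg, RegMem, MemReg };

// Expand one macro-op into textual uops. The compute opcode is the only
// per-instruction input; load/store scaffolding comes from the template.
std::vector<std::string> expand(AddrMode mode, const std::string &computeUop) {
    assert(!computeUop.empty());
    switch (mode) {
      case AddrMode::RegReg:
        return { computeUop + " dst, dst, src" };
      case AddrMode::RegMem:
        return { "ld t1, [mem]",
                 computeUop + " dst, dst, t1" };
      case AddrMode::MemReg:
        return { "ld t1, [mem]",
                 computeUop + " t1, t1, src",
                 "st t1, [mem]" };
    }
    return {};  // unreachable for valid enum values
}
```

For example, expand(AddrMode::MemReg, "add") yields a load, the add, and a store. Changing how the memory forms are split into uops would then mean editing one case here rather than every instruction that uses that addressing mode.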

Steve