Quoting Jack Whitham <[email protected]>:

> On Wed, Jun 24, 2009 at 09:30:11PM -0400, Gabriel Michael Black wrote:
>> > Also, is it the case that you can write the PC with any opcode?  I vaguely
>> > recall hearing that that was deprecated at some point.
>> >
>>
>> I don't know. I saw some places in the ARM manual where it said some
>> uses of R15 cause undefined behavior. I don't know where to find a
>> concise description of where it is and isn't allowed.
>
> I've also been looking at this problem. I want to get the O3 CPU working with
> the ARM ISA, and I've been making some progress. The two sticking points are
> predicated execution and the PC's "pseudo-GPR" nature. So, hopefully I
> can contribute something to your discussion.
>
> As I see it, the big problem is when the PC is a destination for
> an operation that normally produces a GPR. As a source, you
> can always get the right behaviour quite easily (as you said, + 8):
>> >> Example reading code:
>> >> Rn = (RN == PCReg) ? xc->readPC() : xc->readIntRegOperand(this, 0);
>
> I've gone through the ARM ARM, looked at all the integer instructions,
> and the writes to PC *always* fall into one of the following categories:
> 1. Performed via an explicit branch instruction (b, blx, etc.)
> 2. Performed by load multiple (e.g. ldmia).
> 3. Performed by an ALU or load operation that writes to Rd, with Rd =
>    15. (Rd is machine code bits 15..12 as defined in operands.isa).
>
> Case 1 is easy, same as any other CPU. Case 2 is only slightly more
> difficult, because you can generate a special "write to PC" uop
> resembling a conditional branch.
>
> Case 3 is the difficulty. But it's not so bad, because-
> * The PC is *always* written via Rd. Not Rs, Rm, etc.
> * No other registers are updated (except Cpsr).
>
> Here are the instructions that fall into case 3:
> ADC     ADD     AND     BIC
> CLZ     EOR     LDR     LDRB
> LDRBT   LDRH    LDRSB   LDRSH
> LDRT    MOV     MVN     ORR
> RSB     RSC     SBC     SUB
>
> It is possible to specify r15 as the output of some other instructions
> (e.g. MUL, MLA), but the results are not defined by the ISA.
>
> Clearly, there are a number of ways to handle this. Any of these
> instructions could write the PC, creating a branch.
>
> Here is my suggestion for how to handle case 3 instructions. I am
> relatively new to M5, so this may not be the best way, but some
> preliminary tests suggest it will work. First, the control flags need
> to change if the instruction writes to r15:
>
>     // base_dyn_inst.hh:
>     bool isControl()      const
>     { return staticInst->isControl() || isPCLoad(); }
>     bool isIndirectCtrl() const
>     { return staticInst->isIndirectCtrl() || isPCLoad(); }
>     bool isCondCtrl()     const
>     { return staticInst->isCondCtrl() || isPCLoad(); }
>
> The new function isPCLoad() returns true if the ISA is ARM
> and one of the destination registers is 15 (before renaming):
>
>     bool isPCLoad(void) const
>     {
> #if THE_ISA == ARM_ISA
>         for (int i = 0; i < numDestRegs(); i++) {
>             if (staticInst->destRegIdx(i) == TheISA::PCReg) {
>                 return true;
>             }
>         }
> #endif
>         return false;
>     }
>
> This way, we don't have to change the ISA definitions for case 3
> instructions. They automatically become branches if rD = 15.

These comments apply to all of the above. While I believe you that the
manual says the behavior is undefined if you write to R15 in these
other cases, that doesn't mean that software people want to run will
never attempt it and expect certain behavior. It could be that the
software is old or poorly written (or it's the compiler's fault), but
if at all possible I'd like to support those cases as much as we
reasonably can. If certain cases turn out to be too unreasonable to
deal with, for some definition of reasonable, we can let those go
unless and until they cause problems. I believe we'll find a general
mechanism that will handle those cases almost as easily as the defined
ones.

As for regular instructions (add, etc.) that need to behave
differently when they write to R15, we can detect that in the decoder
and set the right flags right there. That wouldn't be hard to do, and
it would be best if we can keep ISA-specific code out of the CPU.
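
To make the decoder idea concrete, here is a rough sketch of the kind
of check I have in mind. Every name in it (DecodedFlags, destRegField,
flagsForRdWriter, and so on) is made up for illustration and is not
existing M5 code. It assumes the decoder has already decided the
instruction is one of the "case 3" encodings with Rd in bits 15..12,
and it treats any condition code other than AL (0xE) as conditional,
which is a simplification of my own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; none of these are M5's real types.
constexpr unsigned PCRegIndex = 15;   // R15 is the ARM program counter
constexpr unsigned CondAlways = 0xE;  // "AL" condition code

// Control flags a decoder would attach to the decoded instruction.
struct DecodedFlags {
    bool isControl = false;
    bool isIndirectCtrl = false;
    bool isCondCtrl = false;
};

// Rd lives in machine code bits 15..12 for the case 3 encodings.
inline unsigned destRegField(std::uint32_t machInst)
{
    return (machInst >> 12) & 0xF;
}

// The condition code lives in bits 31..28.
inline unsigned condField(std::uint32_t machInst)
{
    return machInst >> 28;
}

// Meant to be called only for case 3 instructions (ALU ops and
// single-register loads that write Rd). If Rd is the PC, mark the
// instruction as an indirect branch at decode time.
inline DecodedFlags flagsForRdWriter(std::uint32_t machInst)
{
    DecodedFlags flags;
    if (destRegField(machInst) == PCRegIndex) {
        flags.isControl = true;
        flags.isIndirectCtrl = true;
        // Assumption: only AL counts as unconditional.
        flags.isCondCtrl = (condField(machInst) != CondAlways);
    }
    return flags;
}

// Small usage example with hand-assembled encodings.
inline void checkExamples()
{
    assert(flagsForRdWriter(0xE1A0F00E).isControl);   // mov pc, lr
    assert(!flagsForRdWriter(0xE1A00001).isControl);  // mov r0, r1
    assert(flagsForRdWriter(0x049DF004).isCondCtrl);  // ldreq pc, [sp], #4
}
```

With something like this the CPU model only ever sees the generic
control flags and never has to ask which ISA it is running.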


>
> However, this still leaves the problem of operations that load the new
> PC from memory, such as:
>     ldreq   pc, [sp], #4
>
> On O3, these are tricky because the new PC is not known until the
> load completes (LSQUnit::writeback), but branch mispredictions are only
> recognised when the load starts (LSQUnit::executeLoad). My suggestion
> for these is to generate a branch misprediction in LSQUnit::writeback,
> if (inst->isPCLoad()&&inst->mispredicted), and don't generate a branch
> misprediction in the regular place if inst->isPCLoad().


The right thing to do here might be to microcode loads that we know
are going to act as branches. The first microop would do the load, and
the second would actually perform the branch. That would require a
little work on the C++ side of the ISA description, but it wouldn't be
-that- painful. That would hopefully avoid putting any new ISA-specific
code in the CPU.
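
As a rough illustration of the load-then-branch split, here is a
stand-alone sketch. The types and names (Microop, MicroopKind,
TempRegIndex, expandLoad) are invented for this example and are not
M5's actual microop classes. It also assumes there is a microcode-only
scratch register available to carry the loaded value from the first
microop to the second.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names throughout; not M5's real microop machinery.
constexpr unsigned PCRegIndex = 15;    // R15 is the ARM program counter
constexpr unsigned TempRegIndex = 16;  // assumed microcode-only scratch reg
constexpr unsigned NoReg = ~0u;        // "no register" marker

enum class MicroopKind { Load, Branch };

struct Microop {
    MicroopKind kind;
    unsigned destReg;  // register written (NoReg for the branch)
    unsigned srcReg;   // register holding the branch target (NoReg for loads)
    bool isControl;    // what the CPU model would see as a branch
};

// Expand a single-register load into microops. An ordinary load stays
// as one microop. A load whose destination is the PC becomes a load
// into the scratch register followed by an indirect branch through it,
// so only the second microop is ever treated as control flow.
inline std::vector<Microop> expandLoad(unsigned rd)
{
    if (rd != PCRegIndex) {
        return { Microop{MicroopKind::Load, rd, NoReg, false} };
    }
    return {
        Microop{MicroopKind::Load, TempRegIndex, NoReg, false},
        Microop{MicroopKind::Branch, NoReg, TempRegIndex, true},
    };
}
```

For example, expandLoad(15) yields two microops with only the second
flagged as control, while expandLoad(0) yields a single plain load.
The point is that the branch microop gets its target from a register
like any other indirect branch, so the load itself needs no special
treatment in the LSQ.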

Gabe
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
