http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html

The Athlon (and Opteron) uses some clever tricks to handle Register Renaming and Out-Of-Order (OOO) processing, which allow it to shave some 25% off the integer pipeline. The design allows for a simple and fast scheduler that doesn't need special hardware to handle mis-scheduling caused by cache misses.

 

Register renaming is used to eliminate "False Dependencies" which limit the number of Instructions Per Cycle (IPC) that a processor can execute.  False Dependencies are the result of a limited number of registers. A register that holds an intermediate result needs to be re-used soon for another, maybe unrelated, calculation. Its value is then overwritten and not available anymore. The instruction that overwrites it must always wait for the instruction that needs the previous result.

 

This serializes the execution of the instructions and limits the IPC. This is especially true for an architecture like x86, which has a very small number of registers. The example below shows how register renaming can eliminate false data dependencies: register rC is overwritten by the 3rd instruction, so the 3rd instruction has to wait for the 2nd instruction, a False Dependency. With register renaming we can use an arbitrarily large register file. There is no need to re-use rC (r3). We can simply use another available register instead, register r7 in this case. The basic rule is that all of the instructions that are "in-flight" are given a different destination register (single assignment).

 

Non Renamed:   rC=rA+rB; rF=rC&rD; rC=rA-rB;

Renamed:       r3=r1+r2; r6=r3&r4; r7=r1-r2;
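The single-assignment rule can be tried out with a few lines of Python. This is only a sketch of the general idea, not of the Athlon hardware: the function name rename, the tuple layout and the physical register names p1, p2, ... are made up for illustration, and the numbering of the fresh registers differs from the r3/r6/r7 used in the example above.

```python
def rename(instructions, initial_map, first_free):
    """Give every destination a fresh physical register (single assignment).

    instructions: list of (dest, src1, op, src2) using architectural names.
    initial_map:  architectural name -> physical name holding its current value.
    first_free:   number of the first unused physical register (hypothetical).
    """
    mapping = dict(initial_map)
    next_free = first_free
    renamed = []
    for dest, src1, op, src2 in instructions:
        # Sources read whichever physical register currently holds the value.
        p1, p2 = mapping[src1], mapping[src2]
        # The destination never re-uses a register that is still in flight.
        pd = f"p{next_free}"
        next_free += 1
        mapping[dest] = pd
        renamed.append((pd, p1, op, p2))
    return renamed


# The three instructions from the example: rC=rA+rB; rF=rC&rD; rC=rA-rB
program = [("rC", "rA", "+", "rB"),
           ("rF", "rC", "&", "rD"),
           ("rC", "rA", "-", "rB")]
initial = {"rA": "p1", "rB": "p2", "rD": "p4"}

renamed_program = rename(program, initial, first_free=10)
for dest, a, op, b in renamed_program:
    print(f"{dest} = {a} {op} {b}")
```

After renaming, the 3rd instruction writes p12 while the 2nd reads p10, so the 3rd no longer has to wait for the 2nd. Only the true dependency of the 2nd instruction on the 1st remains.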

 

     1.10   Renaming the Integer Registers

 

Opteron has sixteen 64 bit architectural integer registers. Not visible to the programmer are eight more 64 bit scratch registers used to store intermediate results for microcode routines that handle more complex x86 instructions.  The Athlon family of processors handles Register Renaming in the simplest possible way, which is a compliment, because it often takes a lot of smart thinking to figure out how to do things in the simplest way!  People only rarely succeed in this ...

 

As we said, each instruction in flight needs a different destination register.  The total number of renamed registers must be equal to or larger than the number of instructions in flight plus the number of architectural registers.  The maximum number of instructions in flight is 72. Add the 16 architectural registers and the 8 scratch registers and you need 96 "renamed registers".  Two different structures are used to maintain these registers. The results of the instructions in flight are held in the result fields of the 72 entry Re-Order Buffer (ROB), and the architectural registers are held in the "Integer Future File and Register File" (IFFRF).
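The register budget is simple arithmetic and can be checked in a few lines. The constant names are invented for this sketch. The numbers come from the text: three decode lanes, a cycle counter that runs 0..23, sixteen architectural registers and eight scratch registers.

```python
DECODE_LANES = 3        # instructions dispatched per cycle
ROB_CYCLES = 24         # cycle counter values 0..23
ARCH_REGS = 16          # programmer-visible 64 bit integer registers
SCRATCH_REGS = 8        # microcode scratch registers

# One result field per instruction in flight.
IN_FLIGHT = DECODE_LANES * ROB_CYCLES

# Every in-flight result plus every architectural/scratch register needs a home.
TOTAL_RENAMED = IN_FLIGHT + ARCH_REGS + SCRATCH_REGS

print(IN_FLIGHT, TOTAL_RENAMED)
```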

   

 

Re-Order-Buffer Tag definition

    bit 7        bit 6  bit 5  bit 4  bit 3  bit 2        bit 1  bit 0
    wrap bit     re-order buffer index  0...23            sub-index  0..2
                 |-------------- Instruction In Flight Number --------------|

 

 

This configuration allows for a very simple renaming scheme which takes -zero- cycles...  Each instruction dispatched from one of the three decode lanes gets a "Re-Order Buffer Tag" or "Instruction In Flight Tag" consisting of:

 

1)   A sub-index 0,1 or 2 which identifies from which of the three lanes the instruction was dispatched.

2)   A value 0..23 that identifies the "cycle" in which the instruction was dispatched. The "cycle counter" wraps to 0 after reaching 23.

3)   A wrap bit. When two instructions have different wrap bits then the cycle counter has wrapped between the dispatches. 
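The three fields fit in one byte, and a small Python sketch shows how such a tag could be packed and compared. Several things here are assumptions and not taken from the text: the exact bit positions (wrap in bit 7, index in bits 6..2, sub-index in bits 1..0), the function names, the rule that lane 0 holds the oldest instruction of a cycle, and the age comparison itself. The comparison only makes sense for two tags that are both still in flight.

```python
def pack_tag(wrap, index, sub):
    """Build an 8 bit tag: wrap (1 bit) | index 0..23 (5 bits) | sub-index 0..2 (2 bits)."""
    if wrap not in (0, 1) or not 0 <= index <= 23 or not 0 <= sub <= 2:
        raise ValueError("tag field out of range")
    return (wrap << 7) | (index << 2) | sub


def unpack_tag(tag):
    """Return (wrap, index, sub) from an 8 bit tag."""
    return (tag >> 7) & 1, (tag >> 2) & 0x1F, tag & 0x3


def next_cycle(wrap, index):
    """Advance the cycle counter; it wraps to 0 after 23 and flips the wrap bit."""
    if index == 23:
        return wrap ^ 1, 0
    return wrap, index + 1


def is_older(tag_a, tag_b):
    """True if tag_a was dispatched before tag_b (both assumed in flight)."""
    wrap_a, index_a, sub_a = unpack_tag(tag_a)
    wrap_b, index_b, sub_b = unpack_tag(tag_b)
    if wrap_a == wrap_b:
        # No wrap in between: plain comparison of cycle, then lane.
        return (index_a, sub_a) < (index_b, sub_b)
    # The counter wrapped between the two dispatches, so the tag with the
    # larger index belongs to the earlier pass through the counter.
    return index_a > index_b


before_wrap = pack_tag(0, 23, 0)
after_wrap = pack_tag(1, 0, 0)
print(is_older(before_wrap, after_wrap))
```

Without the wrap bit, an instruction at index 23 would look younger than one at index 0 even though it was dispatched one cycle earlier.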

 

     1.11   The  IFFRF:  Integer Future File and Register File

 

 

This register file is used to maintain the 16 architectural registers and the 8 temporary scratch registers. It has two entries for each of the 16 architectural registers. One of the two can be viewed as the actual register as seen by the programmer. It gets its value when the instruction that produced it has "retired". An instruction is retired when it is certain that no exception or branch misprediction has occurred and all preceding instructions have been retired as well. The value of the register is then said to be "non-speculative".

 

 

40 entry Integer Future File and Register File:   IFFRF

    16 entries    Retired Architectural Register Values
    16 entries    Speculative Register Values:  "Future File"
     8 entries    Temporary Registers

 

 

Instructions-In-Flight and their results may be cancelled and discarded as long as they have not been retired.  Cancellation can be the result of a preceding instruction that caused an exception, or of a branch misprediction. Instructions-In-Flight are in principle always speculative. The results stay speculative even if the instruction has finished. The results only become non-speculative at retirement, when the retirement logic determines that no exception has occurred.
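The split between speculative and retired values can be modelled in a few lines of Python. This is a toy model under loud assumptions, not a description of the real circuit: the class and method names are invented, results are assumed to arrive in program order, and restoring the future file from the retired values on cancellation is a common recovery scheme that the text above does not confirm for the Opteron.

```python
from collections import deque


class IntegerRegisterModel:
    """Toy model: retired register values, a future file, and in-flight results."""

    def __init__(self, num_regs=16):
        self.retired = [0] * num_regs   # non-speculative, programmer-visible values
        self.future = [0] * num_regs    # newest speculative value per register
        self.in_flight = deque()        # (dest, value) in program order, like ROB result fields

    def complete(self, dest, value):
        """An instruction finishes: its result is still speculative."""
        self.in_flight.append((dest, value))
        self.future[dest] = value

    def retire_oldest(self):
        """The oldest instruction retires: its result becomes non-speculative."""
        dest, value = self.in_flight.popleft()
        self.retired[dest] = value

    def cancel_all(self):
        """Exception or branch misprediction: discard all speculative results.

        Assumption: the future file is reloaded from the retired values.
        """
        self.in_flight.clear()
        self.future = list(self.retired)


regs = IntegerRegisterModel()
regs.complete(3, 10)     # first write to register 3
regs.complete(3, 20)     # second, younger write to register 3
regs.retire_oldest()     # only the first write becomes architectural
regs.cancel_all()        # the second write is thrown away
print(regs.retired[3], regs.future[3])
```

In this run the younger value 20 is never seen by the programmer, because it was cancelled before it retired.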

 

