FWIW The 'latest' timing figures I had were (from Amdahl or Circle Computer Group, then from IBM):

   * 1995 Hitachi Skyline (bipolar):  RR operations = 3ns; cached SS
     ops = 6-10ns; non-cached SS ops = 60-80ns.
   * 2000+ IBM (CMOS): RR ops = 20ns.

The 1972 System/370 Model 145 might have done it in microseconds, but current processors do it in nanoseconds.

It is the instruction cache faults and SS ops that degrade performance. RR instructions are on average 20 times faster than SS ones: registers are hard-wired, VS is not.

You can try something like the following to check whether SLR/SR is faster than XR: run it with the SR coded, then again with the XR coded instead, and compare the average CPU time over several runs of each:

        TITLE  'LOOPTEST: CHECK ''S(L)R'' VS ''XR'''
        PRINT  ON,GEN
*
*---------------------------------------------------------------------*
* PROGRAM TO CHECK CPU DIFFERENCES : 'S(L)R' VS 'XR'                  *
*---------------------------------------------------------------------*
*
LOOPTEST CSECT                     START CONTROL SECTION
        EQUREGS                   EQUATE REGISTERS
*
BEGIN    STM   R14,R12,12(R13)     SAVE REGISTERS 14->12
        LR    R2,R15              R2  <-- EP
        USING LOOPTEST,R2         SAY SO
        ST    R13,SAVEBLK+4       BACKWARD POINTER
        LR    R14,R13             COPY CALLER'S R13
        LA    R13,SAVEBLK         R13 <-- MY SAVEAREA
        ST    R13,8(,R14)         FORWARD POINTER
*
        LHI   R4,X'7FFF'          R4  <-- 32767
        B     BIGLOOP             FORCE INSTRUCTION PREFETCH
        DC    64F'0'              INSTRUCTION CACHE FILLER
*
BIGLOOP  DS    0H                  OUTER LOOP
        LHI   R5,X'7FFF'          R5  <-- 32767
*
LITLOOP  DS    0H                  INNER LOOP
        SR    R12,R12             'SR' CHECK   (EITHER DO THIS ...
*        XR    R12,R12             'XR' CHECK    ... OR ELSE DO THAT)
        LHI   R12,X'7FFF'         PUT SOMETHING BACK IN R12
*
SKIPFILL DS    0H                  LOOP UNTIL DONE
        BCT   R5,LITLOOP          DO INNER LOOP
        BCT   R4,BIGLOOP          DO OUTER LOOP
*
RETURN   DS    0H                  EXIT
        L     R13,SAVEBLK+4       CALLER'S R13
        LM    R14,R12,12(R13)     RESTORE REGISTERS
        XR    R15,R15             CLEAR RETURN CODE
        BSM   0,R14               BACK TO CALLER
*
SAVEBLK  DC    18F'0'              18 FULLWORDS SAVEAREA INIT F'0'
*
        END   BEGIN               START FROM BEGIN ONWARDS
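
To turn the CPU times from those runs into something comparable, a small off-host calculation along the following lines may help. It is only a sketch: the loop counts come from the two LHI instructions above, but the function name and the sample figures are mine and purely illustrative, so substitute the step CPU times your own runs report.

```python
from statistics import mean

# Loop counts taken from the two LHI instructions in LOOPTEST.
OUTER = 0x7FFF
INNER = 0x7FFF
ITERATIONS = OUTER * INNER  # times the instruction under test executes


def per_iteration_delta_ns(sr_cpu_secs, xr_cpu_secs):
    """Mean CPU-time difference (XR minus SR) per inner-loop pass, in ns."""
    delta_secs = mean(xr_cpu_secs) - mean(sr_cpu_secs)
    return delta_secs / ITERATIONS * 1e9


# Placeholder CPU seconds, NOT measurements: replace them with the
# CPU times reported for your own runs of each variant.
sr_runs = [4.0, 4.5, 5.0]
xr_runs = [4.0, 4.5, 5.0]

print(ITERATIONS)
print(per_iteration_delta_ns(sr_runs, xr_runs))
```

With roughly a billion passes through the inner loop, a difference of one nanosecond per instruction would show up as about one second of CPU time, so anything much smaller than the run-to-run scatter is probably noise.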

My ha'pennyworth.



John Gilmore wrote:

The fact that the System/370 Model 145 instruction timings are later
than those for the System/360 does not enhance their value.  Some
simple souls judge that later is always better; but, while this is
certainly true for a loaf of bread, the inference from such examples
to current instruction timings is faulty.

Special casing/optimizing is pervasive in the implementations of
z/Architecture instructions; and here in particular the two cases

|          SR   Ri,Ri
|          XR   Ri,Ri

and

|          SR   Ri,Rj                i ¬= j
|          XR   Ri,Rj                i ¬= j

are trivially easy to distinguish at the hardware level.  There is
indeed evidence, very persuasive but unfortunately not conclusive
evidence, that the two register-zeroing special cases are identified
and optimized on some and perhaps all z/Architecture models.

Moreover, while reading things into other people's posts is always a
perilous undertaking, it seems to me that some authoritative posts in
this thread have come about as close to saying this as the proprieties
involved make possible.

John Gilmore, Ashland, MA 01721 - USA

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN


