An explanation for branch performance?

David Crayford Fri, 29 Apr 2016 05:42:07 -0700

We're doing some performance work on our assembler code, and one of my colleagues ran the following test, which gave a surprising result: unconditional branching can add significant overhead. I always believed that conditional branches were expensive because the branch predictor needed to do more work, and that unconditional branches were easy to predict. Does anybody have an explanation for this? Our machine is a z114. It appears to be even worse on a z13.


Here's the code.


I wrote a simple program - it runs a tight loop 1 billion times:


         L     R4,=A(1*1000*1000*1000)
         LTR   R4,R4
         J     LOOP
*
LOOP     DS   0D                  .LOOP START
         B     NEXT

NEXT     JCT   R4,LOOP

The loop starts with a branch ... I tested it twice - when the CC is matched (branch happens) and when it is not matched (falls through).


1. When the CC is matched and branching happens, CPU TIME=2.94 seconds

2. When the CC is not matched the code falls through, CPU TIME=1.69 seconds - a reduction of 42%
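
As a rough sanity check on those two figures, the totals can be turned into an average cost per loop iteration. The sketch below is only arithmetic on the numbers quoted above; the variable and function names are made up for illustration, and each per-iteration figure covers the whole loop body (the branch plus the JCT), not the branch alone.

```python
# Timings quoted in the post: total CPU seconds for 1 billion iterations.
ITERATIONS = 1_000_000_000
taken_s = 2.94        # CC matched, branch taken
fallthrough_s = 1.69  # CC not matched, code falls through


def per_iteration_ns(total_seconds, iterations=ITERATIONS):
    """Average CPU time per loop iteration, in nanoseconds."""
    return total_seconds / iterations * 1e9


taken_ns = per_iteration_ns(taken_s)
fallthrough_ns = per_iteration_ns(fallthrough_s)
extra_ns = taken_ns - fallthrough_ns  # average added cost of the taken branch
reduction_pct = (taken_s - fallthrough_s) / taken_s * 100

print(f"branch taken:  {taken_ns:.2f} ns per iteration")
print(f"fall through:  {fallthrough_ns:.2f} ns per iteration")
print(f"difference:    {extra_ns:.2f} ns per iteration")
print(f"reduction:     {reduction_pct:.1f}%")
```

On these numbers the taken branch costs about 1.25 ns more per iteration on average, which is where the 42% reduction comes from.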

