Taking the 13.2 version:
stmg %r11,%r15,88(%r15)
aghi %r15,-168
lgr %r11,%r15
lgr %r1,%r2
st %r1,164(%r11)
l %r1,164(%r11)
ms %r1,164(%r11)
lgfr %r1,%r1
lgr %r2,%r1
lmg %r11,%r15,256(%r11)
br %r14
You can easily see that the authors have not studied their PoP long enough.
The following sequence yields the very same result with only half the number of instructions. I'll let people who know more about pipeline delays comment on the other effects.
ST R1,84(R15)
LGFR R1,R2
MSGR R1,R2
LGR R2,R1
BR R14
And if the memory location addressed in lines 6, 7 and 8 (the st, l and ms instructions)
is only used as working storage, even the first line of my code can be omitted.
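
For anyone who wants to check the "same result" claim without a Z machine at hand, here is a rough C model of the two multiplies. It is only a sketch under a few assumptions: the function names are made up for illustration, the compiled source is taken to be a plain int square function, and R2 is assumed to arrive sign-extended to 64 bits as the calling convention is understood to require. It compares the low 32 bits left by ms with the low 32 bits left by lgfr + msgr.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: low 32 bits produced by the compiler's ms
   (32-bit multiply). Unsigned arithmetic is used so the wrap-around
   is well defined in C. */
static uint32_t ms_low32(int32_t n)
{
    return (uint32_t)n * (uint32_t)n;       /* wraps modulo 2^32 */
}

/* Hypothetical model: full 64-bit register produced by lgfr + msgr. */
static uint64_t msgr_full64(int32_t n)
{
    uint64_t wide = (uint64_t)(int64_t)n;   /* lgfr: sign-extend to 64 bits */
    return wide * wide;                     /* msgr: wraps modulo 2^64 */
}

/* Returns 1 when both sequences agree in the low 32 bits. */
static int low_halves_agree(int32_t n)
{
    return (uint32_t)msgr_full64(n) == ms_low32(n);
}

/* Self-check over a few edge cases; aborts if the halves ever differ. */
static void check_samples(void)
{
    static const int32_t samples[] = {
        0, 1, -1, 46341, 65536, INT32_MAX, INT32_MIN
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(low_halves_agree(samples[i]));
}
```

Calling check_samples() from a small main should pass silently. The low halves always agree; the high half of the register differs from the sign-extended form only when the square no longer fits in 32 bits, for example for 65536, which is signed overflow in C anyway.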
Martin
Produces the same result


On 07.05.24 19:01, Phil Smith III wrote:
See code produced by different compilers. (Search for "s390x" in the "choose 
compiler" box to find the Z compilers)
https://godbolt.org/

What strange hobbies some people have! (I'm including myself there)
