<snip> I know "it's the one not taken!". But of the B, J, or BR, can they be ordered? I am 99.9% certain that having the branch address in a register is the fastest. But is it significant enough that I should dedicate a register for it? I'm asking because I'm reoptimizing some code which is very heavily used. So heavy, that a 1% improvement is worthwhile. I am currently holding 4 different addresses in registers to speed up branch processing in my main loop. This loop is basically compressing blanks using a primitive RLE algorithm. </snip>
I ran a number of tests on our z9, but after a few tries I decided that the choice between B, J and BR makes a negligible difference. My tests ran through 12 different strings 1 million times, and I ran each test 5 times.

A simple CLI, BE, LA, BCT search loop was much slower than the SRST instruction. Like the CLCL you are using, SRST scans the data for you in one instruction, so you don't have to manage your own loop. Switching to SRST cut 50% off the reported service units in my tests.

When I changed from BE and BNE to JE and JNE, I actually increased service units by about 0.1 percent. When I changed to BER and BNER, I also saw a slight increase, probably because I had to prepare the destination register each time through the million trials.

Using TRT to find the data doubled the service units. Using TR to manipulate the data so that a second SRST could find the non-blanks more than doubled the times.

Conclusion: change the scan-for-blank loop to use SRST. You should get a lot of mileage out of that one change.
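
For anyone who has not used SRST, here is a minimal sketch of a scan-for-blank, not the code from my tests. BUFFER, BUFLEN, FOUND and NOBLANK are made-up names for illustration, R0-R2 are assumed to be the usual register equates, and the code assumes 24- or 31-bit addressing mode. The one thing that is easy to forget is condition code 3: the CPU is allowed to stop part way through, so you branch back to the SRST to resume.

```hlasm
*        Sketch only: BUFFER, BUFLEN, FOUND and NOBLANK are
*        hypothetical names, not taken from the tested program.
         LA    R0,C' '            GR0 = search byte X'40', rest zero
         LA    R2,BUFFER          R2 -> first byte to scan
         LA    R1,BUFFER+BUFLEN   R1 -> first byte past the scan area
SCAN     SRST  R1,R2              Search for the byte held in GR0
         BC    1,SCAN             CC3: stopped early, R2 updated, resume
         BC    2,NOBLANK          CC2: end reached, no blank found
FOUND    DS    0H                 CC1: R1 -> first blank found
```

On a hit, R1 points at the blank and R2 is unchanged, so R1 minus R2 is the length of the non-blank run that precedes it.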