On 4/4/2016 7:24 AM, Gary Weinhold wrote:
Even if there's no actual performance difference for these instructions, wouldn't the "not setting the CC" possibly improve the pipeline, since the hardware knows the next conditional branch does not have to wait for this instruction to be evaluated for affecting the CC?
Indeed! Avoiding CC "interlock" is exactly why an entire suite of "compare and branch" instructions was created! :-)
However, that should not apply in this situation since SLR sets the CC in exactly the same way as all other logical subtract instructions. Both SR and SLR set the CC. Only the meaning of the bits is different.
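To illustrate the point that both instructions set the CC and only the interpretation differs, here is a rough Python model of the condition codes as I understand them from the Principles of Operation. The function names (`sr_cc`, `slr_cc`, `to_signed32`) are made up for this sketch, and it models only the CC, not timing or pipeline behavior.

```python
MASK32 = 0xFFFFFFFF  # operands are treated as 32-bit register contents


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sr_cc(a, b):
    """CC after SR (signed subtract): 0 zero, 1 negative, 2 positive, 3 overflow."""
    true_result = to_signed32(a) - to_signed32(b)
    if not -(1 << 31) <= true_result <= (1 << 31) - 1:
        return 3  # result does not fit in 32 signed bits
    if true_result == 0:
        return 0
    return 1 if true_result < 0 else 2


def slr_cc(a, b):
    """CC after SLR (logical subtract): 1 borrow, 2 zero/no borrow, 3 nonzero/no borrow."""
    a &= MASK32
    b &= MASK32
    if a < b:
        return 1  # a borrow occurred; the result is necessarily nonzero
    return 2 if a == b else 3  # CC 0 is never set by a logical subtract


if __name__ == "__main__":
    for a, b in [(5, 5), (3, 5), (5, 3), (0x80000000, 1)]:
        print(f"{a:#010x} - {b:#010x}: SR cc={sr_cc(a, b)} SLR cc={slr_cc(a, b)}")
```

Either way a CC is produced for every input, so a following conditional branch depends on the subtract in both cases.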
For the record, our zHISR instruction benchmark shows SLR to be exactly the same speed as SR, XR, etc. on our z13s (2965).
FWIW, non-grande (32-bit) subtracts appear to be ~9% faster than their grande (64-bit) counterparts. Not sure why...
--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/
