David Bond wrote:
> Anyone who thinks that the S/360 instruction timings have any relevance to
> how machines work today has no understanding of the last several decades of
> processor design. Yes, simple instructions generally execute faster than
> more complex instructions. But even that rule of thumb is overshadowed by
> pipeline stalls caused by register dependencies, Address Generation
> Interlock, address translation, cache effects, branch prediction and other
> things.

Yes, but the S/360 timings are the only ones we have.

> In the specific case of XR vs SR, both are the same speed on probably any
> machine since 1975. But neither can be faster than LHI because XR and SR set
> the condition code and LHI does not. Setting the condition code is a
> separate suboperation.

I used to know how the 360/91 did some of this. I am no longer sure how
it did register renaming, and the condition code complicates it, but I
think that if the condition code is set soon afterward, before any
instruction tests it, the processor can avoid that problem. The 91 could
be much more aggressive in reordering instructions, as it didn't have to
worry about page faults.

> The difference in the length of XR and SR vs LHI does
> not make up for the fact that XR and SR are more complex than LHI.
> Furthermore if i-cache has any measurable effect, then alignment or
> misalignment of blocks of instructions to i-cache boundaries will almost
> always have a bigger effect than individual instruction length.

(big snip)

If someone really wanted to know the timing of a specific instruction in
actual use, one could take a complex benchmark, such as some of the SPEC
programs, and carefully compile it twice so that one build, for example,
uses XR and the other SR to clear registers, then time the difference.
Repeat the runs a few times to make sure the difference is statistically
significant.

With enough different runs of different programs, one could compute the
statistical average execution time for each instruction in actual
programs. Hopefully it would average over cache misses. Pipeline stalls
probably shouldn't average out, as you want the proper statistical cost
in actual use.

-- glen
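
The "time the two builds a few times and check that the difference is significant" step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the run times are made-up placeholder values, `welch_t` is a hypothetical helper name, and Welch's t statistic is one reasonable choice of test, not something the post specifies.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (mean difference, Welch t statistic) for two lists of run times."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    if std_err == 0.0:
        raise ValueError("no variation in either sample; collect more runs")
    diff = mean_a - mean_b
    return diff, diff / std_err


# Hypothetical wall-clock times in seconds for repeated runs of the same
# benchmark, one build clearing registers with XR and the other with SR.
# These numbers are invented for illustration only.
xr_times = [10.02, 10.05, 9.98, 10.01, 10.04]
sr_times = [10.03, 10.00, 10.02, 9.99, 10.06]

diff, t = welch_t(xr_times, sr_times)
print(f"mean difference = {diff:+.4f} s, t = {t:+.2f}")
```

As a rough rule of thumb, a |t| below about 2 means the measured difference cannot be distinguished from run-to-run noise, and more runs (or a longer benchmark) would be needed before drawing any conclusion about XR versus SR.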
