> Can I infer from this that XR/XGR, all else being equal, is to be
> preferred (slightly) over LHI/LGHI?
>
> If so, why might that be? I would have thought the one that
> doesn't touch the CC would be "more efficient" than the one that does.
>
> Or am I misreading your statement?
>
> Instruction fetch bandwidth? Perhaps "don't care about the CC"
> implies that no instruction testing CC might be in the pipeline
> so lockout is not a concern.

Response from engineering:

It all depends on how complicated you want the scheduling algorithm to be. If the CC being set to 0 is not a problem for scheduling the surrounding code, then XR (XGR) will likely be better because of its shorter instruction length.

On CC usage: if it is more optimal to keep a sequence of CC-setting instruction .. clear register .. CC-using instruction, then L(G)HI can be used.

In general, a 4-byte instruction is handled quite well, just like a 2-byte one, except maybe on a line crossing. Where super-optimal grouping is key, as within a hot loop, the 2-byte instruction can potentially allow better grouping, depending on the lengths of the other instructions and their addresses. A 2-byte instruction gives peace of mind for scheduling because it will always allow maximum grouping.

On the other hand, before we had the "fastpath", I believe the source register was treated as a dependency, so in a hypothetical sequence of:

    L     R1,(mem)
    AR    R2,R1
    XR    R1,R1

the XR would have waited for the L. However, with the latest fast path, that is not a concern.

So, the complicated algorithm can be: if there is no CC scheduling conflict, and (pre-zEC12) no immediately earlier setter of the GR, then X(G)R should be used; otherwise, it might be better to use L(G)HI.

Jim Mulder   z/OS System Test   IBM Corp.   Poughkeepsie, NY
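
For anyone who wants the rule of thumb above in a more mechanical form, here is a minimal sketch in Python. It is only an illustration of the decision as described in the engineering response, not anything a real compiler or assembler does. The function name and all parameter names are hypothetical, and the caller is assumed to already know whether there is a CC scheduling conflict and whether the register was set by an immediately preceding instruction.

```python
def choose_zeroing_instruction(cc_conflict: bool,
                               pre_zec12: bool,
                               recent_setter_of_gr: bool,
                               sixty_four_bit: bool = False) -> str:
    """Pick a mnemonic for clearing a GR, following the quoted rule of thumb.

    All inputs are assumptions supplied by the caller (hypothetical names):
      cc_conflict         - surrounding code needs the CC left untouched
      pre_zec12           - target machine predates the zEC12 fast path
      recent_setter_of_gr - an immediately earlier instruction sets the same GR
      sixty_four_bit      - clear all 64 bits (XGR/LGHI) rather than 32 (XR/LHI)
    """
    # Before the fast path, XR Rx,Rx was treated as dependent on the
    # previous setter of Rx, so it could wait behind an earlier load.
    dependency_stall = pre_zec12 and recent_setter_of_gr

    if not cc_conflict and not dependency_stall:
        return "XGR" if sixty_four_bit else "XR"
    # Otherwise the load-immediate form, which leaves the CC unchanged.
    return "LGHI" if sixty_four_bit else "LHI"
```

For the hypothetical L / AR / XR sequence above, this returns "LHI" for a pre-zEC12 target and "XR" once the fast path is available, provided nothing nearby depends on the CC.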
