That's a great explanation, Thomas. I'm curious, though: how come both compilers produce the same sequence of instructions? I'd have thought it was a rather obscure combination. Is it perhaps more common than I'd suspected, or do GCC and Dignus share some common heritage in the back end?
Best wishes / Mejores deseos / Meilleurs vœux

Ian ...

On Wednesday, April 20, 2022, 04:17:32 AM GMT+2, Thomas David Rivers <riv...@dignus.com> wrote:

I thought I'd bring an explanation to what's going on here...

Let's consider the following short C example (just to have something to compile):

    foo()
    {
        unsigned char ovfl;
        int ccpm, carrybit;

        ccpm = bar(); carrybit=bar2();
        ovfl = (ccpm & carrybit) != 0;
        blah(ovfl);
    }

The functions bar(), bar2() and blah() are simply external to this source (compilation unit in C terms) and are there so the optimizer doesn't have a clue about the possible values of the variables.

When I compile this for z/OS (31-bit mode) with the Dignus compiler, I get this code:

    * ***   ccpm = bar(); carrybit=bar2();
             L     15,@lit_153_0 ; bar
    @@gen_label0 DS 0H
             BALR  14,15
    @@gen_label1 DS 0H
             L     1,@lit_153_1 ; bar2
             LR    2,15
             LR    15,1
    @@gen_label2 DS 0H
             BALR  14,15
    @@gen_label3 DS 0H
    * ***
    * ***   ovfl = (ccpm & carrybit) != 0;
             NR    2,15
             LPR   2,2
             LCR   2,2
             SRL   2,31(0)
    * ***
    * ***   blah(ovfl);
             STC   2,80(0,13) ; ovfl

which is similar to what's going on with GCC. (The values happen to be in registers though.)

Now, how does this work?

  1) The two values are AND'd together (this is just a bit-wise/logical AND operation).

  2) The absolute value is taken with the LPR instruction (making the 2's complement sign bit a zero). So we now have either a zero or a non-zero (positive) value (or a special case, which we'll see below).

  3) The 2's complement of that is taken with the LCR instruction. If the value is zero, the result is zero; otherwise the result is a negative value and the sign bit will be set (or another special case, which we'll see below).

  4) The value is shifted right logically by 31 bit positions (SRL), leaving the sign bit in the low-order position, so the final result is either X'00000000' or X'00000001'.

So, what's going on in step #2, and why does that work? Especially if we consider that the result of the AND sets the sign bit?
Note that the only interesting result of the AND is one where the sign bit is set, which presumably is cleared by the LPR. Hence the confusion.

The "secret" is in the operation of the LPR and LCR instructions for the 2's complement maximum negative value (X'80000000'). These notes in the Principles of Operation give a hint:

  LPR: An overflow condition occurs when the maximum negative number is complemented; the number remains unchanged.

  LCR: Zero and the maximum negative number remain unchanged. An overflow condition occurs when the maximum negative number is complemented.

So, as it happens, the LPR of the most negative number (X'80000000') produces X'80000000' as its result (and sets overflow, which is ignored). And the same thing happens for the LCR instruction.

Going through the steps, when the result of the AND is X'80000000', we get these values:

    LPR ==> X'80000000'
    LCR ==> X'80000000'
    SRL ==> X'00000001'

And, for the X'00000000' value we have:

    LPR ==> X'00000000'
    LCR ==> X'00000000'
    SRL ==> X'00000000'

For any other situation where the AND operation produces a negative value (the sign bit is set), you'll have a value which isn't the most negative. Thus some of the lower-order bits (the non-sign bits) will be set. If we have, for example, X'8xxxxxxx' then

    LPR ==> X'0xxxxxxx'
    LCR ==> X'8.......'   (whatever the 2's complement of 0xxxxxxx is)
    SRL ==> X'00000001'

Then we only need to consider the situation where the result of the AND is non-zero but positive. In that case the LPR is innocuous (it does "nothing"), and the LCR and SRL proceed as above, giving X'00000001'.

It's a clever sequence of instructions to produce a 0 or 1 from a zero or non-zero input without a branch.

- Dave Rivers -

--
riv...@dignus.com                        Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com
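
As a cross-check of the walkthrough above, the NR / LPR / LCR / SRL sequence can be modelled in portable C. This is only a sketch under stated assumptions: the helper names lpr(), lcr(), srl() and nonzero_flag() are hypothetical, the register is modelled as a uint32_t so that negating X'80000000' wraps back to X'80000000' (matching the Principles of Operation notes quoted above), and the condition code, including the ignored overflow, is not modelled at all.

```c
#include <stdint.h>

/* Hypothetical model of LPR: two's complement absolute value.
   Unsigned arithmetic wraps, so X'80000000' stays X'80000000'. */
uint32_t lpr(uint32_t r)
{
    return (r & 0x80000000u) ? (uint32_t)(0u - r) : r;
}

/* Hypothetical model of LCR: two's complement negation.
   Zero and X'80000000' both come back unchanged. */
uint32_t lcr(uint32_t r)
{
    return (uint32_t)(0u - r);
}

/* Hypothetical model of SRL: logical right shift, zeros shifted in. */
uint32_t srl(uint32_t r, unsigned n)
{
    return r >> n;
}

/* Branch-free equivalent of (a & b) != 0, following the four steps. */
uint32_t nonzero_flag(uint32_t a, uint32_t b)
{
    uint32_t r = a & b;   /* NR  2,15    */
    r = lpr(r);           /* LPR 2,2     */
    r = lcr(r);           /* LCR 2,2     */
    return srl(r, 31);    /* SRL 2,31(0) */
}
```

For example, nonzero_flag(0x80000000u, 0x80000000u) yields 1 through the "maximum negative number" path, while nonzero_flag(0x0000FFFFu, 0xFFFF0000u) yields 0 because the AND itself is zero.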