On Thu, 6 Jun 2024 at 23:55, Dan Greiner <[email protected]> wrote:
> AFAIK, there is no reason to expect that the execution of a 64-bit
> instruction takes any longer than the execution of an equivalent 32-bit
> instruction. For example, the execution of the 32-bit ADD (AR) instruction
> should be comparable to the execution of the 64-bit ADD (AGR).

A while ago, I was looking at some ancient S/370 cryptographic routines that
had been coded to process data in 16-bit units. I believe most of the unsigned
operations we have now did not exist yet, so working 32 bits at a time was
harder, and the word was that the CPU would be splitting the work into 16-bit
pieces internally anyway (apparently true of the smaller processors of those
days). I remember being pleased to find that even the 32-bit multiplication
was just as fast as the 16-bit one. When I later moved to the grande
instructions, I found MLGR just as fast as MLR. Since doubling the limb width
halves the number of limbs and thus quarters the number of partial products,
doing just 1/4 of the steps really helps in big-integer multiplication.
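
Just to illustrate where the 1/4 comes from, here is a minimal C sketch of
schoolbook big-integer multiplication over 64-bit limbs (not the original
code; the function name is mine, and it assumes the GCC/Clang unsigned
__int128 extension for the 128-bit product). Each inner step is one
64x64->128 unsigned multiply plus accumulation, which is exactly what a
single MLGR gives you; with 32-bit limbs and MLR, the same operands need
four times as many of these steps.

  #include <stdint.h>
  #include <stddef.h>

  /* Schoolbook multiply: r = a * b, the big integers stored as
   * little-endian arrays of 64-bit limbs. r must have na + nb limbs
   * and be zeroed by the caller. */
  static void bigmul(uint64_t *r,
                     const uint64_t *a, size_t na,
                     const uint64_t *b, size_t nb)
  {
      for (size_t i = 0; i < na; i++) {
          uint64_t carry = 0;
          for (size_t j = 0; j < nb; j++) {
              /* One 64x64->128 multiply plus two additions; the sum
               * cannot overflow 128 bits. On z/Architecture the
               * multiply itself is a single MLGR. */
              unsigned __int128 p = (unsigned __int128)a[i] * b[j]
                                  + r[i + j] + carry;
              r[i + j] = (uint64_t)p;          /* low 64 bits  */
              carry    = (uint64_t)(p >> 64);  /* high 64 bits */
          }
          r[i + nb] = carry;
      }
  }

For given operands this is na * nb limb multiplies; splitting the same data
into twice as many 32-bit limbs makes it (2*na) * (2*nb), i.e. four times
the work.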
