Hi Bastiaan,

On Sun, 10 Aug 2025, Bastiaan Braams wrote:

> The third loop in the appended code iterates over the 32-bit integers and at each iteration a simple arithmetic operation is performed (that cannot be optimized away). The fourth loop iterates over the 32-bit reals from -huge(0.0_real32) to huge(0.0_real32) by way of the ieee_next_after function. The timings are reported and again we observe the factor of about 200 difference. Compiling with `gfortran -O5` I get 3.9 seconds for the third loop and 883 seconds for the fourth loop on my Intel i7-1165G7.

Very odd.

I have a bit of C code which uses nextafterf to step through every single REAL*4 (float in C) from -INFINITY to +INFINITY. For each value it evaluates two exponential functions and compares the results: one is a single-precision routine which needs 2 branches, 3 comparisons and just shy of 30 (superscalar) multiplications and additions, and the other is the double/REAL*8 exp() routine from the system library. It takes 46 seconds as a single thread on a Xeon E5-2650v4, which is 40% slower than your CPU (its technology is 4 years older than yours).

Why your code takes over 800 seconds to do a lot less work is beyond me.
Maybe somebody else can shed light on it?  Sorry, I do not know these
routines well enough from the Fortran side to address your problem.
If I have any brilliant ideas, I will let you know.

Regards - Damian
