Hi Dror,

The hardware instruction does not know anything about errno, which sqrtf is 
expected to set to EDOM when the input argument is less than 0.

Thus, in order to get the correct behavior, the compiler needs to check the 
result (the __save_sqrt_temp_1 != __save_sqrt_temp_1 check), and if the result 
is NaN, it calls the library implementation instead, which sets errno. (IIRC, 
floating point compare is different from integer compare: NaN != NaN, so the 
compiler can't fold this if statement away.)
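
To make the NaN != NaN point concrete, here is a minimal sketch. The helper 
names (differs_from_itself, make_nan) are made up for illustration and are not 
anything Open64 or gcc generates; it also assumes the code is not built with 
-ffast-math, which lets the compiler assume NaN never occurs.

```c
/* Returns 1 only for NaN: it is the one floating point value that
 * compares unequal to itself. For any integer n, n != n is always 0,
 * which is why the same test on integers could be folded away. */
static int differs_from_itself(float x)
{
    return x != x;
}

/* Hypothetical helper: builds a NaN at run time. The volatile keeps
 * the compiler from evaluating 0/0 at compile time. */
static float make_nan(void)
{
    volatile float zero = 0.0f;
    return zero / zero;
}
```

So differs_from_itself(make_nan()) is 1, while differs_from_itself(2.0f) is 0.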


Note that you can turn off this behavior (if your code does not care about 
errno) with the -fno-math-errno flag. I don't have open64 installed on my 
current machine, but it seems to work with gcc.

If you want to dig deeper, you can add the statement "dz[29] = -1;" above the 
for loop, run the program under a debugger, and set a breakpoint at sqrtf. You 
will see that at iteration 29, the program hits the breakpoint. If you 
re-compile the code with the -fno-math-errno flag, then the sqrtf call is not 
even generated.
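
If you would rather not use a debugger, the same experiment can be mimicked 
by writing the expanded loop by hand. This is only a sketch under stated 
assumptions: hw_sqrt_standin and lib_sqrt_standin are hypothetical stand-ins 
for the hardware instruction and for the library sqrtf (so the example does 
not depend on libm), and slow_path_calls is a counter I added to show how 
often the fallback runs. It is not what the compiler actually emits.

```c
#include <errno.h>

float dz[100];                 /* same array as in the original question */
static int slow_path_calls;    /* hypothetical counter, illustration only */

/* Hypothetical stand-in for the hardware sqrt: yields NaN for a negative
 * input and never touches errno. */
static float hw_sqrt_standin(float x)
{
    volatile float zero = 0.0f;
    float r;
    int k;

    if (x < 0.0f)
        return zero / zero;          /* NaN, computed at run time */
    if (x == 0.0f)
        return 0.0f;
    r = (x > 1.0f) ? x : 1.0f;       /* start at or above the root */
    for (k = 0; k < 40; k++)         /* enough for the small values used here */
        r = 0.5f * (r + x / r);      /* Newton iteration */
    return r;
}

/* Hypothetical stand-in for the library sqrtf: this is the only place
 * where the domain error is reported through errno. */
static float lib_sqrt_standin(float x)
{
    slow_path_calls++;
    if (x < 0.0f)
        errno = EDOM;
    return hw_sqrt_standin(x);
}

/* Hand-written version of the expanded loop: fast path first, and the
 * slow path only when the fast result is NaN. */
void foo_expanded(void)
{
    int i;
    for (i = 0; i < 100; i++) {
        float arg = dz[i];
        float res = hw_sqrt_standin(arg);
        if (res != res)
            res = lib_sqrt_standin(arg);
        dz[i] = res;
    }
}
```

With every element of dz set to a non-negative value except dz[29] = -1, 
the slow path should run once, at iteration 29, and errno should end up as 
EDOM; with no negative element it should never run.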


Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/




----- Original Message -----
From: Dror Maydan <may...@tensilica.com>
To: open64-devel@lists.sourceforge.net
Cc: 
Sent: Monday, November 14, 2011 8:49 PM
Subject: [Open64-devel] sqrtf

Given code such as

#include <math.h>
float dz[100];
int foo()
{
    int i;
    for (i=0; i<100; i++) {
      dz[i] = sqrtf(dz[i]);
    }
}


Open64 generates (on X86) a sqrt instruction followed by a conditional 
call to the sqrtf library function.

     __sqrt_arg_temp_0 = dz[i];
     __save_sqrt_temp_1 = _F4SQRT(__sqrt_arg_temp_0);
     if(__save_sqrt_temp_1 != __save_sqrt_temp_1)
     {
       __save_sqrt_temp_1 = sqrtf(__sqrt_arg_temp_0);
     }


Does anyone know the motivation for the library call?  Is something 
missing in the X86 hardware instruction, some error value in the library 
that needs to be set, something else?

Seems like a pretty high performance penalty for this behavior.

Dror

PS gcc seems to do the same thing

------------------------------------------------------------------------------
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel

