http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55723



vincenzo Innocente <vincenzo.innocente at cern dot ch> changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SLP vectorization vs loop:  |SLP vectorization vs loop:
                   |SLP more efficient!         |SLP more efficient: loop
                   |                            |vectorization inefficient
                   |                            |in presence of multiple
                   |                            |"blends"



--- Comment #1 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2012-12-17 19:25:37 UTC ---

Moving the second blend before the polynomial makes the two loops
produce almost identical code.
This is not always possible, though.
Is this a bug in the loop optimizer?



template<typename Float>
inline
Float atan(Float t) {
  constexpr float PIO4F = 0.7853981633974483096f;
  constexpr Float zero = {0};
  Float z= (t > 0.4142135623730950f) ? (t-1.0f)/(t+1.0f) : t;
  Float ret = ( t > 0.4142135623730950f ) ? zero+PIO4F : zero;

  Float z2 = z * z;
  ret +=
    ((( 8.05374449538e-2f * z2
    - 1.38776856032E-1f) * z2
      + 1.99777106478E-1f) * z2
     - 3.33329491539E-1f) * z2 * z
    + z;

  return ret;
}
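
For reference, here is a minimal scalar sketch of the two orderings discussed above. It is an illustration, not the test case attached to the bug: the names atan_poly, atan_blend_first, atan_blend_last and the two loop wrappers are hypothetical, and the "blend last" variant is only an assumption about what the code looked like before the second blend was moved. Both variants should return the same values up to rounding; only the position of the second blend relative to the polynomial differs.

```cpp
#include <cmath>
#include <cstddef>

// Polynomial part shared by both variants (coefficients copied from the comment).
inline float atan_poly(float z) {
  float z2 = z * z;
  return (((8.05374449538e-2f * z2
          - 1.38776856032e-1f) * z2
          + 1.99777106478e-1f) * z2
          - 3.33329491539e-1f) * z2 * z
          + z;
}

// Variant A: both blends are evaluated before the polynomial,
// as in the function shown in the comment.
inline float atan_blend_first(float t) {
  constexpr float PIO4F = 0.7853981633974483096f;
  float z   = (t > 0.4142135623730950f) ? (t - 1.0f) / (t + 1.0f) : t;
  float ret = (t > 0.4142135623730950f) ? PIO4F : 0.0f;
  ret += atan_poly(z);
  return ret;
}

// Variant B (assumed, hypothetical): the second blend comes after the polynomial.
inline float atan_blend_last(float t) {
  constexpr float PIO4F = 0.7853981633974483096f;
  float z   = (t > 0.4142135623730950f) ? (t - 1.0f) / (t + 1.0f) : t;
  float ret = atan_poly(z);
  ret += (t > 0.4142135623730950f) ? PIO4F : 0.0f;
  return ret;
}

// Plain loops over arrays: candidates for the loop vectorizer.
void atan_loop_first(const float* in, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = atan_blend_first(in[i]);
}

void atan_loop_last(const float* in, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = atan_blend_last(in[i]);
}
```

Compiling both loops with optimization and comparing the generated assembly (for example with g++ -O3 -S) is one way to check whether the position of the blend changes the vectorized code on a given GCC version.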
