https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83240

            Bug ID: 83240
           Summary: x86_64 vectorized sqrt of denormal yields -inf when
                    DAZ=0
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gson at gson dot org
  Target Milestone: ---

Created attachment 42766
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42766&action=edit
Preprocessed source

When compiling for x86_64 with -O3 -ffast-math, gcc vectorizes square roots of
single-precision floats into SSE code using the rsqrtps instruction followed by
a Newton-Raphson step to calculate four square roots in one go.  This code
returns incorrect results (-infinity rather than zero) for denormal inputs when
the DAZ (Denormals Are Zero) flag is not set.

On Linux, if the -ffast-math option is also given at the link stage, the DAZ
flag is set by crtfastmath.o, and the problem does not occur.  However, if
-ffast-math is used when compiling library code, it is difficult to guarantee
that the link stage, which may be under the control of an entirely different
software project, also uses the -ffast-math option.  Also, some systems do not
currently support crtfastmath.o at all (see http://gnats.netbsd.org/50940). 
Therefore, it seems to me that code built with -ffast-math ought to work
correctly (even if not with optimal performance) whether DAZ is set or not.  Or
conversely, if there is an intentional requirement that DAZ must be set when
executing code compiled with -ffast-math, this requirement ought to be clearly
and prominently documented.
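
For reference, a library or test harness can check at run time whether DAZ is
in effect.  The following is only a sketch: daz_is_set and HAVE_MXCSR are
hypothetical names, and it assumes that DAZ is bit 6 (0x0040) of the MXCSR
register as read by _mm_getcsr() from <xmmintrin.h>.

```c
#if defined(__x86_64__) || defined(__SSE__)
#include <xmmintrin.h>
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

/* Returns 1 if DAZ is set, 0 if it is clear, and -1 if the target has no
 * MXCSR register to inspect.  Hypothetical helper, not part of gcc. */
static int daz_is_set(void)
{
#if HAVE_MXCSR
    return (_mm_getcsr() & 0x0040u) != 0;   /* assumed DAZ bit of MXCSR */
#else
    return -1;
#endif
}
```

A program linked without crtfastmath.o would be expected to see 0 here unless
something else has modified MXCSR.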

The attached test program demonstrates the issue.  Compile and run it as
follows (using a separate compile and link stage to suppress the use of
crtfastmath.o):

 gcc -c -O3 -ffast-math test_denormal.i
 gcc test_denormal.o -lm -o test_denormal
 ./test_denormal

This issue affects a wide range of gcc versions, including 6.3.0 and the
current SVN trunk (r255277).
