sse4_1-blendps.c fails spuriously on i686

vries at gcc dot gnu.org Thu, 22 Sep 2011 04:53:53 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50485


             Bug #: 50485
           Summary: gcc.target/i386/sse4_1-blendps.c fails spuriously on
                    i686
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: vr...@gcc.gnu.org


gcc.target/i386/sse4_1-blendps.c fails spuriously for me on i686. Diego Novillo
mentions something similar in http://gcc.gnu.org/ml/gcc/2011-07/msg00296.html.

The test uses an uninitialized variable, src3. Initializing part of it with a
signaling NaN makes the test fail reliably for me with the same failure mode:
...
Index: sse4_1-blendps.c
===================================================================
--- sse4_1-blendps.c (revision 178880)
+++ sse4_1-blendps.c (working copy)
@@ -64,6 +64,7 @@ TEST (void)
     } src3;
   int i;

+  src3.f[1] = __builtin_nansf ("");
   init_blendps (src1.f, src2.f);

   /* Check blendps imm8, m128, xmm */
...

The test aborts because the assignment 'tmp[1] = src2[1]' changes the bit
pattern of the NaN:
...
static int
check_blendps (__m128 *dst, float *src1, float *src2)
{
  float tmp[4];
  int j;

  memcpy (&tmp[0], src1, sizeof (tmp));
  for (j = 0; j < 4; j++)
    if ((MASK & (1 << j)))
      tmp[j] = src2[j];

  return memcmp (dst, &tmp[0], sizeof (tmp));
}
...

The assignment is compiled to a load and store through the x87 float stack
(flds followed by fstps):
...
#(insn 17 39 42 2 (set (reg:SF 8 st)
#        (mem:SF (plus:SI (reg/v/f:SI 2 cx [orig:65 src2 ] [65])
#                (const_int 4 [0x4])) [3 MEM[(float *)src2_10(D) + 4B]+0 S4 A32])) sse4_1-blendps.c:46 108 {*movsf_internal}
#     (expr_list:REG_DEAD (reg/v/f:SI 2 cx [orig:65 src2 ] [65])
#        (expr_list:REG_EQUIV (mem/s/c:SF (plus:SI (reg/f:SI 20 frame)
#                    (const_int -12 [0xfffffffffffffff4])) [3 tmp+4 S4 A32])
#            (nil))))
    flds    4(%ecx)    # 17    *movsf_internal/1    [length = 3]
...
#(insn 18 15 21 2 (set (mem/s/c:SF (plus:SI (reg/f:SI 7 sp)
#                (const_int 20 [0x14])) [3 tmp+4 S4 A32])
#        (reg:SF 8 st)) sse4_1-blendps.c:46 108 {*movsf_internal}
#     (expr_list:REG_DEAD (reg:SF 8 st)
#        (nil)))
    fstps    20(%esp)    # 18    *movsf_internal/2    [length = 4]
...

By going through the float stack, the signaling NaN turns into a quiet NaN.
That seems to correspond with what is said at
http://stackoverflow.com/questions/2247447/usefulness-of-signaling-nan.
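
To make the conversion concrete, here is a minimal sketch that models the
effect on the bit pattern in integer arithmetic. The helper name
x87_roundtrip_bits is hypothetical and not part of the test. It assumes the
invalid-operation exception is masked (the default), so that the x87 unit
stores a signaling NaN back with the quiet bit set.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the test: model the effect of an
   x87 flds/fstps round trip on a single-precision bit pattern.
   Assumes the invalid-operation exception is masked (the default),
   in which case a signaling NaN is stored back with its quiet bit
   (bit 22, the top bit of the fraction) set.  Non-NaN values and
   quiet NaNs are returned unchanged.  */
static uint32_t
x87_roundtrip_bits (uint32_t bits)
{
  int is_nan = (bits & 0x7f800000u) == 0x7f800000u
	       && (bits & 0x007fffffu) != 0;

  return is_nan ? (bits | 0x00400000u) : bits;
}
```

If __builtin_nansf ("") yields the pattern 0x7fa00000 on this target (an
assumption, I did not check the value in the test), the round trip would
store 0x7fe00000 instead.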

So after the push/pop, tmp[1] contains a quiet NaN, while the corresponding part
of dst still contains the bit representation of the signaling NaN. The memcmp
therefore returns != 0 and the test aborts.
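
One possible way to keep the reference value bit-exact, sketched here under
assumptions, is to select the elements with memcpy rather than a float
assignment, so that no floating-point load or store needs to be involved. The
function name blend_ref_bytes and its mask parameter (standing in for MASK)
are hypothetical. I have not verified the code generated for it on i686, and
this is not necessarily how the test should be fixed.

```c
#include <string.h>

/* Hypothetical variant of the reference computation in check_blendps:
   build the expected result by copying bytes, so the bit pattern of
   each selected element of src2 is preserved even if it is a
   signaling NaN.  'mask' plays the role of MASK in the test.  */
static void
blend_ref_bytes (float *out, const float *src1, const float *src2, int mask)
{
  int j;

  /* Start from src1, as the original code does.  */
  memcpy (out, src1, 4 * sizeof (float));

  /* Overwrite the selected elements with the bytes of src2.  */
  for (j = 0; j < 4; j++)
    if (mask & (1 << j))
      memcpy (&out[j], &src2[j], sizeof (float));
}
```

The result in out could then be compared against dst with memcmp as before.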
