[Bug tree-optimization/48610] New: [4.3 Regression]: loop miscompilation; load removed by -funroll-loops

hp at gcc dot gnu.org Thu, 14 Apr 2011 10:04:33 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48610


           Summary: [4.3 Regression]: loop miscompilation; load removed by
                    -funroll-loops
           Product: gcc
           Version: 4.3.6
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: h...@gcc.gnu.org
            Target: sparc64-*-*


Created attachment 23984
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23984
Repeat with gcc -m32 -funroll-loops -O2 -save-temps -mcpu=ultrasparc  -mvis -c
and link together with the driver in the other attachment

This is a bug present in at least the gcc-4.3 series; I've checked gcc-4.3.5
and the gcc-core-4.3-20110410.tar.bz2 snapshot. It is not present in gcc-4.4.0,
and not in SVN r170836 from trunk (later the 4.6 branch).

I've narrowed the bug down: it *disappears* with SVN commit r139263 (2008-08-20)
to trunk (later the 4.4 branch). Neither the commit nor its ChangeLog entry
mentions any bug (any wrong-code) being fixed, and it was one of a swarm of
improvement commits.  I've been unable to locate the corresponding message to
gcc-patches@.

The change is related to loops, but I can't tell whether it is a correction
or just an unrelated change hiding the bug. Entering this report as a note may
therefore be helpful, seeing as gcc-4.3 was the installed version at the time
of this writing. I have *not* analysed the gcc execution with/without this
revision. It seems likely that this bug will trigger for other ports, and for
code that is not even vector-related.

It is present in "gcc version 4.3.2 (Debian 4.3.2-1.1)" (plain gcc on gcc54)
but not in "gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)" (gcc-4.1
on gcc54), so I think it's correct to call it a regression.

As seen in the attachment note, the bug requires -m32 -funroll-loops to appear.
The "-mcpu=ultrasparc -mvis" options are also necessary for the builtins and
vector code to be valid.  The bug does not appear if the 64-bit ABI is used.
Beware that the installed gcc on gcc54 uses the 32-bit ABI by default, but a
self-built gcc uses the 64-bit ABI by default. The explicit -m32 option
cancels out that difference.

In the attached code, the test-program is spread out over two files.  I don't
think that's necessary, but on the other hand adapting VIS code for use in the
gcc test-suite doesn't seem like a good idea, seeing as it simply doesn't
execute correctly (wrong result) on Ultrasparc T1 (gcc-63), whose kernel,
"2.6.26-2-sparc64-smp (Debian 2.6.26-21lenny4)", should emulate the
instructions according to a web search.

In the test-case, the manually unrolled code (the three expanded X macro
invocations) isn't executed, only the "x < rem" loop, and the wrong result
comes from the second (and last) iteration.  (There were originally four macro
invocations, but the first has been reduced to the crumb at the beginning of
the loop.)
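
For readers without the attachment, the general shape being described (a
manually unrolled body built from a macro, followed by an "x < rem" remainder
loop) might look roughly like the sketch below. This is NOT the attached
test-case: the function name process, the block size of four, and the trivial
body of X are assumptions chosen for illustration, and plain byte arithmetic
stands in for the VIS operations.

```c
#include <stddef.h>

/* Hypothetical per-element step; the real test-case uses VIS builtins here. */
#define X(i) (dst[(i)] = (unsigned char)(src[(i)] + 1))

/* Hypothetical function: handles n elements in manually unrolled blocks of
   four, then finishes the leftover elements in an "x < rem" loop. */
void process(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t blocks = n / 4;  /* number of full, manually unrolled blocks */
    size_t rem = n % 4;     /* leftover elements for the remainder loop */
    size_t x;

    while (blocks--) {
        X(0); X(1); X(2); X(3);  /* the expanded X macro invocations */
        dst += 4;
        src += 4;
    }

    /* With n == 2 only this loop runs, for exactly two iterations. */
    for (x = 0; x < rem; x++)
        X(x);
}
```

Calling process with n == 2 exercises only the remainder loop, which mirrors
the situation in the report, where the wrong result comes from the second
(and last) iteration.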

The seemingly-pointless asms with identical input and output, which fake uses
of other variables, are an attempt to force dependencies on the GSR register.
GSR is a (user-accessible) control register used by two of the builtins (of
which only one, __builtin_vis_faligndatav8qi, remains in the code; the other
insn, fpack16, is now expressed through an asm). Unfortunately, the SPARC VIS
vector port has no support for expressing this dependence, arguably a
(separate) bug or at least an incompleteness in the implementation.  This
matter is only coincidental to this bug, AFAICT.
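
As an illustration of the asm trick mentioned above, a dependency-forcing asm
of this kind could be written along the following lines. This is a minimal
sketch and not the code from the attachment: the macro FAKE_DEP and the
function add_with_fake_dep are hypothetical names, and general "r" register
constraints are assumed, whereas the real test-case operates on
floating-point/vector registers on SPARC.

```c
#include <assert.h>

/* Hypothetical helper: an empty asm whose output is tied to its own input
   (the "0" matching constraint) and which also claims to read `other`.
   No instruction is emitted, but the compiler must assume `val` is
   rewritten here and that `other` is needed at this point. */
#define FAKE_DEP(val, other) \
    __asm__ ("" : "=r" (val) : "0" (val), "r" (other))

/* Hypothetical example: computes a + b with a fake dependency inserted. */
long add_with_fake_dep(long a, long b)
{
    long sum = a + b;
    FAKE_DEP(sum, a);      /* sum is "produced" by the asm, a is "used" */
    assert(sum == a + b);  /* the value itself is unchanged */
    return sum;
}
```

Such an asm constrains how the compiler may order or combine the surrounding
code; it does not model GSR itself, which is why it is only a workaround.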
