Bug ID: 89354
           Summary: Combine pass yields wrong code with -O2 and -msse2 for
                    32bit target
           Product: gcc
           Version: 6.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot
          Reporter: d.sukhonin at gmail dot com
  Target Milestone: ---

Created attachment 45723
Proposed patch for gcc 6.3 including a test

There is a problem with compiling this code (options -O2 -msse2 -m32):

$ gcc-6 -m32 -O2 -msse2 /src/gcc/gcc/testsuite/
&& ./a.out

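#define BIT33 0x100000000ULL

#include <stdint.h>
#include <stdlib.h>   /* abort */

/* Mask covering the low 33 bits of a 64-bit value.  */
static uint64_t const MASK33 = (1ULL << 33) - 1;
static uint64_t qword = 0;

static uint64_t
get_low33 (void)
{
  return qword & MASK33;
}

/* Clear the low 33 bits, then set only the topmost of them.  */
static void
set_bit33 (void)
{
  qword = (qword & ~MASK33) | BIT33;
}

int
main (int argc, char **argv)
{
  set_bit33 ();

  if (get_low33 () != BIT33)
    abort ();

  return 0;
}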

Investigation showed that during the combine pass wrong RTL is generated for
the set_bit33 function: the second operand of the ior is truncated to a dword,
i.e. to 0, so bit 33 is never set.
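The arithmetic behind that truncation can be shown in plain C. This is only a sketch of the effect, not of what combine does internally; the helper names set_bit33_expected and set_bit33_narrowed are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define BIT33 0x100000000ULL

static const uint64_t MASK33 = (1ULL << 33) - 1;

/* What the source asks for: clear the low 33 bits, then OR in BIT33.  */
static uint64_t
set_bit33_expected (uint64_t q)
{
  return (q & ~MASK33) | BIT33;
}

/* Model of the miscompile (an assumption for illustration): the ior
   operand is narrowed to 32 bits first, so BIT33 becomes 0 and the OR
   has no effect.  */
static uint64_t
set_bit33_narrowed (uint64_t q)
{
  uint32_t narrowed = (uint32_t) BIT33;   /* 0x100000000 truncates to 0 */
  return (q & ~MASK33) | narrowed;
}
```

Called with q = 0, the first helper returns 0x100000000 and the second returns 0, which is the mismatch the test case detects with abort ().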

I have attached a fix. It disallows narrowing while matching zero_extract.
Before the fix, the optimizer was dealing with this RTL instruction, in which
zero_extract has SI mode:

Trying 5, 6, 7 -> 8:
Failed to match this instruction:
(set (zero_extract:SI (mem/c:DI (plus:SI (reg:SI 87)
                (const:SI (unspec:SI [
                            (symbol_ref:SI ("qword") [flags 0x2]  <var_decl
0x7f6921b205a0 qword>)
                        ] UNSPEC_GOTOFF))) [3 qword+0 S8 A64])
        (const_int 33 [0x21])
        (const_int 0 [0]))
    (const_int 0 [0]))

With the fix, zero_extract has DI mode because the inner mem expression has
the wider DI mode:

Trying 5, 6, 7 -> 8:
Failed to match this instruction:
(set (zero_extract:DI (mem/c:DI (symbol_ref:SI ("qword") [flags 0x2]  <var_decl
0x7f7c3083c480 qword>) [3 qword+0 S8 A64])
        (const_int 33 [0x21])
        (const_int 0 [0]))
    (const_int 4294967296 [0x100000000]))
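As a reading aid for the two dumps, a set whose destination is (zero_extract X WIDTH POS) can be thought of as a bit-field store into X. The sketch below models that on a 64-bit value. It assumes that bit position 0 is the least significant bit, as I believe is the case on x86; store_bits is a hypothetical helper, not a GCC function.

```c
#include <stdint.h>

/* Hypothetical helper: replace WIDTH bits of WORD, starting at bit POS
   (counted from the least significant bit), with the low WIDTH bits of
   VALUE.  Requires 0 < WIDTH and POS + WIDTH <= 64.  */
static uint64_t
store_bits (uint64_t word, unsigned width, unsigned pos, uint64_t value)
{
  /* Mask with WIDTH low bits set; handle WIDTH == 64 without an
     out-of-range shift.  */
  uint64_t field = width == 64 ? ~0ULL : (1ULL << width) - 1;
  return (word & ~(field << pos)) | ((value & field) << pos);
}
```

Under this model, store_bits (qword, 33, 0, 0x100000000ULL) corresponds to the DI-mode pattern and leaves the 33rd bit set, while store_bits (qword, 33, 0, 0) corresponds to the SI-mode pattern, where the source had already become (const_int 0) and the whole field is cleared.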


* get_best_reg_extraction_insn is called here

* zero_extract is tried out

$ gcc-6 -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1'
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie
--with-system-zlib --disable-browser-plugin --enable-java-awt=gtk
--enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--with-target-system-zlib --enable-objc-gc=auto --enable-multiarch
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

The problem reproduces on all versions >= 6.3.
I have not tested older ones.

