date:20190106

[Bug ipa/87957] [9 Regression] ICE tree check: expected tree that contains ‘decl minimal’ structure, have ‘identifier_node’ in warn_odr, at ipa-devirt.c:1051 since r265519

2019-01-06 Thread ebotcazou at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87957

--- Comment #25 from Eric Botcazou  ---
Or, alternatively, from the top-level build directory where you have copied the
gnat.dg/lto8* files, just: gcc/xgcc -Bgcc -S lto8.adb -flto -I gcc/ada/rts

[Bug rtl-optimization/88331] [9 Regression] ICE in rtl_verify_bb_layout, at cfgrtl.c:2987

2019-01-06 Thread andriusspamtest at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88331

Andrius Burokas  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andriusspamtest at gmail dot com

--- Comment #12 from Andrius Burokas  ---
While trying to compile GCC trunk (commit
cea12873eeeaa7952e315626991b2e162218e134, Thu Dec 27 16:31:50 2018 +) on
Cygwin (CYGWIN_NT-10.0 LT04LT1279HR2 2.11.2(0.329/5/3) 2018-11-08 14:34 x86_64
Cygwin), I got an ICE similar to the one reported by Rainer
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88331#c6).

==
configure parameters before 'make'
==
andbur@LT04LT1279HR2 ~/gcc
$ ./../gcc_source/configure --build=x86_64-pc-cygwin --host=x86_64-pc-cygwin
--target=x86_64-pc-cygwin --without-libiconv-prefix --without-libintl-prefix
--enable-shared --enable-shared-libgcc --enable-static
--enable-version-specific-runtime-libs --enable-bootstrap --enable-__cxa_atexit
--with-dwarf2 --with-tune=generic
--enable-languages=c,c++,fortran,lto,objc,obj-c++ --enable-graphite
--enable-threads=posix --enable-libatomic --enable-libcilkrts --enable-libgomp
--enable-libitm --enable-libquadmath --enable-libquadmath-support
--disable-libssp --enable-libada --disable-symvers --with-gnu-ld --with-gnu-as
--without-libiconv-prefix --without-libintl-prefix --with-system-zlib
--enable-linker-build-id --with-default-libstdcxx-abi=gcc4-compatible
--enable-libstdcxx-filesystem-ts
--with-cloog-include=/usr/local/include/cloog-isl --enable-lto
--enable-checking=release

==
Last few lines where the compilation broke
==
libtool: compile:  /home/andbur/gcc/./gcc/xgcc -B/home/andbur/gcc/./gcc/
-B/usr/local/x86_64-pc-cygwin/bin/ -B/usr/local/x86_64-pc-cygwin/lib/ -isystem
/usr/local/x86_64-pc-cygwin/include -isystem
/usr/local/x86_64-pc-cygwin/sys-include -fno-checking -DHAVE_CONFIG_H -I.
-I../.././../gcc_source/libgfortran -iquote../.././../gcc_source/libgfortran/io
-I../.././../gcc_source/libgfortran/../gcc
-I../.././../gcc_source/libgfortran/../gcc/config
-I../.././../gcc_source/libgfortran/../libquadmath -I../.././gcc
-I../.././../gcc_source/libgfortran/../libgcc -I../libgcc
-I../.././../gcc_source/libgfortran/../libbacktrace -I../libbacktrace
-I../libbacktrace -std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -Wextra -Wwrite-strings
-Werror=implicit-function-declaration -Werror=vla -fcx-fortran-rules
-ffunction-sections -fdata-sections -ffast-math -ftree-vectorize -funroll-loops
--param max-unroll-times=4 -g -O2 -MT matmul_i4.lo -MD -MP -MF
.deps/matmul_i4.Tpo -c ../.././../gcc_source/libgfortran/generated/matmul_i4.c 
-DDLL_EXPORT -DPIC -o .libs/matmul_i4.o
during RTL pass: postreload
../.././../gcc_source/libgfortran/generated/matmul_i4.c: In function
‘matmul_i4_avx2’:
../.././../gcc_source/libgfortran/generated/matmul_i4.c:1210:1: internal
compiler error: in reload_combine_note_use, at postreload.c:1547
 1210 | }
  | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[3]: *** [Makefile:4194: matmul_i4.lo] Error 1
make[3]: Leaving directory '/home/andbur/gcc/x86_64-pc-cygwin/libgfortran'
make[2]: *** [Makefile:1562: all] Error 2
make[2]: Leaving directory '/home/andbur/gcc/x86_64-pc-cygwin/libgfortran'
make[1]: *** [Makefile:18695: all-target-libgfortran] Error 2
make[1]: Leaving directory '/home/andbur/gcc'
make: *** [Makefile:988: all] Error 2

[Bug ipa/87957] [9 Regression] ICE tree check: expected tree that contains ‘decl minimal’ structure, have ‘identifier_node’ in warn_odr, at ipa-devirt.c:1051 since r265519

2019-01-06 Thread ebotcazou at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87957

--- Comment #24 from Eric Botcazou  ---
> I am not sure how to get command line for debugger but I assume that it does
> not simplify type name because type_with_linkage_p returns true which is
> wrong for Ada types (it tests whether it is C++ ODR type).

Run the gnat.dg testsuite and copy-and-paste the command line from the log file
without the -q option in the middle.  You'll get:

/home/eric/build/gcc/native/gcc/xgcc -c
-I/home/eric/svn/gcc/gcc/testsuite/gnat.dg/ -B/home/eric/build/gcc/native/gcc
--RTS=/home/eric/build/gcc/native/x86_64-suse-linux/./libada
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -gnatws -flto -lm -I-
/home/eric/svn/gcc/gcc/testsuite/gnat.dg/lto8.adb
+===GNAT BUG DETECTED==+
| 9.0.0 20190104 (experimental) [trunk revision 267574] (x86_64-suse-linux) GCC error:|
| in fld_incomplete_type_of, at tree.c:5348|

[Bug c++/84436] [8/9 Regression] Missed optimization with switch on enum constants returning the same value

2019-01-06 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84436

--- Comment #11 from Martin Liška  ---
(In reply to Romain Geissler from comment #10)
> Hi,
> 
> FYI, I bisected a regression when building the llvm toolchain to revision
> r265463.
> 
> If you do the following:
>  - build gcc 9 >= r265463 (including recent revisions from late December)
>  - build clang 7 or clang 8 svn with this gcc 9 (it will work)
>  - finally with the resulting clang, build llvm's compiler-rt
> 
> then clang itself will ICE. This was not the case before r265463, and
> considering the nature of the fix (missed optimization) I suspect a gcc bug
> rather than a clang one.
> 
> The exact clang failure is:
> FAILED: CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o 
> /workdir/build/final-system/llvm-build/./bin/clang
> --target=x86_64-1a-linux-gnu -DVISIBILITY_HIDDEN  -O2 -mmmx -msse -msse2
> -msse3
> -I/workdir/build/final-system/llvm-temporary-static-dependencies/install/
> include
> -I/workdir/build/final-system/llvm-temporary-static-dependencies/install/
> include/ncursesw -O3 -DNDEBUG-m64 -std=c11 -fPIC -fno-builtin
> -fvisibility=hidden -fomit-frame-pointer -MD -MT
> CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o -MF
> CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o.d -o
> CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o   -c
> /workdir/src/llvm-8.0.0/compiler-rt/lib/builtins/divtc3.c
> fatal error: error in backend: Cannot select: 0x1a4b558: ch = fsqrt
> 0x199e868, 0x1a46e38, FrameIndex:i64<0>
>   0x1a46e38: f80,ch = CopyFromReg 0x199e868, Register:f80 %0
> 0x1a46d68: f80 = Register %0
>   0x1a4bd10: i64 = FrameIndex<0>
> In function: __divtc3
> clang-8: error: clang frontend command failed with exit code 70 (use -v to
> see invocation)
> clang version 8.0.0 
> Target: x86_64-1a-linux-gnu
> Thread model: posix
> InstalledDir: /workdir/build/final-system/llvm-build/./bin
> clang-8: note: diagnostic msg: PLEASE submit a bug report to
> https://bugs.llvm.org/ and include the crash backtrace, preprocessed source,
> and associated run script.
> clang-8: note: diagnostic msg: 
> 
> 
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang-8: note: diagnostic msg: /tmp/divtc3-106b93.c
> clang-8: note: diagnostic msg: /tmp/divtc3-106b93.sh
> clang-8: note: diagnostic msg: 
> 
> Note: steps 2 and 3 (build clang, then build compiler-rt with the resulting
> clang) are done automatically when bootstrapping a 2-stage PGO clang using
> this cmake configuration:
> https://github.com/llvm-mirror/clang/blob/master/cmake/caches/PGO.cmake
> 
> I don't know how to help more in investigating this regression. If I can do
> something, please ask.
> 
> Cheers,
> Romain

Thank you for the report. Can you please provide the exact steps to build
llvm/clang?

[Bug ipa/88711] [9 Regression] scan-ipa-dump inline "Inlined tp_sum/

2019-01-06 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88711

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org

[Bug c++/88521] gcc 9.0 from r266355 miscompile x265 for mingw-w64 target

2019-01-06 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88521

--- Comment #9 from Martin Liška  ---
Author: marxin
Date: Mon Jan  7 07:31:19 2019
New Revision: 267622

URL: https://gcc.gnu.org/viewcvs?rev=267622&root=gcc&view=rev
Log:
PR target/88521
* config/i386/i386.c (function_value_ms_64): Return small struct in
AX_REG and float/double in FIRST_SSE_REG for 4 or 8 byte modes.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr88521.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog

[Bug target/88717] Unnecessary vzeroupper

2019-01-06 Thread crazylht at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88717

--- Comment #1 from 刘袋鼠  ---
Pass_insert_vzeroupper uses mode switching to insert `vzeroupper`.

At function entry and in the function body, 256-bit/512-bit registers are used,
so the mode is set to `AVX_U128_DIRTY`. At function exit, however, no
256-bit/512-bit register is returned, so `AVX_U128_CLEAN` is set.

`case AVX_U128_CLEAN` is then triggered for mode switching; maybe we should
handle this in ix86_avx_u128_mode_exit.

A simple case shows that the vzeroupper disappears when a 512-bit register is
returned.

```
test.i:

typedef float __v16sf __attribute__ ((__vector_size__ (64)));
typedef float __m512 __attribute__ ((__vector_size__ (64), __may_alias__));

__m512
foo (float *p, __m512 x)
{
  *p = ((__v16sf)x)[0];
  return x;
}


test.s
--
foo:
.LFB0:
.cfi_startproc
vmovss  %xmm0, (%rdi)
ret
.cfi_endproc
.LFE0:
```

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread tomsies at mighty dot co.za

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

DirkInSA  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from DirkInSA  ---
Thanks, was just on the stack overflow thread that you kindly posted.

Will close this bug!

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

--- Comment #5 from Andrew Pinski  ---
https://raghunathlolur.wordpress.com/2014/06/30/combined-tree-build-of-gcc-binutils-and-libraries/

https://stackoverflow.com/questions/1726042/recipe-for-compiling-binutils-gcc-together

etc.

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread tomsies at mighty dot co.za

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

--- Comment #4 from DirkInSA  ---
The symlink was simply the whole binutils source directory linked into the gcc
source directory.

I was not aware that each of bfd, binutils, config, cpu, etc. needed to be
linked (I assume) into the base gcc directory - will have a go at that and see
what happens.

Is the issue here that the variable as is empty, and so exec is trying to exec
--32 ... instead of as --32 ...?

[Bug tree-optimization/88732] different results on -O0 and -O1, -O2, -O3, -Os

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88732

--- Comment #3 from Andrew Pinski  ---
(In reply to Amos Wang from comment #2)
> (In reply to Marc Glisse from comment #1)
> > Why not read the documentation for that function?
> > "If x is 0, the result is undefined."
> 
> Why are the results different at different optimization levels? If it's
> undefined behaviour, I think all results should be the same.

Undefined behavior means that the value can be different at different times,
and therefore also at different optimization levels.
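
For illustration, a minimal sketch of how a caller can avoid the undefined case
that the documentation warns about. The helper name clzl_defined is
hypothetical and not part of GCC, and it assumes the caller wants the bit width
of unsigned long as the result for a zero argument.

```c
#include <limits.h>

/* Hypothetical helper (not a GCC API): count the leading zero bits of x,
   with a defined result for x == 0.  __builtin_clzl itself is undefined
   for 0, so the zero case is handled before the builtin is reached.  */
int clzl_defined(unsigned long x)
{
    if (x == 0)
        return (int)(sizeof(unsigned long) * CHAR_BIT);
    return __builtin_clzl(x);
}
```

With such a guard the builtin is never evaluated with a zero argument, so the
result no longer depends on the optimization level.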

[Bug tree-optimization/88732] different results on -O0 and -O1, -O2, -O3, -Os

2019-01-06 Thread amocywang at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88732

--- Comment #2 from Amos Wang  ---
(In reply to Marc Glisse from comment #1)
> Why not read the documentation for that function?
> "If x is 0, the result is undefined."

Why are the results different at different optimization levels? If it's
undefined behaviour, I think all results should be the same.

[Bug tree-optimization/88732] different results on -O0 and -O1, -O2, -O3, -Os

2019-01-06 Thread glisse at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88732

--- Comment #1 from Marc Glisse  ---
Why not read the documentation for that function?
"If x is 0, the result is undefined."

[Bug tree-optimization/88732] New: different results on -O0 and -O1, -O2, -O3, -Os

2019-01-06 Thread amocywang at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88732

Bug ID: 88732
   Summary: different results on -O0 and -O1, -O2, -O3, -Os
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amocywang at gmail dot com
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/wgc/installs/gcc_trunks/trunk_r267598/bin/gcc
COLLECT_LTO_WRAPPER=/home/wgc/installs/gcc_trunks/trunk_r267598/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure LDFLAGS=-Wl,--no-as-needed
--prefix=/home/wgc/installs/gcc_trunks/trunk_r267598
--with-gmp=/home/wgc/installs/gmp-6.1.2
--with-mpfr=/home/wgc/installs/mpfr-4.0.1
--with-mpc=/home/wgc/installs/mpc-1.1.0 --with-isl= --with-cloog=
--enable-languages=c,c++ --disable-multilib --disable-bootstrap
Thread model: posix
gcc version 9.0.0 20190105 (experimental) (GCC)

$ gcc -O0 program.c
$ ./a.out
63
$ gcc -O1 program.c
$ ./a.out
64 
$ gcc -O2 program.c
$ ./a.out
64
$ gcc -O3 program.c
$ ./a.out
64
$ gcc -Os program.c
$ ./a.out
64

$ cat program.c
static int a, b;
unsigned short int c() {
  b = __builtin_clzl(a);
  return a;
}
int main() {
  c();
  printf("%d\n", b);
  return 0;
}

I have validated this difference with the command-line options
"-fno-strict-aliasing -fwrapv"; the results still differ across optimization
levels, the same as shown above. The test program seems to be correct.

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

--- Comment #3 from Andrew Pinski  ---
Can you attach the full log?  And also attach config.log in the top level
directory?

> Actually binutils-2.29 ... are symlinked into the gcc source tree.

How did you do the symlink here?  Is it a symlink just to the binutils source
directory or symlinks to the files in the binutils source directories?

A combined tree requires a "merged" tree.

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread tomsies at mighty dot co.za

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

--- Comment #2 from tomsies at mighty dot co.za ---
Actually binutils-2.29 (along with gmp-6.1.0, mpc-1.0.3 and mpfr-3.1.4) are
symlinked into the gcc source tree. So they should be built as part of the
compile.

My assumption is that the as script generated during the compile process is
there to pick up the relevant assembler at different stages of the compile, and
so this is a bug, because whatever assembler it is trying to exec will fail
with "as: 106: exec: --32: not found" because the exec string is not quoted?

[Bug lto/66229] LTO fails with -fauto-profile on mcf

2019-01-06 Thread andi at firstfloor dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66229

--- Comment #4 from andi at firstfloor dot org ---
Did some testing. Previously pretty much everything I tried failed.
I don't have mcf, but git, less, gcc LTO+autofdo bootstrap all appear to work
now.

So it's likely fixed.

Would be good if someone could confirm with the actual mcf.

[Bug c/88731] [DR 481] Rejects well-formed program using bit-fields in _Generic.

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88731

--- Comment #1 from Andrew Pinski  ---
It is a TC rather than a DR, which means it is not part of C11 but of the next
revision of the standard.

[Bug tree-optimization/88713] Vectorized code slow vs. flang

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #14 from Chris Elrod  ---
It's not really reproducible across runs:

$ time ./gfortvectests 
 Transpose benchmark completed in   22.7010765
 SIMD benchmark completed in   1.37529969
 All are equal: F
 All are approximately equal: F
 Maximum relative error   6.20566949E-04
 First record X:  0.188879877  0.377619117  -1.67841911E-02
 First record Xt:  0.10071  0.377619147  -1.67841911E-02
 Second record X:  -8.14126506E-02 -0.421755224 -0.199057430
 Second record Xt:  -8.14126655E-02 -0.421755224 -0.199057430

real    0m2.414s
user    0m2.406s
sys     0m0.005s

$ time ./flangvectests 
 Transpose benchmark completed in7.630980
 SIMD benchmark completed in   0.6455200
 All are equal:  F
 All are approximately equal:  F
 Maximum relative error   2.0917827E-04
 First record X:   0.58675421.568364   0.1006735
 First record Xt:   0.58675411.568363   0.1006735
 Second record X:   0.2894785  -0.1510675  -9.3419194E-02
 Second record Xt:   0.2894785  -0.1510675  -9.3419187E-02

real    0m0.839s
user    0m0.832s
sys     0m0.006s

$ time ./gfortvectests 
 Transpose benchmark completed in   22.0195961
 SIMD benchmark completed in   1.36087596
 All are equal: F
 All are approximately equal: F
 Maximum relative error   2.49150675E-04
 First record X: -0.284217566   2.13768221E-02 -0.475293010
 First record Xt: -0.284217596   2.13767942E-02 -0.475293040
 Second record X:   1.75664220E-02  -9.29893106E-02  -4.37139049E-02
 Second record Xt:   1.75664220E-02  -9.29893106E-02  -4.37139049E-02

real    0m2.344s
user    0m2.338s
sys     0m0.003s

$ time ./flangvectests 
 Transpose benchmark completed in7.881181
 SIMD benchmark completed in   0.6132510
 All are equal:  F
 All are approximately equal:  F
 Maximum relative error   2.0917827E-04
 First record X:   0.58675421.568364   0.1006735
 First record Xt:   0.58675411.568363   0.1006735
 Second record X:   0.2894785  -0.1510675  -9.3419194E-02
 Second record Xt:   0.2894785  -0.1510675  -9.3419187E-02

real    0m0.861s
user    0m0.853s
sys     0m0.006s


It also probably wasn't quite right to call it "error", because it compares
the values from the scalar and vectorized versions. Still, it is unsettling if
the differences are large; ideally, there should be an exact match.

Back to Julia, using mpfr (set to 252 bits of precision), and rounding to
single precision for an exactly rounded answer...

X32gfort # calculated from gfortran
X32flang # calculated from flang
Xbf  # mpfr, 252-bit precision ("BigFloat" in Julia)

julia> Xbf32 = Float32.(Xbf) # correctly rounded result

julia> function ULP(x, correct) # calculates ULP error
           x == correct && return 0
           if x < correct
               error = 1
               while nextfloat(x, error) != correct
                   error += 1
               end
           else
               error = 1
               while prevfloat(x, error) != correct
                   error += 1
               end
           end
           error
       end
ULP (generic function with 1 method)

julia> ULP.(X32gfort, Xbf32)'
3×1024 Adjoint{Int64,Array{Int64,2}}:
 7  1  1  8  3  2  1  1  1  27  4  1  4  6  0  0  2  0  2  4  0  7  1  1  3  8 
4  2  2  …  1  0  2  0  0  1  2  3  1  5  1  1  0  0  0  2  3  2  1  2  3  1  0
 1  1  0  2  0  41
 4  2  1  1  6  1  0  1  1   2  2  0  0  3  0  1  0  3  1  1  0  1  1  0  0  3 
1  0  0 0  1  0  1  0  1  0  1  1  4  1  1  0  2  0  1  0  1  0  0  0  1  2
 1  1  1  0  0   1
 1  1  0  1  1  0  0  0  0   1  1  0  0  1  0  1  1  1  0  1  1  0  0  1  0  1 
0  0  0 0  0  1  0  0  0  0  0  1  0  0  1  1  1  0  0  1  0  1  1  0  1  1
 0  0  0  0  0   1

julia> mean(ans)
1.9462890625

julia> ULP.(X32flang, Xbf32)'
3×1024 Adjoint{Int64,Array{Int64,2}}:
 4  1  0  3  0  0  0  1  1  5  2  1  1  6  3  0  1  0  0  1  1  21  0  1  2  8 
2  3  0  0  …  1  1  1  15  2  1  1  5  1  1  1  0  0  0  0  0  2  1  3  1  1 
1  1  1  1  1  0  11
 3  1  1  0  1  0  0  1  0  0  1  0  0  2  1  1  1  6  0  0  0   2  1  0  1  4 
1  1  0  3 1  1  1   1  2  1  1  0  1  1  0  0  1  0  1  0  0  1  0  0  1 
1  1  0  1  0  0   0
 1  0  1  0  0  0  1  1  0  1  0  0  0  1  1  0  0  1  1  0  1   1  0  1  0  1 
0  0  1  0 0  0  1   0  0  0  0  0  0  2  0  0  0  0  0  1  1  1  1  0  1 
0  0  0  0  0  0   1

julia> mean(ans)
1.3388671875


So in that case, gfortran's version had about 1.95 ULP error on average, and
Flang about 1.34 ULP error.
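
The same kind of ULP count can also be sketched in C. The Julia function above
steps with nextfloat/prevfloat; the version below swaps in a different
technique and subtracts the IEEE-754 bit patterns instead. It assumes float is
a 32-bit binary32 type and that both inputs are finite. The names ordered_bits
and ulp_distance are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Map a float onto a signed integer line on which adjacent representable
   floats differ by exactly 1 (assumes 32-bit IEEE-754 binary32).  */
static int64_t ordered_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);              /* read the bit pattern safely */
    if (u & 0x80000000u)                   /* negative: mirror the magnitude */
        return -(int64_t)(u & 0x7fffffffu);
    return (int64_t)u;                     /* non-negative: bits are ordered */
}

/* Number of representable floats between x and the correctly rounded value.
   Both +0.0f and -0.0f map to 0, so they count as equal.  */
int64_t ulp_distance(float x, float correct)
{
    int64_t d = ordered_bits(x) - ordered_bits(correct);
    return d < 0 ? -d : d;
}
```

For finite inputs this should agree with the stepping loop, without the cost
of iterating once per ULP.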

[Bug c/88731] New: Rejects well-formed program using bit-fields in _Generic.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88731

Bug ID: 88731
   Summary: Rejects well-formed program using bit-fields in
_Generic.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anders.granlund.0 at gmail dot com
  Target Milestone: ---

Test case (prog.c):

  int main()
  {
    struct S { unsigned x:4; } s;

    _Generic(s.x, unsigned: 0);
  }

Compilation command line:

  gcc prog.c -Wall -Wextra -std=c11 -pedantic-errors

Observed behaviour:

  The following error message is output:

error: '_Generic' selector of type 'unsigned char:4' is not compatible
   with any association

Expected behaviour:

  No error message is output.

  See http://www.open-std.org/Jtc1/sc22/wg14/www/docs/summary.htm#dr_481:

"... It was noted that bitfields are of integer type. ..."

Note:

  Clang accepts the program without any error message.
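
As a hedged aside, one way to sidestep the question of the bit-field's own type
is to apply unary plus to it, so that the integer promotions run before
_Generic inspects the type. A 4-bit unsigned bit-field then promotes to int,
because int can represent all of its values. The function name promoted_kind is
hypothetical, and this is only a workaround sketch, not what DR 481 prescribes.

```c
struct S { unsigned x : 4; };

/* Returns 1 if the promoted bit-field has type int, 2 if it has type
   unsigned int, and 0 otherwise.  The controlling expression of _Generic
   is not evaluated; only its type is inspected.  */
int promoted_kind(void)
{
    struct S s = { 5 };
    return _Generic(+s.x, int: 1, unsigned: 2, default: 0);
}
```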

[Bug tree-optimization/88713] Vectorized code slow vs. flang

2019-01-06 Thread jvdelisle at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

Jerry DeLisle  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #13 from Jerry DeLisle  ---
I noticed the Maximum Relative error in your benchmarks is significantly larger
in the flang test vs the gfortran test. Is this a factor that matters?

[Bug debug/88730] gcc generates wrong debug information at -Og

2019-01-06 Thread qrzhang at gatech dot edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88730

--- Comment #1 from Qirun Zhang  ---
It appears to be a regression in 8.X.

[Bug debug/88730] New: gcc generates wrong debug information at -Og

2019-01-06 Thread qrzhang at gatech dot edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88730

Bug ID: 88730
   Summary: gcc generates wrong debug information at -Og
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: qrzhang at gatech dot edu
  Target Milestone: ---

Unlike PR88686, it happens at -Og.

The correct value of j should be 5. At "-Og", gdb prints "j = 2" incorrectly.


$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/home/absozero/trunk/root-gcc/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/home/absozero/trunk/root-gcc
--enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 9.0.0 20190106 (experimental) [trunk revision 267609] (GCC)




$ gcc-trunk -g abc.c outer.c
$ gdb-trunk -x cmds -batch a.out
Breakpoint 1 at 0x400507: file abc.c, line 11.
0

Breakpoint 1, main () at abc.c:11
11        optimize_me_not();
$1 = 5



$ gcc-trunk -g -Og abc.c outer.c
$ gdb-trunk -x cmds -batch a.out
Breakpoint 1 at 0x4004e4: file abc.c, line 11.
0

Breakpoint 1, main () at abc.c:11
11        optimize_me_not();
$1 = 2





=== files to reproduce 
$ cat abc.c
int a;
int main() {
  int b, j;
  b = 0;
  for (; b < 1; b++) {
    j = 0;
    for (; j < 5; j++)
      ;
  }
  printf("%X\n", a);
  optimize_me_not();
}

$ cat outer.c
optimize_me_not(){}

$ cat cmds
b 11
r
p j
kill
q

[Bug tree-optimization/88713] Vectorized code slow vs. flang

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #12 from Chris Elrod  ---
Created attachment 45363
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45363&action=edit
Fortran program for running benchmarks.

Okay, thank you.

I attached a Fortran program you can run to benchmark the code.
It randomly generates valid inputs, and then times running the code 10^5 times.
Finally, it reports the average time in microseconds.

The SIMD times are the vectorized version, and the transposed times are the
non-vectorized versions. In both cases, Flang produces much faster code.

The results seem in line with what I got benchmarking shared libraries from
Julia.
I linked rt for access to the high resolution clock.


$ gfortran -Ofast -lrt -march=native -mprefer-vector-width=512
vectorization_tests.F90 -o gfortvectests

$ time ./gfortvectests 
 Transpose benchmark completed in   22.7799759
 SIMD benchmark completed in   1.34003162
 All are equal: F
 All are approximately equal: F
 Maximum relative error   8.27204276E-05
 First record X:   1.02466011 -0.689792156 -0.404027045
 First record Xt:   1.02465975 -0.689791918 -0.404026985
 Second record X: -0.546353579   3.37308086E-03   1.15257287
 Second record Xt: -0.546353400   3.37312138E-03   1.15257275

real    0m2.418s
user    0m2.412s
sys     0m0.003s

$ flang -Ofast -lrt -march=native -mprefer-vector-width=512
vectorization_tests.F90 -o flangvectests

$ time ./flangvectests 
 Transpose benchmark completed in7.232568
 SIMD benchmark completed in   0.6596010
 All are equal:  F
 All are approximately equal:  F
 Maximum relative error   2.0917827E-04
 First record X:   0.58675421.568364   0.1006735
 First record Xt:   0.58675411.568363   0.1006735
 Second record X:   0.2894785  -0.1510675  -9.3419194E-02
 Second record Xt:   0.2894785  -0.1510675  -9.3419187E-02

real    0m0.801s
user    0m0.794s
sys     0m0.005s

[Bug c++/84436] [8/9 Regression] Missed optimization with switch on enum constants returning the same value

2019-01-06 Thread romain.geissler at amadeus dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84436

Romain Geissler  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |romain.geissler at amadeus dot com

--- Comment #10 from Romain Geissler  ---
Hi,

FYI, I bisected a regression when building the llvm toolchain to revision
r265463.

If you do the following:
 - build gcc 9 >= r265463 (including recent revisions from late December)
 - build clang 7 or clang 8 svn with this gcc 9 (it will work)
 - finally with the resulting clang, build llvm's compiler-rt

then clang itself will ICE. This was not the case before r265463, and
considering the nature of the fix (missed optimization) I suspect a gcc bug
rather than a clang one.

The exact clang failure is:
FAILED: CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o 
/workdir/build/final-system/llvm-build/./bin/clang --target=x86_64-1a-linux-gnu
-DVISIBILITY_HIDDEN  -O2 -mmmx -msse -msse2 -msse3
-I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include
-I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include/ncursesw
-O3 -DNDEBUG-m64 -std=c11 -fPIC -fno-builtin -fvisibility=hidden
-fomit-frame-pointer -MD -MT CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o
-MF CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o.d -o
CMakeFiles/clang_rt.builtins-x86_64.dir/divtc3.c.o   -c
/workdir/src/llvm-8.0.0/compiler-rt/lib/builtins/divtc3.c
fatal error: error in backend: Cannot select: 0x1a4b558: ch = fsqrt 0x199e868,
0x1a46e38, FrameIndex:i64<0>
  0x1a46e38: f80,ch = CopyFromReg 0x199e868, Register:f80 %0
0x1a46d68: f80 = Register %0
  0x1a4bd10: i64 = FrameIndex<0>
In function: __divtc3
clang-8: error: clang frontend command failed with exit code 70 (use -v to see
invocation)
clang version 8.0.0 
Target: x86_64-1a-linux-gnu
Thread model: posix
InstalledDir: /workdir/build/final-system/llvm-build/./bin
clang-8: note: diagnostic msg: PLEASE submit a bug report to
https://bugs.llvm.org/ and include the crash backtrace, preprocessed source,
and associated run script.
clang-8: note: diagnostic msg: 


PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-8: note: diagnostic msg: /tmp/divtc3-106b93.c
clang-8: note: diagnostic msg: /tmp/divtc3-106b93.sh
clang-8: note: diagnostic msg: 

Note: steps 2 and 3 (build clang, then build compiler-rt with the resulting
clang) are done automatically when bootstrapping a 2-stage PGO clang using this
cmake configuration:
https://github.com/llvm-mirror/clang/blob/master/cmake/caches/PGO.cmake

I don't know how to help more in investigating this regression. If I can do
something, please ask.

Cheers,
Romain

[Bug target/88715] Cross Compile on Debian Linux to create a uClibc tool chain GCC fails to compile

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88715

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2019-01-06
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski  ---
Did you build binutils for i686-uClibc-linux?  If not, this is not a bug.

[Bug tree-optimization/88719] [9 Regression] wrong code at -O2, -O3, and -Os on x86_64-linux-gnu

2019-01-06 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88719

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski  ---
This is not a bug but undefined behavior, as you have an aliasing violation
even though you have a union there.

if (**g = 0)
  ;
else if (*e)
  --h;


You store via long int but load via short int.
Since you don't type pun through the union type but rather through the base
types, this is undefined behavior.
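
To make the distinction concrete, here is a minimal sketch with hypothetical
names (it is not the reporter's test case). GCC documents that type punning is
allowed when the memory is accessed through the union type, and copying the
bytes with memcpy is well defined as well. What the comment above describes is
a store through a pointer to long followed by a load through a pointer to
short, which neither function below does.

```c
#include <string.h>

union pun { long l; short s; };

/* Pun through the union object itself: the store and the load both name
   members of u, which GCC documents as allowed under -fstrict-aliasing.  */
short first_short_via_union(long v)
{
    union pun u;
    u.l = v;
    return u.s;
}

/* Portable alternative: copy the first sizeof(short) bytes of the long.  */
short first_short_via_memcpy(long v)
{
    short s;
    memcpy(&s, &v, sizeof s);
    return s;
}
```

Both functions read the same leading bytes of the long, so they should return
the same value on any given target.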

[Bug preprocessor/88728] Boostrap with -Og fails with garbled file libgcov-profiler.i

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88728

--- Comment #2 from Thomas Koenig  ---
Created attachment 45362
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45362&action=edit
config.status

And here is the config.log.

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88729 , which
exposed an (apparently) different problem while trying -Og on a POWER9.

[Bug c/88729] ICE in libiberty during bootstrap with debug info

2019-01-06 Thread koenigni at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88729

Nicolas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |koenigni at gcc dot gnu.org

--- Comment #1 from Nicolas Koenig  ---
Forgot to add: I tested it with the current trunk (r267612).

[Bug preprocessor/88728] Boostrap with -Og fails with garbled file libgcov-profiler.i

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88728

--- Comment #1 from Thomas Koenig  ---
Created attachment 45361
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45361&action=edit
config log

Here's the config.log from the configure.

[Bug preprocessor/88728] New: Boostrap with -Og fails with garbled file libgcov-profiler.i

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88728

Bug ID: 88728
   Summary: Boostrap with -Og fails with garbled file
libgcov-profiler.i
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Created attachment 45360
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45360&action=edit
Generated libgcov-profiler.i file

I tried bootstrapping with -Og, with a strange result.

Setting CFLAGS and CXXFLAGS to "-Og -save-temps" and running
with a recent trunk resulted in

RMAT -DHAVE_CC_TLS  -DUSE_TLS -o _gcov_indirect_call_topn_profiler.o -MT
_gcov_indirect_call_topn_profiler.o -MD -MP -MF
_gcov_indirect_call_topn_profiler.dep -DL_gcov_indirect_call_topn_profiler -c
../../../ggdb3/libgcc/libgcov-profiler.c
../../../ggdb3/libgcc/libgcov-profiler.c:352:1: warning: data definition has no
type or storage class
  352 | #endif
  | ^~~~
../../../ggdb3/libgcc/libgcov-profiler.c:352:1: warning: type defaults to ‘int’
in declaration of ‘alue’ [-Wimplicit-int]
../../../ggdb3/libgcc/libgcov-profiler.c:352:7: error: expected identifier or
‘(’ before numeric constant
  352 | #endif
  |   ^
../../../ggdb3/libgcc/libgcov-profiler.c:353:23: error: expected declaration
specifiers or ‘...’ before ‘&’ token
  353 | 
  |   ^
../../../ggdb3/libgcc/libgcov-profiler.c:353:37: error: expected declaration
specifiers or ‘...’ before numeric constant
  353 | 
  | ^
../../../ggdb3/libgcc/libgcov-profiler.c:353:40: error: expected declaration
specifiers or ‘...’ before numeric constant
  353 | 
  |^
../../../ggdb3/libgcc/libgcov-profiler.c:354:1: error: expected identifier or
‘(’ before ‘}’ token

The generated file libgcov-profiler.i (attached) is quite garbled:

# 159 "../../../ggdb3/libgcc/libgcov-profiler.c"
void
__gcov_one_value_profiler_atomic (gcov_type *counters, gcov_type value)
{
  __gcov_one_value_profiler_body (counters, value, 1);
}
omic)
__atomic_fetch_add ([2], 1, 0);
  else
counters[2]++;
}
= value;

[Bug c/88729] New: ICE in libiberty during bootstrap with debug info

2019-01-06 Thread koenigni at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88729

Bug ID: 88729
   Summary: ICE in libiberty during bootstrap with debug info
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: koenigni at gcc dot gnu.org
  Target Milestone: ---

While trying to compile gcc with maximum debug info it ICE'd in
libiberty/d-demangle.c in expand_gimple_statement().

The error happens only on POWER (powerpc64le-unknown-linux-gnu); on x86-64 it
dies in a different way a bit later (Thomas will file the bug report for that).

Below is the way the compiler was configured and the command that caused the
ICE. 

$ CFLAGS="-Og -ggdb3" CXXFLAGS="-Og -ggdb3" ../trunk/configure
--snip--
$ make
--snip--
  gcc -c -DHAVE_CONFIG_H -Og -ggdb3 -save-temps  -I.
-I../../../trunk/libiberty/../include  -W -Wall -Wwrite-strings -Wc++-compat
-Wstrict-prototypes -Wshadow=local -pedantic  -D_GNU_SOURCE  
../../../trunk/libiberty/d-demangle.c -o noasan/d-demangle.o; \
else true; fi
gcc -c -DHAVE_CONFIG_H -Og -ggdb3 -save-temps  -I.
-I../../../trunk/libiberty/../include  -W -Wall -Wwrite-strings -Wc++-compat
-Wstrict-prototypes -Wshadow=local -pedantic  -D_GNU_SOURCE
../../../trunk/libiberty/d-demangle.c -o d-demangle.o
../../../trunk/libiberty/d-demangle.c: In function ‘dlang_value’:
../../../trunk/libiberty/d-demangle.c:1278:14: warning: this statement may fall
through [-Wimplicit-fallthrough=]
1278 |   mangled++;
 |   ~~~^~
../../../trunk/libiberty/d-demangle.c:1284:5: note: here
1284 | case '0': case '1': case '2': case '3': case '4':
 | ^~~~
../../../trunk/libiberty/d-demangle.c: In function ‘dlang_type’:
../../../trunk/libiberty/d-demangle.c:619:10: warning: this statement may fall
through [-Wimplicit-fallthrough=]
619 |   if (!dlang_call_convention_p (mangled))
|  ^
../../../trunk/libiberty/d-demangle.c:626:5: note: here
626 | case 'F': /* function T (D) */
| ^~~~
during RTL pass: expand
../../../trunk/libiberty/d-demangle.c: In function ‘dlang_identifier’:
../../../trunk/libiberty/d-demangle.c:850:8: internal compiler error: in
set_value_range, at tree-vrp.c:289
850 |if (strncmp (mangled, "__ctor", len) == 0)
|^~~~
0x10d620b3 set_value_range(value_range*, value_range_type, tree_node*,
tree_node*, bitmap_head*)
../../trunk-caf/gcc/tree-vrp.c:289
0x10d6c2d3 extract_range_from_binary_expr_1(value_range*, tree_code,
tree_node*, value_range*, value_range*)
../../trunk-caf/gcc/tree-vrp.c:1604
0x10d6e76b determine_value_range_1
../../trunk-caf/gcc/tree-vrp.c:6865
0x10d6eb43 determine_value_range(tree_node*,
generic_wide_int*, generic_wide_int*)
../../trunk-caf/gcc/tree-vrp.c:6900
0x10393243 get_size_range(tree_node*, tree_node**, bool)
../../trunk-caf/gcc/calls.c:1258
0x1039a00b maybe_warn_nonstring_arg(tree_node*, tree_node*)
../../trunk-caf/gcc/calls.c:1617
0x1039bafb initialize_argument_information
../../trunk-caf/gcc/calls.c:2197
0x1039cfdf expand_call(tree_node*, rtx_def*, int)
../../trunk-caf/gcc/calls.c:3577
0x10375c2f expand_builtin_strncmp
../../trunk-caf/gcc/builtins.c:4793
0x103861cb expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
../../trunk-caf/gcc/builtins.c:7445
0x1054ae3f expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../trunk-caf/gcc/expr.c:10943
0x10564f53 expand_expr
../../trunk-caf/gcc/expr.h:279
0x10564f53 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../trunk-caf/gcc/expr.c:8456
0x1054733b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../trunk-caf/gcc/expr.c:11230
0x1055a9c7 expand_expr
../../trunk-caf/gcc/expr.h:279
0x1055a9c7 store_expr(tree_node*, rtx_def*, int, bool, bool)
../../trunk-caf/gcc/expr.c:5556
0x1055c653 expand_assignment(tree_node*, tree_node*, bool)
../../trunk-caf/gcc/expr.c:5420
0x103b6c6f expand_call_stmt
../../trunk-caf/gcc/cfgexpand.c:2685
0x103b6c6f expand_gimple_stmt_1
../../trunk-caf/gcc/cfgexpand.c:3575
0x103b6c6f expand_gimple_stmt
../../trunk-caf/gcc/cfgexpand.c:3734
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
Makefile:703: recipe for target 'd-demangle.o' failed
make[3]: *** [d-demangle.o] Error 1
make[3]: Leaving directory
'/home/gcc/trunk-bin/build-powerpc64le-unknown-linux-gnu/libiberty'
Makefile:2686: recipe for target 'all-build-libiberty' failed
make[2]: *** [all-build-libiberty] Error 2
make[2]: Leaving directory '/home/gcc/trunk-bin'
Makefile:27153: recipe for target

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 88713, which changed state.

Bug 88713 Summary: Vectorized code slow vs. flang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|WONTFIX |---

[Bug tree-optimization/88713] Vectorized code slow vs. flang

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

Thomas Koenig  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2019-01-06
  Component|fortran |tree-optimization
 Blocks|36854   |53947
 Resolution|WONTFIX |---
Summary|_gfortran_internal_pack@PLT |Vectorized code slow vs.
   |prevents vectorization  |flang
 Ever confirmed|0   |1

--- Comment #11 from Thomas Koenig  ---
OK, so I think it makes sense to reopen this bug as a missed
optimization for the vectorizer (reopen because it would be a shame
to lose all the info you already provided).

It seems like gcc could be much better, also possibly with some
more help from the gfortran front end.  A factor of two is not to
be ignored.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36854
[Bug 36854] [meta-bug] fortran front-end optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug ipa/87957] [9 Regression] ICE tree check: expected tree that contains ‘decl minimal’ structure, have ‘identifier_node’ in warn_odr, at ipa-devirt.c:1051 since r265519

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87957

Jan Hubicka  changed:

   What|Removed |Added

 CC||jason at redhat dot com

--- Comment #23 from Jan Hubicka  ---
Hello,
sorry for the late response. The assert that triggers is there so that we do
not copy one DECL_NAME multiple times, which the code would do when multiple
types point to the same TYPE_NAME being simplified.

Because early debug info is already output, we are supposed to translate all
TYPE_NAMEs to IDENTIFIER_NODE, with the exception of main variants of C++ ODR
types, where TYPE_NAME is preserved and we compute its DECL_ASSEMBLER_NAME so
we can identify the same types cross-module later.

The name is supposed to be simplified by fld_simplified_type_name.
I am not sure how to get the command line for the debugger, but I assume that
it does not simplify the type name because type_with_linkage_p returns true,
which is wrong for Ada types (it tests whether it is a C++ ODR type).
type_with_linkage_p basically assumes that only C++ types have DECL_NAME of
TYPE_DECL, which is not true for Ada, so we need to find a way to tell these
types apart.

[Bug c++/86747] [8 Regression] rejects-valid with redundant friend declaration

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86747

--- Comment #7 from Jonathan Wakely  ---
This patch also fixed PR 87651 and PR 87652 which are also regressions on
gcc-8-branch.

[Bug c++/87651] [8 Regression] inner class with template template friend declaration of same name fails to compile in gcc 8.1, 8.2, and 9.0

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87651

Jonathan Wakely  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
  Known to work||9.0
 Depends on||86747
 Resolution|FIXED   |---
Summary|[8/9 Regression] inner  |[8 Regression] inner class
   |class with template |with template template
   |template friend declaration |friend declaration of same
   |of same name fails to   |name fails to compile in
   |compile in gcc 8.1, 8.2,|gcc 8.1, 8.2, and 9.0
   |and 9.0 |

--- Comment #3 from Jonathan Wakely  ---
But not fixed on gcc-8-branch, so it should be kept open until the regression
is fixed.

Started to compile with r266875, which fixed:

[PR86747] tsubst friend tpl ctxt before looking it up for dupes

Possibly a dup of that one. It might get fixed when that is backported to
gcc-8-branch.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86747
[Bug 86747] [8 Regression] rejects-valid with redundant friend declaration

[Bug c/88716] Improved diagnostics: No detection of conflicting function definitions in some cases.

2019-01-06 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88716

Martin Sebor  changed:

   What|Removed |Added

 CC||msebor at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=86418,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=83656,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=82922

--- Comment #1 from Martin Sebor  ---
See also pr82922 and pr86418.

[Bug c++/87652] [8/9 Regression] inner class template of outer class template can't access friend's protected data member

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87652

Jonathan Wakely  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Depends on||86747
 Resolution|FIXED   |---

--- Comment #5 from Jonathan Wakely  ---
But not fixed on gcc-8-branch, so it should be kept open until the regression
is fixed.

Started to compile with r266875, which fixed:

[PR86747] tsubst friend tpl ctxt before looking it up for dupes

Possibly a dup of that one. It might get fixed when that is backported to
gcc-8-branch.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86747
[Bug 86747] [8 Regression] rejects-valid with redundant friend declaration

[Bug c++/79624] comma separate auto variables deduce different types under dependent lookup

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79624

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-06
 Ever confirmed|0   |1

[Bug c++/65799] Allows constexpr conversion from cv void * to other type

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65799

Jonathan Wakely  changed:

   What|Removed |Added

Version|8.2.0   |4.9.2
   Target Milestone|--- |7.0

--- Comment #3 from Jonathan Wakely  ---
Fixed for GCC 7.1 by r238909, which fixed:

PR c++/60760 - arithmetic on null pointers should not be allowed in
constant
PR c++/71091 - constexpr reference bound to a null pointer dereference

[Bug libstdc++/86756] Don't define __cpp_lib_filesystem unless --enable-libstdcxx-filesystem-ts

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86756

Jonathan Wakely  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jonathan Wakely  ---
The C++17 std::filesystem library is now supported unconditionally, so it's
correct to define __cpp_lib_filesystem unconditionally. So this is fixed.
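
As a rough illustration (a sketch, not part of the patch): with a libstdc++
that contains this change, code like the following should link without
-lstdc++fs, while older releases still need that library. The helper names
stem_of and extension_of are made up for this example; they use only path
decomposition, which needs no OS support.

```cpp
#include <filesystem>
#include <string>

#ifndef __cpp_lib_filesystem
#error "this standard library does not advertise std::filesystem"
#endif

// Hypothetical helper: file name without directory and extension.
std::string stem_of(const std::string& file) {
    return std::filesystem::path(file).stem().string();
}

// Hypothetical helper: extension of the file name, including the dot.
std::string extension_of(const std::string& file) {
    return std::filesystem::path(file).extension().string();
}
```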

[Bug libstdc++/86756] Don't define __cpp_lib_filesystem unless --enable-libstdcxx-filesystem-ts

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86756

--- Comment #3 from Jonathan Wakely  ---
Author: redi
Date: Sun Jan  6 22:34:29 2019
New Revision: 267615

URL: https://gcc.gnu.org/viewcvs?rev=267615&root=gcc&view=rev
Log:
PR libstdc++/86756 add std::filesystem::path to libstdc++.so

Move the C++17 std::filesystem::path definitions from the libstdc++fs.a
archive to the main libstdc++ library. The path classes do not depend on
any OS functions, so can be defined unconditionally on all targets
(rather than depending on --enable-libstdcxx-filesystem-ts). The tests
should pass on all targets too.

PR libstdc++/86756
* config/abi/pre/gnu.ver (GLIBCXX_3.4): Make various patterns for
typeinfo and vtables less greedy.
(GLIBCXX_3.4.26): Export symbols for std::filesystem::path.
* src/c++17/Makefile.am: Add fs_path.cc and cow-fs_path.cc.
* src/c++17/Makefile.in: Regenerate.
* src/c++17/cow-fs_path.cc: Move src/filesystem/cow-std-path.cc to
here, and change name of included file.
* src/c++17/fs_path.cc: Move src/filesystem/std-path.cc to here.
* src/filesystem/Makefile.am: Remove std-path.cc and cow-std-path.cc
from sources.
* src/filesystem/Makefile.in: Regenerate.
* src/filesystem/cow-std-path.cc: Move to src/c++17/cow-fs_path.cc.
* src/filesystem/std-path.cc: Move to src/c++17/fs_path.cc.
* testsuite/27_io/filesystem/path/append/path.cc: Remove -lstdc++fs
from dg-options and remove dg-require-filesystem-ts.
* testsuite/27_io/filesystem/path/append/source.cc: Likewise.
* testsuite/27_io/filesystem/path/assign/assign.cc: Likewise.
* testsuite/27_io/filesystem/path/assign/copy.cc: Likewise.
* testsuite/27_io/filesystem/path/compare/compare.cc: Likewise.
* testsuite/27_io/filesystem/path/compare/lwg2936.cc: Likewise.
* testsuite/27_io/filesystem/path/compare/path.cc: Likewise.
* testsuite/27_io/filesystem/path/compare/strings.cc: Likewise.
* testsuite/27_io/filesystem/path/concat/path.cc: Likewise.
* testsuite/27_io/filesystem/path/concat/strings.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/80762.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/copy.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/default.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/format.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/locale.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/range.cc: Likewise.
* testsuite/27_io/filesystem/path/construct/string_view.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/extension.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/filename.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/parent_path.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/relative_path.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/root_directory.cc:
Likewise.
* testsuite/27_io/filesystem/path/decompose/root_name.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/root_path.cc: Likewise.
* testsuite/27_io/filesystem/path/decompose/stem.cc: Likewise.
* testsuite/27_io/filesystem/path/generation/normal.cc: Likewise.
* testsuite/27_io/filesystem/path/generation/normal2.cc: Likewise.
* testsuite/27_io/filesystem/path/generation/proximate.cc: Likewise.
* testsuite/27_io/filesystem/path/generation/relative.cc: Likewise.
* testsuite/27_io/filesystem/path/generic/generic_string.cc: Likewise.
* testsuite/27_io/filesystem/path/itr/components.cc: Likewise.
* testsuite/27_io/filesystem/path/itr/traversal.cc: Likewise.
* testsuite/27_io/filesystem/path/modifiers/clear.cc: Likewise.
* testsuite/27_io/filesystem/path/modifiers/make_preferred.cc:
Likewise.
* testsuite/27_io/filesystem/path/modifiers/remove_filename.cc:
Likewise.
* testsuite/27_io/filesystem/path/modifiers/replace_extension.cc:
Likewise.
* testsuite/27_io/filesystem/path/modifiers/replace_filename.cc:
Likewise.
* testsuite/27_io/filesystem/path/modifiers/swap.cc: Likewise.
* testsuite/27_io/filesystem/path/native/string.cc: Likewise.
* testsuite/27_io/filesystem/path/nonmember/append.cc: Likewise.
* testsuite/27_io/filesystem/path/nonmember/hash_value.cc: Likewise.
* testsuite/27_io/filesystem/path/query/empty.cc: Likewise.
* testsuite/27_io/filesystem/path/query/has_extension.cc: Likewise.
* testsuite/27_io/filesystem/path/query/has_filename.cc: Likewise.
* testsuite/27_io/filesystem/path/query/has_parent_path.cc: Likewise.
* testsuite/27_io/filesystem/path/query/has_relative_path.cc: Likewise.
*

[Bug libstdc++/86756] Don't define __cpp_lib_filesystem unless --enable-libstdcxx-filesystem-ts

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86756

--- Comment #4 from Jonathan Wakely  ---
Author: redi
Date: Sun Jan  6 22:34:37 2019
New Revision: 267616

URL: https://gcc.gnu.org/viewcvs?rev=267616&root=gcc&view=rev
Log:
PR libstdc++/86756 Move rest of std::filesystem to libstdc++.so

Move std::filesystem directory iterators and operations from
libstdc++fs.a to main libstdc++ library. These components have many
dependencies on OS support, which is not available on all targets. Some
additional autoconf checks and conditional compilation is needed to
ensure the files will build for all targets. Previously this code was
not compiled without --enable-libstdcxx-filesystem-ts but the C++17
components should be available for all hosted builds.

The tests for these components no longer need to link to libstdc++fs.a,
but are not expected to pass on all targets. To avoid numerous failures
on targets which are not expected to pass the tests (due to missing OS
functionality) leave the dg-require-filesystem-ts directives in place
for now. This will ensure the tests only run for builds where the
filesystem-ts library is built, which presumably means some level of OS
support is present.

PR libstdc++/86756
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for utime and
lstat and define _GLIBCXX_USE_UTIME and _GLIBCXX_USE_LSTAT.
* config.h.in: Regenerate.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.26): Export symbols for
remaining std::filesystem types and functions.
* configure: Regenerate.
* src/c++17/Makefile.am: Add C++17 filesystem sources.
* src/c++17/Makefile.in: Regenerate.
* src/c++17/cow-fs_dir.cc: Move src/filesystem/cow-std-dir.cc to
here, and change name of included file.
* src/c++17/cow-fs_ops.cc: Move src/filesystem/cow-std-ops.cc to
here, and change name of included file.
* src/c++17/fs_dir.cc: Move src/filesystem/std-dir.cc to here. Change
path to dir-common.h.
* src/c++17/fs_ops.cc: Move src/filesystem/std-ops.cc to here. Change
path to ops-common.h. Disable -Wunused-parameter warnings.
(internal_file_clock): Define unconditionally.
[!_GLIBCXX_HAVE_SYS_STAT_H] (internal_file_clock::from_stat): Do not
define.
(do_copy_file, do_space): Move definitions to ops.common.h.
(copy, file_size, hard_link_count, last_write_time, space): Only
perform operation when _GLIBCXX_HAVE_SYS_STAT_H is defined, otherwise
report an error.
(last_write_time, read_symlink): Remove unused attributes from
parameters.
* src/filesystem/Makefile.am: Remove C++17 filesystem sources.
* src/filesystem/Makefile.in: Regenerate.
* src/filesystem/cow-std-dir.cc: Move to src/c++17/cow-fs_dir.cc.
* src/filesystem/cow-std-ops.cc: Move to src/c++17/cow-fs_ops.cc.
* src/filesystem/std-dir.cc: Move to src/c++17/fs_dir.cc.
* src/filesystem/std-ops.cc: Move to src/c++17/fs_ops.cc.
* src/filesystem/dir-common.h [!_GLIBCXX_HAVE_DIRENT_H]: Define
dummy types and functions instead of using #error.
* src/filesystem/dir.cc [!_GLIBCXX_HAVE_DIRENT_H]: Use #error.
* src/filesystem/ops-common.h [!_GLIBCXX_USE_LSTAT] (lstat): Define
in terms of stat.
[!_GLIBCXX_HAVE_UNISTD_H]: Define dummy types and functions.
(do_copy_file, do_space): Move definitions here from std-ops.cc.
* src/filesystem/ops.cc: Adjust calls to do_copy_file and do_space
to account for new namespace.
* testsuite/27_io/filesystem/directory_entry/86597.cc: Remove
-lstdc++fs from dg-options.
* testsuite/27_io/filesystem/directory_entry/lwg3171.cc: Likewise.
* testsuite/27_io/filesystem/file_status/1.cc: Likewise.
* testsuite/27_io/filesystem/filesystem_error/cons.cc: Likewise.
* testsuite/27_io/filesystem/filesystem_error/copy.cc: Likewise.
* testsuite/27_io/filesystem/iterators/directory_iterator.cc:
Likewise.
* testsuite/27_io/filesystem/iterators/pop.cc: Likewise.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/27_io/filesystem/operations/absolute.cc: Likewise.
* testsuite/27_io/filesystem/operations/canonical.cc: Likewise.
* testsuite/27_io/filesystem/operations/copy.cc: Likewise.
* testsuite/27_io/filesystem/operations/copy_file.cc: Likewise.
* testsuite/27_io/filesystem/operations/create_directories.cc:
Likewise.
* testsuite/27_io/filesystem/operations/create_directory.cc: Likewise.
* testsuite/27_io/filesystem/operations/create_symlink.cc: Likewise.
* testsuite/27_io/filesystem/operations/current_path.cc: Likewise.
* testsuite/27_io/filesystem/operations/equivalent.cc: Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
*

[Bug c++/68423] override/final doesn't cause error in templated class without base

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68423

Ryan R Haining  changed:

   What|Removed |Added

 Resolution|WONTFIX |INVALID

--- Comment #3 from Ryan R Haining  ---
Seems like WONTFIX by the last comment

[Bug c++/68423] override/final doesn't cause error in templated class without base

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68423

Ryan R Haining  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #2 from Ryan R Haining  ---
Seems like WONTFIX by the last comment

[Bug c++/79624] comma separate auto variables deduce different types under dependent lookup

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79624

--- Comment #4 from Ryan R Haining  ---
still a problem on head: https://wandbox.org/permlink/tPYbCp1jWYmc9O9J

[Bug c++/65799] Allows constexpr conversion from cv void * to other type

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65799

Ryan R Haining  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Ryan R Haining  ---
code is correctly failing on head:
https://wandbox.org/permlink/6vj1agxCRZaTYs7b

[Bug c++/87652] [8/9 Regression] inner class template of outer class template can't access friend's protected data member

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87652

Ryan R Haining  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Ryan R Haining  ---
working on head: https://wandbox.org/permlink/4dCLIfaIPWij81Vi

[Bug c++/87651] [8/9 Regression] inner class with template template friend declaration of same name fails to compile in gcc 8.1, 8.2, and 9.0

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87651

Ryan R Haining  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
  Known to fail|9.0 |

--- Comment #2 from Ryan R Haining  ---
It looks to be working on head now:
https://wandbox.org/permlink/FTjpi30scFEAgGyO

[Bug c/88727] New: Diagnostics improvement: Detection of undefined behaviour. Incomplete type in tenative definition with internal linkage.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88727

Bug ID: 88727
   Summary: Diagnostics improvement: Detection of undefined
behaviour. Incomplete type in tenative definition with
internal linkage.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anders.granlund.0 at gmail dot com
  Target Milestone: ---

Test case (prog.c):

  static struct S s;

  int main()
  {
  }

  struct S { int x; };

Compilation command line:

  gcc prog.c -Wall -Wextra -std=c11 -pedantic-errors 

Observed behaviour:

  No error messages are output.

Possible improvement of behaviour:

  Outputting an error message about using an incomplete type in the tentative
  definition  static struct S s; .

  The program has undefined behaviour because of a violation of 6.9.2/2:

  "If the declaration of an identifier for an object is a tentative definition
   and has internal linkage, the declared type shall not be an incomplete
type."

  GCC detects such undefined behaviour in other cases (for example using the
  incomplete type int []). It would be good if it could also handle the case in
  the test case for this bug report.

Note:

  Clang detects the undefined behaviour for this program and outputs an error
  message.

[Bug libstdc++/87431] valueless_by_exception() should unconditionally return false if all the constructors are noexcept

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87431

Jonathan Wakely  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Jonathan Wakely  ---
Should be fixed now.

[Bug libstdc++/87431] valueless_by_exception() should unconditionally return false if all the constructors are noexcept

2019-01-06 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87431

--- Comment #14 from Jonathan Wakely  ---
Author: redi
Date: Sun Jan  6 20:52:34 2019
New Revision: 267614

URL: https://gcc.gnu.org/viewcvs?rev=267614&root=gcc&view=rev
Log:
PR libstdc++/87431 fix regression introduced by r264574

The previous patch for PR 87431 assumed that initializing a scalar type
could not throw, but it can obtain its value via a conversion operator,
which could throw. This meant the variant could get into a valueless
state, but the valueless_by_exception() member function would always
return false.

This patch fixes it by changing the emplace members to have strong
exception safety when initializing a contained value of trivially
copyable type. The _M_valid() member gets a corresponding change to
always return true for trivially copyable types, not just scalar types.

Strong exception safety (i.e. never becoming valueless) is achieved by
only replacing the current contained value once any potentially throwing
operations have completed. If constructing the new contained value can
throw then a new std::variant object is constructed to hold it, and then
move-assigned to *this (which won't throw).
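
The idea in the paragraph above can be sketched at user level as follows.
This is not the libstdc++ implementation; emplace_strong and ThrowsOnConvert
are hypothetical names, and the sketch assumes the alternative type T occurs
exactly once in the variant and has a non-throwing move constructor.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// A type whose conversion to int always throws, like the case the log
// describes for scalar types.
struct ThrowsOnConvert {
    operator int() const { throw 42; }
};

// Hypothetical helper: build the new value in a temporary variant first,
// then move-assign it into the target.
template <class T, class Variant, class... Args>
void emplace_strong(Variant& target, Args&&... args) {
    // Any exception is thrown here, before target is modified.
    Variant tmp(std::in_place_type<T>, std::forward<Args>(args)...);
    target = std::move(tmp);
    assert(std::holds_alternative<T>(target));
}
```

If the conversion throws, the target keeps its previous alternative and
valueless_by_exception() stays false.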

PR libstdc++/87431
* include/std/variant (_Variant_storage::_M_valid):
Check is_trivially_copyable instead of is_scalar.
(variant::emplace(Args&&...)): If construction of the new
contained value can throw and its type is trivially copyable then
construct into a temporary variant and move from it, to provide the
strong exception safety guarantee.
(variant::emplace(initializer_list, Args&&...)):
Likewise.
* testsuite/20_util/variant/87431.cc: New test.
* testsuite/20_util/variant/run.cc: Adjust test so that throwing
conversion causes valueless state.

Added:
trunk/libstdc++-v3/testsuite/20_util/variant/87431.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/std/variant
trunk/libstdc++-v3/testsuite/20_util/variant/run.cc

[Bug debug/88723] [9 regression] PR debug/88635 patch breaks testsuite_shared.cc compilation

2019-01-06 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88723

--- Comment #3 from Jakub Jelinek  ---
Created attachment 45359
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45359&action=edit
gcc9-pr88723.patch

Actually, UNSPEC_MOVE_GOTDATA always has non-constant arguments; we would never
have considered such UNSPECs in the hook before.  Does the following patch fix
that?

[Bug lto/88130] [9 Regression] ICE in copy_function_or_variable, at lto-streamer-out.c:2315 since r260963

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88130

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Jan Hubicka  ---
Fixed.

[Bug lto/88130] [9 Regression] ICE in copy_function_or_variable, at lto-streamer-out.c:2315 since r260963

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88130

--- Comment #7 from Jan Hubicka  ---
Author: hubicka
Date: Sun Jan  6 20:11:15 2019
New Revision: 267613

URL: https://gcc.gnu.org/viewcvs?rev=267613&root=gcc&view=rev
Log:

Backport from mainline
2019-01-02  Jan Hubicka  

PR lto/88130
* varpool.c (varpool_node::ctor_useable_for_folding_p): Also return
false at WPA time when body was removed.

Added:
branches/gcc-8-branch/gcc/testsuite/g++.dg/torture/pr88130.C
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/varpool.c

[Bug ipa/85103] [8/9 Regression] Performance regressions on SPEC with r257582

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85103

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

--- Comment #18 from Jan Hubicka  ---
I see, I can reproduce it when I remove __inline__ from the function and build
with -O3.  What happens is the following.  mainGtU is a quite large
hand-unrolled implementation of N^2logN sorting:

Bool mainGtU ( UInt32  i1, 
   UInt32  i2,
   UChar*  block, 
   UInt16* quadrant,
   UInt32  nblock,
   Int32*  budget )
{
   Int32  k;
   UChar  c1, c2;
   UInt16 s1, s2;

   AssertD ( i1 != i2, "mainGtU" );
   /* 1 */
   c1 = block[i1]; c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;
   /* 2 */
   c1 = block[i1]; c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;
   /* 3 */
   c1 = block[i1]; c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;
   /* 4 */
   c1 = block[i1]; c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;
...

We decide to split it after the first 5 conditionals for some reason.
The partial function is large:
IPA function summary for mainGtU.part.0/41 inlinable
  global time: 240.00
  self size:   243
  global size: 243
  min size:   0
  self stack:  0
  global stack:0
size:200.00, time:199.00
size:4.00, time:2.00,  executed if:(not inlined)
size:10.00, time:10.00,  nonconst if:(op0 changed)
size:10.00, time:10.00,  nonconst if:(op1 changed)
size:9.00, time:9.00,  nonconst if:(op0 changed || op2 changed)
size:9.00, time:9.00,  nonconst if:(op1 changed || op2 changed)
size:1.00, time:1.00,  nonconst if:(op4 changed)
  calls:

While the outer function is:

IPA function summary for mainGtU/30 inlinable
  global time: 29.125000
  self size:   36
  global size: 36
  min size:   16
  self stack:  0
  global stack:0
size:15.00, time:15.00
size:3.00, time:2.00,  executed if:(not inlined)
size:3.00, time:3.00,  nonconst if:(op0 changed || op2 changed)
size:3.00, time:3.00,  nonconst if:(op2 changed || op1 changed)
size:2.00, time:2.00,  nonconst if:(op0 changed)
size:2.00, time:2.00,  nonconst if:(op1 changed)
  calls:
mainGtU.part.0/41 function not considered for inlining
  loop depth: 0 freq:0.12 size: 8 time: 17 callee size:121 stack: 0

with 200 instructions that we think can't be optimized (I am not sure why we do
not track accesses to individual block indices).
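
As a rough picture of the split described above (a hand-written sketch, not the
compiler's actual output): the first few compare-and-return steps stay in a
small head function, and everything else moves into an outlined tail that plays
the role of mainGtU.part.0.  The names gt_head and gt_tail, the bounds checks,
and the loop that replaces the hand-unrolled body are all assumptions made for
illustration.

```c
#include <stddef.h>

/* Outlined remainder; stands in for mainGtU.part.0 (hypothetical name).
   The original is hand-unrolled; a plain loop keeps the sketch short. */
static int gt_tail(const unsigned char *block, size_t i1, size_t i2, size_t n)
{
    while (i1 < n && i2 < n) {
        if (block[i1] != block[i2])
            return block[i1] > block[i2];
        i1++;
        i2++;
    }
    return 0;
}

/* Small head: cheap to inline because most calls are expected to return
   within the first few comparisons. */
static inline int gt_head(const unsigned char *block, size_t i1, size_t i2,
                          size_t n)
{
    for (int k = 0; k < 5; k++) {
        if (i1 >= n || i2 >= n)
            return 0;
        if (block[i1] != block[i2])
            return block[i1] > block[i2];
        i1++;
        i2++;
    }
    /* Only the rare long-match case pays for the call to the large tail. */
    return gt_tail(block, i1, i2, n);
}
```

The inliner then has to weigh two separate decisions: inlining the tail back
into the head, and inlining the head into its callers.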

Later we indeed consider mainGtU.part before the split away part:
Badness calculation for mainGtU/30 -> mainGtU.part.0/41
  size growth 231, time 238.00 unspec 240.00 
  -0.000181: guessed profile. frequency 0.125000, count -1 caller count -1
time w/o inlining 59.125000, time with inlining 56.75 overall growth -12
(current) -12 (original) -12 (compensated)

later we consider the individual parts

 Estimated badness is -0.01, frequency 7718.74.
Badness calculation for mainSimpleSort/32 -> mainGtU/30
  size growth 256, time 54.75 unspec 56.75  big_speedup
  -0.01: guessed profile. frequency 7718.740543, count -1 caller count
-1 time w/o inlining 827687.848633, time with inlining 681031.777832 overall
growth 501 (current) 39 (original) 1521 (compensated)
  Adjusted by hints -0.01

and inline the first one because the speedup is considered to be big, but after
inlining the function becomes heavy and the remaining two are not inlined.

There is a mis-accounting bug in the time needed for execution of mainGtU. I
fixed it yesterday for trunk, which now has a more realistic time estimate for
the sequence of ifs:
IPA function summary for mainGtU.part.0/41 inlinable
  global time: 19.766641
  self size:   243  
  global size: 243  
  min size:   0 
  self stack:  0
  global stack:0
size:200.00, time:9.771543  
size:4.00, time:2.004863,  executed if:(not inlined)
size:10.00, time:1.998047,  nonconst if:(op0 changed)   
size:10.00, time:1.998047,  nonconst if:(op1 changed)   
size:9.00, time:1.996094,  nonconst if:(op0 changed || op2 changed) 
size:9.00, time:1.996094,  nonconst if:(op1 changed || op2 changed) 
size:1.00, time:0.001953,

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #10 from Chris Elrod  ---
(In reply to Thomas Koenig from comment #9)
> Hm.
> 
> It would help if your benchmark was complete, so I could run it.
> 

I don't suppose you happen to have Julia and be familiar with it? If you (or
someone else here) are, I'll attach the code to generate the fake data (the most
important point is that columns 5:10 of BPP are the upper triangle of a 3x3
symmetric positive definite matrix).

I have also already written a manually unrolled version that gfortran likes.

But I could write Fortran code to create an executable and run benchmarks.
What are best practices? system_clock?

(In reply to Thomas Koenig from comment #9)
> 
> However, what happens if you put in
> 
> real, dimension(:) ::  Uix
> real, dimension(:), intent(in)  ::  x
> real, dimension(:), intent(in)  ::  S
> 
> ?
> 
> gfortran should not pack then.

You're right! I wasn't able to follow this exactly, because it didn't want me
to defer shape on Uix. Probably because it needs to compile a version of
fpdbacksolve that can be called from the shared library?

Interestingly, with that change, Flang failed to vectorize the code, but
gfortran did. Compilers are finicky.

Flang, original:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 655.827 ns (0.00% GC)
  median time:  665.698 ns (0.00% GC)
  mean time:    689.967 ns (0.00% GC)
  maximum time: 1.061 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 162

Flang, not specifying shape: # assembly shows it is using xmm

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 8.086 μs (0.00% GC)
  median time:  8.315 μs (0.00% GC)
  mean time:    8.591 μs (0.00% GC)
  maximum time: 20.299 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 3

gfortran, transposed version (not vectorizable): 

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 20.643 μs (0.00% GC)
  median time:  20.901 μs (0.00% GC)
  mean time:    21.441 μs (0.00% GC)
  maximum time: 54.103 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 1

gfortran, not specifying shape:

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 1.290 μs (0.00% GC)
  median time:  1.316 μs (0.00% GC)
  mean time:    1.347 μs (0.00% GC)
  maximum time: 4.562 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 10


Assembly confirms it is using zmm registers (but this time is much too fast not
to be vectorized, anyway).


For why gfortran is still slower than the Flang version, here is the loop body:

.L16:
vmovups (%r10,%rax), %zmm0
vcmpps  $4, %zmm0, %zmm4, %k1
vrsqrt14ps  %zmm0, %zmm1{%k1}{z}
vmulps  %zmm0, %zmm1, %zmm2
vmulps  %zmm1, %zmm2, %zmm0
vmulps  %zmm5, %zmm2, %zmm2
vaddps  %zmm6, %zmm0, %zmm0
vmulps  %zmm2, %zmm0, %zmm0
vrcp14ps  %zmm0, %zmm8
vmulps  %zmm0, %zmm8, %zmm0
vmulps  %zmm0, %zmm8, %zmm0
vaddps  %zmm8, %zmm8, %zmm8
vsubps  %zmm0, %zmm8, %zmm8
vmulps  (%r8,%rax), %zmm8, %zmm9
vmulps  (%r9,%rax), %zmm8, %zmm10
vmulps  (%r12,%rax), %zmm8, %zmm8
vmovaps %zmm9, %zmm3
vfnmadd213ps  0(%r13,%rax), %zmm9, %zmm3
vcmpps  $4, %zmm3, %zmm4, %k1
vrsqrt14ps  %zmm3, %zmm2{%k1}{z}
vmulps  %zmm3, %zmm2, %zmm3
vmulps  %zmm2, %zmm3, %zmm1
vmulps  %zmm5, %zmm3, %zmm3
vaddps  %zmm6, %zmm1, %zmm1
vmulps  %zmm3, %zmm1, %zmm1
vmovaps %zmm9, %zmm3
vfnmadd213ps  (%rdx,%rax), %zmm10, %zmm3
vrcp14ps  %zmm1, %zmm0
vmulps  %zmm1, %zmm0, %zmm1
vmulps  %zmm1, %zmm0, %zmm1
vaddps  %zmm0, %zmm0, %zmm0
vsubps  %zmm1, %zmm0, %zmm11
vmulps  %zmm11, %zmm3, %zmm12
vmovaps %zmm10, %zmm3
vfnmadd213ps  (%r14,%rax), %zmm10, %zmm3
vfnmadd231ps  %zmm12, %zmm12, %zmm3
vcmpps  $4, %zmm3, %zmm4, %k1
vrsqrt14ps  %zmm3, %zmm1{%k1}{z}
vmulps  %zmm3, %zmm1, %zmm3
vmulps  %zmm1, %zmm3, %zmm0
vmulps  %zmm5, %zmm3, %zmm3
vmovups (%rcx,%rax), %zmm1
vaddps  %zmm6, %zmm0, %zmm0
vmulps  %zmm3, %zmm0, %zmm0
vrcp14ps  %zmm0, %zmm2
vmulps  %zmm0, %zmm2, %zmm0
vmulps  %zmm0, %zmm2, %zmm0
vaddps  %zmm2, %zmm2, %zmm2
vsubps  %zmm0, %zmm2, %zmm0
vmulps  %zmm0, %zmm11, %zmm3
vmulps  %zmm12, %zmm3, %zmm3
vxorps  %zmm7, %zmm3, %zmm3
vmulps  %zmm1, %zmm3, %zmm2
vmulps  %zmm3, %zmm9, %zmm3
vfnmadd231ps  %zmm8, %zmm9, %zmm1

[Bug c/80354] Poor support to silence -Wformat-truncation=1

2019-01-06 Thread colomar.6.4.3 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80354

Alejandro Colomar  changed:

   What|Removed |Added

 CC||colomar.6.4.3 at gmail dot com

--- Comment #10 from Alejandro Colomar  ---
In some cases, that warning saved me from writing dangerous code (using file
paths), and it's been very useful.  But casting to void means one and only one
thing:  _I DO know what I'm doing and I do NOT care, so STFU!_
And if the casting to void is dangerous (ie. casting to void the result of
storing a path), that's my fault.

I currently have a program where I want, for some reason, to truncate the
output to an 80-char line:


#include <stdio.h>

#define LINE_SIZE   (80)

int main(int argc, char *argv[])
{
    char truncated[LINE_SIZE];

    (void)snprintf(truncated, LINE_SIZE, "I DO want to truncate this line "
            "to a 80 char line, and discard any remaining text so "
            "that it is not displayed or stored anywhere because I "
            "don't care about it. "
            " Usually one would check for the return value of "
            "snprintf, but in this precise situation, I do not "
            "care about it, so the explicit cast to (void) should "
            "silence the warning, just like it does in other "
            "warnings such as -Wunused-variable.");

    printf("%s\n", truncated);

    return 0;
}


But in the same project I have paths and that kind of uses of snprintf where I
still want to have the warnings enabled (and even more, -Werror so that they
never compile):


if (snprintf(file_name, FILENAME_MAX, "%s/%s", file_path, saved_name))
{
goto err_path;
}


Many other warnings are suppressed with (void), why is this one so special?


Anyways, I'm reporting a new bug, because this one is "RESOLVED INVALID".
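
The two usages contrasted above can be sketched side by side.  This is a
minimal illustration under assumptions: build_path and clip_line are
hypothetical helper names, and comparing the return value against the buffer
size is one common way to detect truncation, not necessarily what the
reporter's project does.  Whether the (void) cast silences
-Wformat-truncation is exactly what this report is about, so no claim is made
about that here.

```c
#include <stdio.h>

/* Hypothetical helper for the path case: truncation is an error.
   Returns 0 on success, -1 if the result did not fit or formatting failed. */
static int build_path(char *dst, size_t size, const char *dir, const char *name)
{
    int n = snprintf(dst, size, "%s/%s", dir, name);

    /* snprintf returns the length the complete string would have had. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}

/* Hypothetical helper for the display case: truncation is intended, so the
   return value is deliberately discarded. */
static void clip_line(char *dst, size_t size, const char *text)
{
    (void)snprintf(dst, size, "%s", text);
}
```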

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #9 from Thomas Koenig  ---
Hm.

It would help if your benchmark was complete, so I could run it.

However, what happens if you put in

real, dimension(:) ::  Uix
real, dimension(:), intent(in)  ::  x
real, dimension(:), intent(in)  ::  S

?

gfortran should not pack then.

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #8 from Chris Elrod  ---
Created attachment 45358
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45358=edit
gfortran compiled assembly for the transposed version of the original code.

Here is the assembly for the loop body of the transposed version of the code,
compiled by gfortran:


.L8:
vmovss  36(%rsi), %xmm0
addq  $40, %rsi
vrsqrtss  %xmm0, %xmm2, %xmm2
addq  $12, %rdi
vmulss  %xmm0, %xmm2, %xmm0
vmulss  %xmm2, %xmm0, %xmm0
vmulss  %xmm7, %xmm2, %xmm2
vaddss  %xmm8, %xmm0, %xmm0
vmulss  %xmm2, %xmm0, %xmm0
vmulss  -8(%rsi), %xmm0, %xmm5
vmulss  -12(%rsi), %xmm0, %xmm4
vmulss  -32(%rsi), %xmm0, %xmm0
vmovaps %xmm5, %xmm3
vfnmadd213ss  -16(%rsi), %xmm5, %xmm3
vmovaps %xmm4, %xmm2
vfnmadd213ss  -20(%rsi), %xmm5, %xmm2
vmovss  %xmm0, -4(%rdi)
vrsqrtss  %xmm3, %xmm1, %xmm1
vmulss  %xmm3, %xmm1, %xmm3
vmulss  %xmm1, %xmm3, %xmm3
vmulss  %xmm7, %xmm1, %xmm1
vaddss  %xmm8, %xmm3, %xmm3
vmulss  %xmm1, %xmm3, %xmm3
vmulss  %xmm3, %xmm2, %xmm6
vmovaps %xmm4, %xmm2
vfnmadd213ss  -24(%rsi), %xmm4, %xmm2
vfnmadd231ss  %xmm6, %xmm6, %xmm2
vrsqrtss  %xmm2, %xmm10, %xmm10
vmulss  %xmm2, %xmm10, %xmm1
vmulss  %xmm10, %xmm1, %xmm1
vmulss  %xmm7, %xmm10, %xmm10
vaddss  %xmm8, %xmm1, %xmm1
vmulss  %xmm10, %xmm1, %xmm1
vmulss  %xmm1, %xmm3, %xmm2
vmulss  %xmm6, %xmm2, %xmm2
vmovss  -36(%rsi), %xmm6
vxorps  %xmm9, %xmm2, %xmm2
vmulss  %xmm6, %xmm2, %xmm10
vmulss  %xmm2, %xmm5, %xmm2
vfmadd231ss -40(%rsi), %xmm1, %xmm10
vfmadd132ss %xmm4, %xmm2, %xmm1
vfnmadd132ss  %xmm0, %xmm10, %xmm1
vmulss  %xmm0, %xmm5, %xmm0
vmovss  %xmm1, -12(%rdi)
vsubss  %xmm0, %xmm6, %xmm0
vmulss  %xmm3, %xmm0, %xmm3
vmovss  %xmm3, -8(%rdi)
cmpq  %rsi, %rax
jne .L8


While Flang had a second loop of scalar code (to catch the N mod [SIMD vector
width] remainder of the vectorized loop), there are no secondary loops in the
gfortran code, meaning these must all be scalar operations (I have a hard time
telling apart SSE from scalar code...).

It looks similar in the operations it performs to Flang's vectorized loop,
except that it is only performing operations on a single number at a time.
That is because, to get efficient vectorization, we need corresponding elements
to be contiguous (i.e., all the input1s, all the input2s).
We do not get any benefit from having all the different elements with the same
index (the first input1 next to the first input2, next to the first input3...)
being contiguous.


The memory layout I used is performance-optimal, but is something that gfortran
unfortunately often cannot handle automatically (without manual unrolling).
This is why I filed a report on bugzilla.

[Bug c/88726] New: GCC thinks that translation unit does not contain a definition of inline function.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88726

Bug ID: 88726
   Summary: GCC thinks that translation unit does not contain a
definition of inline function.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anders.granlund.0 at gmail dot com
  Target Milestone: ---

Test case (test.c):

  int main()
  {
    extern inline void f();
  }

  void f()
  {
  }

Compilation command line:

  gcc test.c -Wall -Wextra -std=c11 -pedantic-errors

Observed behaviour:

  The following error message was output:

error: inline function 'f' declared but never defined

Expected behaviour:

  No error message output.

Note:

  Clang accepts the program without outputting any error message.
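
For comparison, here is a file-scope variant of the same declaration and
definition pair.  It is only a sketch for experimenting with the report: the
function name answer and its return value are invented, and whether GCC treats
the block-scope and file-scope forms differently is something to check rather
than a claim made here.

```c
/* File-scope inline declaration followed by a definition in the same
   translation unit, which is what C11 6.7.4p7 requires for an inline
   function with external linkage. */
extern inline int answer(void);

int answer(void)
{
    return 42;
}
```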

[Bug c++/88725] New: fails to deduce template specialization in friend declaration

2019-01-06 Thread haining.cpp at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88725

Bug ID: 88725
   Summary: fails to deduce template specialization in friend
declaration
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: haining.cpp at gmail dot com
  Target Milestone: ---

gcc fails to deduce that the friend declaration refers to ::func in the code
below:

```
template <typename T>
void func(T);

class Cls {
    friend void ::func(int);
};
```

Rejecting with 

```
error: 'void func(int)' should have been declared inside '::'
 friend void ::func(int);
   ^
```

I believe this should be allowed by [temp.friend]/1.3:
http://eel.is/c++draft/temp.friend#1.3

see discussion here: https://stackoverflow.com/q/54055575/1013719

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #7 from Chris Elrod  ---
Created attachment 45357
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45357=edit
Assembly generated by Flang compiler on the original version of the code.

This is the main loop body in the Flang compiled version of the original code
(starts line 132):

.LBB1_8:  # %vector.body
# =>This Inner Loop Header: Depth=1
leaq  (%rsi,%rbx,4), %r12
vmovups (%rcx,%r12), %zmm2
addq  %rcx, %r12
leaq  (%r12,%rcx), %rbp
vmovups (%r11,%rbp), %zmm3
addq  %r11, %rbp
leaq  (%rcx,%rbp), %r13
leaq  (%rcx,%r13), %r8
leaq  (%r8,%rcx), %r10
leaq  (%r10,%rcx), %r14
vmovups (%rcx,%r14), %zmm4
vrsqrt14ps  %zmm4, %zmm5
vmulps  %zmm5, %zmm4, %zmm4
vfmadd213ps %zmm0, %zmm5, %zmm4 # zmm4 = (zmm5 * zmm4) + zmm0
vmulps  %zmm1, %zmm5, %zmm5
vmulps  %zmm4, %zmm5, %zmm4
.Ltmp1:
.loc 1 31 1 is_stmt 1  # vectorization_test.f90:31:1
vmulps  (%rcx,%r8), %zmm4, %zmm5
.loc 1 32 1  # vectorization_test.f90:32:1
vmulps  (%rcx,%r10), %zmm4, %zmm6
vmovups (%rcx,%r13), %zmm7
.loc 1 33 1  # vectorization_test.f90:33:1
vfnmadd231ps  %zmm6, %zmm6, %zmm7 # zmm7 = -(zmm6 * zmm6) + zmm7
vrsqrt14ps  %zmm7, %zmm8
vmulps  %zmm8, %zmm7, %zmm7
vfmadd213ps %zmm0, %zmm8, %zmm7 # zmm7 = (zmm8 * zmm7) + zmm0
vmulps  %zmm1, %zmm8, %zmm8
vmulps  %zmm7, %zmm8, %zmm7
vmovups (%rcx,%rbp), %zmm8
.loc 1 35 1  # vectorization_test.f90:35:1
vfnmadd231ps  %zmm5, %zmm6, %zmm8 # zmm8 = -(zmm6 * zmm5) + zmm8
vmulps  %zmm8, %zmm7, %zmm8
vmulps  %zmm5, %zmm5, %zmm9
vfmadd231ps %zmm8, %zmm8, %zmm9 # zmm9 = (zmm8 * zmm8) + zmm9
vsubps  %zmm9, %zmm3, %zmm3
vrsqrt14ps  %zmm3, %zmm9
vmulps  %zmm9, %zmm3, %zmm3
vfmadd213ps %zmm0, %zmm9, %zmm3 # zmm3 = (zmm9 * zmm3) + zmm0
vmulps  %zmm1, %zmm9, %zmm9
vmulps  %zmm3, %zmm9, %zmm3
.loc 1 39 1  # vectorization_test.f90:39:1
vmulps  %zmm8, %zmm7, %zmm8
.loc 1 40 1  # vectorization_test.f90:40:1
vmulps  (%rcx,%r12), %zmm4, %zmm4
.loc 1 39 1  # vectorization_test.f90:39:1
vmulps  %zmm3, %zmm8, %zmm8
.loc 1 41 1  # vectorization_test.f90:41:1
vmulps  %zmm8, %zmm2, %zmm9
vfmsub231ps (%rsi,%rbx,4), %zmm3, %zmm9 # zmm9 = (zmm3 * mem) - zmm9
vmulps  %zmm5, %zmm3, %zmm3
vfmsub231ps %zmm8, %zmm6, %zmm3 # zmm3 = (zmm6 * zmm8) - zmm3
vfmadd213ps %zmm9, %zmm4, %zmm3 # zmm3 = (zmm4 * zmm3) + zmm9
.loc 1 42 1  # vectorization_test.f90:42:1
vmulps  %zmm4, %zmm6, %zmm5
vmulps  %zmm5, %zmm7, %zmm5
vfmsub231ps %zmm7, %zmm2, %zmm5 # zmm5 = (zmm2 * zmm7) - zmm5
.Ltmp2:
.loc 1 15 1  # vectorization_test.f90:15:1
vmovups %zmm3, (%rdi,%rbx,4)
movq  -16(%rsp), %rbp # 8-byte Reload
vmovups %zmm5, (%rbp,%rbx,4)
vmovups %zmm4, (%rax,%rbx,4)
addq  $16, %rbx
cmpq  %rbx, %rdx
jne .LBB1_8



zmm registers are 64-byte registers. The loop loads from memory into registers
with vmovups, performs a series of arithmetic operations and inverse square
roots on them, and then stores three of these 64-byte registers back into
memory with vmovups.

That is the most efficient memory access pattern (as demonstrated empirically
via benchmarks).

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread elrodc at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #6 from Chris Elrod  ---
Created attachment 45356
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45356=edit
Code to demonstrate that transposing makes things slower.

Thomas Koenig, I am well aware that Fortran is column major. That is precisely
why I chose the memory layout I did.

Benchmark of the "optimal" corrected code:

@benchmark gforttest($X32t, $BPP32t, $N)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 20.647 μs (0.00% GC)
  median time:  20.860 μs (0.00% GC)
  mean time:    21.751 μs (0.00% GC)
  maximum time: 47.760 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 1


Here is a benchmark (compiling with Flang) of my code, exactly as written
(suboptimal) in the attachments:

@benchmark flangtest($X32,  $BPP32,  $N)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --
  minimum time: 658.329 ns (0.00% GC)
  median time:  668.012 ns (0.00% GC)
  mean time:    692.384 ns (0.00% GC)
  maximum time: 1.192 μs (0.00% GC)
  --
  samples:  1
  evals/sample: 161


That is 20 microseconds, vs 670 nanoseconds.

N was 1024, and the exact same data was used in both cases (but pretransposed,
so I do not benchmark transposing).
Benchmarking was done by compiling shared libraries, and using `ccall` and
BenchmarkTools from Julia. As indicated by the reports, the benchmark was run
10,000 times for gfortran, and 1.61 million times for Flang, to get accurate
timings.

I compiled with (march=native is equivalent to march=skylake-avx512):
gfortran -Ofast -march=native -mprefer-vector-width=512
-fno-semantic-interposition -shared -fPIC vectorization_test_transposed.f90 -o
libgfortvectorization_test.so
flang -Ofast -march=native -mprefer-vector-width=512 -shared -fPIC
vectorization_test.f90 -o libflangvectorization_test.so

Flang was built with LLVM 7.0.1.


The "suboptimal" code was close to 32 times faster than the "optimal" code.
I was expecting it to be closer to 16 times faster, given the vector width.
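
The arithmetic behind those two figures, using the median times reported
above, can be written out as a small sketch.  The helper names speedup and
lanes are made up for illustration, and the expectation of 16 assumes 32-bit
floats in 512-bit registers.

```c
/* Ratio of two timings given in the same unit (here nanoseconds). */
static double speedup(double slow_ns, double fast_ns)
{
    return slow_ns / fast_ns;
}

/* Number of 32-bit floats that fit in one SIMD register of the given width. */
static int lanes(int register_bits)
{
    return register_bits / 32;
}
```

With the medians above, speedup(20860.0, 668.012) is a little over 31, while
lanes(512) is 16.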


To go into more detail:

"
Fortran lays out the memory for that array as

BPP(1,1), BPP(2,1), BPP(3,1), BPP(4,1), ..., BPP(1,2)

so you are accessing your memory with a stride of n in the
expressions BPP(i,1:3) and BPP(i,5:10). This is very inefficient
anyway, vectorization would not really help in this case.
"
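
The column-major layout in the quoted text can be made concrete with a small
index calculation.  This is a sketch only; colmajor_offset is a hypothetical
helper, and it assumes the default 1-based Fortran indexing with n rows.

```c
#include <stddef.h>

/* Offset, in elements, of Fortran element A(i, j) (1-based indices) from the
   start of a column-major array that has n rows. */
static size_t colmajor_offset(size_t i, size_t j, size_t n)
{
    return (i - 1) + (j - 1) * n;
}
```

Neighbouring rows of one column are one element apart, while neighbouring
columns of one row are n elements apart, which is the stride of n that the
quoted text refers to.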

Yes, each call to fpdbacksolve is accessing memory across strides.
But fpdbacksolve itself cannot be vectorized well at all.

What does work, however, is vectorizing across loop iterations.

For example, imagine calling fpdbacksolve on this:

BPP(1:16,1), BPP(1:16,2), BPP(1:16,3), BPP(1:16,5), ..., BPP(1:16,10)

and then performing every single scalar operation defined in fpdbacksolve on an
entire SIMD vector of floats (that is, on 16 floats) at a time.

That would of course require inlining fpdbacksolve (which was achieved with
-fno-semantic-interposition, as the assembly shows), and recompiling it.

Perhaps another way you can imagine it is that fpdbacksolve takes in 9 numbers
(BPP(:,4) was unused), and returns 3 numbers.
Because operations within it aren't vectorizable, we want to vectorize it
ACROSS loop iterations, not within them.
So to facilitate that, we have 9 vectors of contiguous inputs, and 3 vectors of
contiguous outputs. Now, all inputs1 are stored contiguously, as are all
inputs2, etc..., allowing the inputs to efficiently be loaded into SIMD
registers, and each loop iteration to calculate [SIMD vector width] of the
outputs at a time.

Of course, it is inconvenient to handle a dozen vectors. So if they all have
the same length, we can just concatenate them together.
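
A much-reduced sketch of that layout, with three concatenated input columns
instead of nine and a made-up one-line kernel standing in for fpdbacksolve
(kernel_columns and its arithmetic are assumptions for illustration only):

```c
#include <stddef.h>

/* in holds three columns of length n stored back to back:
   in[0..n-1], in[n..2n-1], in[2n..3n-1].  Each iteration does only scalar
   work, but iteration i touches element i of every column, so consecutive
   iterations read consecutive memory. */
static void kernel_columns(const float *in, float *out, size_t n)
{
    const float *a = in;         /* first column  */
    const float *b = in + n;     /* second column */
    const float *c = in + 2 * n; /* third column  */

    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}
```

A loop of this shape is the kind a compiler can vectorize across iterations by
loading [SIMD vector width] neighbouring elements of each column at once;
whether a particular compiler actually does so is what the benchmarks and
assembly in this report are checking.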


I'll attach the assembly of both code examples as well.
The assembly makes it clear that the "suboptimal" way was vectorized, and the
"optimal" way was not.

The benchmarks make it resoundingly clear that the vectorized ("suboptimal")
version was dramatically faster.

As is, this is a missed optimization, and gfortran is severely falling behind
in performance versus LLVM-based Flang in the highest performance version of
the code.

[Bug debug/88723] [9 regression] PR debug/88635 patch breaks testsuite_shared.cc compilation

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88723

--- Comment #2 from Rainer Orth  ---
Created attachment 45355
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45355=edit
preprocessed input

Sure.  The cc1plus invocation is

cc1plus -fpreprocessed testsuite_shared.ii -quiet -dumpbase testsuite_shared.cc
-mcpu=v9 -auxbase testsuite_shared -g -O2 -w -std=gnu++98 -version
-fdiagnostics-color=never -fchecking=1 -fmessage-length=0 -fno-show-column
-ffunction-sections -fdata-sections -fno-inline -fPIC
-fno-diagnostics-show-caret -o testsuite_shared.s

[Bug debug/88723] [9 regression] PR debug/88635 patch breaks testsuite_shared.cc compilation

2019-01-06 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88723

--- Comment #1 from Jakub Jelinek  ---
I don't have access to Solaris, can you attach preprocessed testsuite_shared.cc
+ the g++ options used to compile it?
The note generally shouldn't break stuff, it isn't an error, just a debugging
hint (goes away for --enable-checking=release builds) that maybe the sparc
delegitimize hook needs more work.

[Bug d/88724] New: FAIL: gdc.dg/compilable.d -O0 (test for excess errors)

2019-01-06 Thread danglin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88724

Bug ID: 88724
   Summary: FAIL: gdc.dg/compilable.d   -O0  (test for excess
errors)
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa2.0w-hp-hpux11.11
Target: hppa2.0w-hp-hpux11.11
 Build: hppa2.0w-hp-hpux11.11

spawn /test/gnu/gcc/objdir/gcc/testsuite/gdc/../../gdc
-B/test/gnu/gcc/objdir/gcc/testsuite/gdc/../../
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/compilable.d -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libphobos/libdruntime
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libphobos/src
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/./libstdc++-v3/include
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/./libstdc++-v3/include/hppa2.0w-hp-hpux11.11
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libstdc++-v3/libsupc++ -O0 -I
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg -I
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/imports -Wno-psabi
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/imports/gdc27.d
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/imports/gdc231.d -S -o compilable.s
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/stdlib.d:201:9: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/stdlib.d:203:9: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/stdlib.d:205:9: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/stdlib.d:207:9: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:151:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:151:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:153:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:153:9: error:
undefined identifier 'tm'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:155:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:155:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:157:9: error:
undefined identifier 'tm'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:159:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:161:9: error:
undefined identifier 'tm'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:161:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:163:9: error:
undefined identifier 'tm'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:163:9: error:
undefined identifier 'time_t', did you mean function 'time'?
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/time.d:165:17: error:
undefined identifier 'tm'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:89:14: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:92:5: error:
undefined identifier 'FILE'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:92:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:94:5: error:
undefined identifier 'FILE'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:94:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:95:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:95:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:97:5: error:
undefined identifier 'FILE'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:97:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:99:5: error:
undefined identifier 'FILE'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:99:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:100:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:100:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:102:5: error:
undefined identifier 'wchar_t'
/test/gnu/gcc/gcc-9/libphobos/libdruntime/core/stdc/wchar_.d:104:5: error:
undefined identifier 'wchar_t'

[Bug debug/88723] [9 regression] PR debug/88635 patch breaks testsuite_shared.cc compilation

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88723

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |9.0

[Bug debug/88723] New: [9 regression] PR debug/88635 patch breaks testsuite_shared.cc compilation

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88723

Bug ID: 88723
   Summary: [9 regression] PR debug/88635 patch breaks
testsuite_shared.cc compilation
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: ebotcazou at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: sparc-sun-solaris2.11
Target: sparc-sun-solaris2.11
 Build: sparc-sun-solaris2.11

Between 20190104 (r267571) and 20190105 (r267602), libstdc++ testing got broken
on Solaris/SPARC:

+ERROR: could not compile testsuite_shared.cc
+ERROR: tcl error sourcing
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/libstdc++-abi/abi.exp.

The log shows

In file included from
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/sparc-sun-solaris2.11/bits/gthr.h:148,
 from
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/ext/atomicity.h:35,
 from
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/bits/basic_string.h:39,
 from
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/string:55,
 from
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/util/testsuite_shared.cc:18:
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/sparc-sun-solaris2.11/bits/gthr-default.h:
In function 'int __gthread_active_p()':
/var/gcc/regression/trunk/11.5-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/sparc-sun-solaris2.11/bits/gthr-default.h:181:
note: non-delegitimized UNSPEC UNSPEC_MOVE_GOTDATA (14) found in variable
location

and many more, and all compiler output lets target_compile think there's an
error, although the compilation succeeds otherwise.

It turns out that reverting just the dwarf2out.c part of

PR debug/88635
* dwarf2out.c (const_ok_for_output_1): Reject MINUS that contains
SYMBOL_REF, CODE_LABEL or UNSPEC in subexpressions of second argument.
Reject PLUS that contains SYMBOL_REF, CODE_LABEL or UNSPEC in
subexpressions of both operands.
(mem_loc_descriptor): Handle UNSPEC if target hook acks it and all the
subrtxes are CONSTANT_P.

allows the compilation to succeed without error or messages.

[Bug d/88722] New: :1: internal compiler error: in register_moduleinfo, at d/modules.cc:40 2

2019-01-06 Thread danglin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88722

Bug ID: 88722
   Summary: :1: internal compiler error: in
register_moduleinfo, at d/modules.cc:40 2
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa2.0w-hp-hpux11.11
Target: hppa2.0w-hp-hpux11.11
 Build: hppa2.0w-hp-hpux11.11

spawn /test/gnu/gcc/objdir/gcc/testsuite/gdc/../../gdc
-B/test/gnu/gcc/objdir/gcc/testsuite/gdc/../../
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/gdc254.d -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libphobos/libdruntime
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libphobos/src
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/./libstdc++-v3/include
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/./libstdc++-v3/include/hppa2.0w-hp-hpux11.11
-I/test/gnu/gcc/gcc/gcc/testsuite/../../libstdc++-v3/libsupc++ -O0 -I
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg -S -o gdc254.s
/test/gnu/gcc/gcc/gcc/testsuite/gdc.dg/gdc254.d:13:1: error: class gdc254.C254
interface function 'void F()' is not implemented
:1: internal compiler error: in register_moduleinfo, at d/modules.cc:402
libbacktrace could not find executable to open
Please submit a full bug report,
with preprocessed source if appropriate.

[Bug tree-optimization/86020] [8/9 Regression] Performance regression in Eigen geometry.cpp test starting with r248334

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86020

--- Comment #6 from Jan Hubicka  ---
I spent a good part of today trying to recollect what the motivation for the
change was.  All I can think of is that it was a mistaken micro-optimization, as
mentioned in the mail I sent about reverting the patch.

I plan to backport the change if performance turns out to be OK.

[Bug tree-optimization/86020] [8/9 Regression] Performance regression in Eigen geometry.cpp test starting with r248334

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86020

--- Comment #5 from Jan Hubicka  ---
Author: hubicka
Date: Sun Jan  6 17:16:00 2019
New Revision: 267612

URL: https://gcc.gnu.org/viewcvs?rev=267612&root=gcc&view=rev
Log:

PR tree-opt/86020
Revert:
2017-05-22  Jan Hubicka  

* ipa-inline.c (edge_badness): Use inlined_time instead of
inline_summaries->get.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-inline.c

[Bug bootstrap/88721] [9 regression] -Wmaybe-uninitialized warnings in sparc.c

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88721

--- Comment #1 from Rainer Orth  ---
Created attachment 45354
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45354&action=edit
Possible patch

This patch allowed the bootstrap to finish.

[Bug bootstrap/88721] [9 regression] -Wmaybe-uninitialized warnings in sparc.c

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88721

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |9.0

[Bug bootstrap/88721] New: [9 regression] -Wmaybe-uninitialized warnings in sparc.c

2019-01-06 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88721

Bug ID: 88721
   Summary: [9 regression] -Wmaybe-uninitialized warnings in
sparc.c
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: ebotcazou at gcc dot gnu.org
  Target Milestone: ---
  Host: sparc-sun-solaris2.11
Target: sparc-sun-solaris2.11
 Build: sparc-sun-solaris2.11

Between 20190104 (r267571) and 20190105 (r267602), Solaris/SPARC bootstrap
began to fail:

/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc.c: In function 'rtx_def*
sparc_function_incoming_arg(cumulative_args_t, machine_mode, const_tree,
bool)':
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc.c:7417:39: error: 'regno'
may be used uninitialized in this function [-Werror=maybe-uninitialized]
 7417 |   return function_arg_union_value (size, mode, slotno, regno);
  |  ~^~~
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc.c:7386:15: note: 'regno' was
declared here
 7386 |   int slotno, regno, padding;
  |   ^

/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc.c: In function 'void
sparc_function_arg_advance(cumulative_args_t, machine_mode, const_tree, bool)':
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc.c:7603:14: error: 'padding'
may be used uninitialized in this function [-Werror=maybe-uninitialized]
 7603 |   cum->words += padding;
  |   ~~~^~
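
For readers unfamiliar with this class of warning, here is a minimal sketch of the pattern that typically triggers -Wmaybe-uninitialized. It is not the sparc.c code: `find_slot` and `incoming_arg` are hypothetical names, and initializing the variable at its declaration is only one possible way to silence the warning, not necessarily what the attached patch does.

```c
/* Hypothetical helper: stores to *regno only on the success path. */
static int find_slot(int size, int *regno)
{
    if (size <= 0)
        return -1;          /* failure: *regno is left untouched */
    *regno = size * 2;
    return size - 1;
}

/* If the compiler cannot prove that the failure path never reaches the
   use of regno, it may report "may be used uninitialized".  Giving
   regno an initial value removes the doubt. */
int incoming_arg(int size)
{
    int regno = 0;          /* explicit initialization (assumed fix) */
    int slotno = find_slot(size, &regno);

    if (slotno == -1)
        return -1;
    return slotno + regno;
}
```

With these definitions, `incoming_arg(3)` takes the success path and `incoming_arg(0)` takes the failure path without ever reading an indeterminate value.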

[Bug c++/88482] ICE when wrongly declaring __cxa_allocate_exception

2019-01-06 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88482

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jakub Jelinek  ---
Fixed on the trunk, no plans to backport.

[Bug lto/51765] [9 Regression] Testsuite ICEs with -flto

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51765

Jan Hubicka  changed:

   What|Removed |Added

Summary|Testsuite ICEs with -flto   |[9 Regression] Testsuite
   ||ICEs with -flto

--- Comment #8 from Jan Hubicka  ---
... which I forgot to attach :)
FAIL: gcc.dg/nested-func-12.c (internal compiler error)
FAIL: gcc.dg/nested-func-9.c (internal compiler error)
UNRESOLVED: gcc.dg/tree-prof/pr77698.c scan-rtl-dump-times alignments "internal
loop alignment added" 1
FAIL: g++.dg/ext/vector33.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/ext/vector33.C  -std=c++17 (internal compiler error)

/aux/hubicka/trunk4/gcc/testsuite/gcc.dg/nested-func-12.c: In function 'main':
/aux/hubicka/trunk4/gcc/testsuite/gcc.dg/nested-func-12.c:45:1: error: invalid
conversion in gimple call
struct S

struct S

# .MEM_37 = VDEF <.MEM_36>
MEM[(struct S *)_12] = fn (); [static-chain: ] [return slot
optimization]
/aux/hubicka/trunk4/gcc/testsuite/gcc.dg/nested-func-12.c:45:1: error: invalid
conversion in gimple call
struct S

struct S

# .MEM_45 = VDEF <.MEM_44>
MEM[(struct S *)_26] = fn (); [static-chain: ] [return slot
optimization]
during GIMPLE pass: fixup_cfg
/aux/hubicka/trunk4/gcc/testsuite/gcc.dg/nested-func-12.c:45:1: internal
compiler error: verify_gimple failed
0xc9c401 verify_gimple_in_cfg(function*, bool)
../../gcc/tree-cfg.c:5422
0xb7f85f execute_function_todo
../../gcc/passes.c:1977
0xb8045e execute_todo
../../gcc/passes.c:2031


/aux/hubicka/trunk4/build6/gcc/testsuite/g++11/../../xg++
-B/aux/hubicka/trunk4/build6/gcc/testsuite/g++11/../../
/aux/hubicka/trunk4/gcc/testsuite/g++.dg/ext/vector33.C -flto
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -nostdinc++
-I/aux/hubicka/trunk4/build6/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/aux/hubicka/trunk4/build6/x86_64-pc-linux-gnu/libstdc++-v3/include
-I/aux/hubicka/trunk4/libstdc++-v3/libsupc++
-I/aux/hubicka/trunk4/libstdc++-v3/include/backward
-I/aux/hubicka/trunk4/libstdc++-v3/testsuite/util -fmessage-length=0 -std=c++14
-pedantic-errors -Wno-long-long -S -o vector33.s
during IPA pass: fnsummary
/aux/hubicka/trunk4/gcc/testsuite/g++.dg/ext/vector33.C:10:1: internal compiler
error: tree code 'template_parm_index' is not supported in LTO streams
0xe32fd3 lto_write_tree
../../gcc/lto-streamer-out.c:448
0xe32fd3 lto_output_tree_1
../../gcc/lto-streamer-out.c:489
0xe32fd3 DFS::DFS(output_block*, tree_node*, bool, bool, bool)
../../gcc/lto-streamer-out.c:676
0xe33fcf lto_output_tree(output_block*, tree_node*, bool, bool)
../../gcc/lto-streamer-out.c:1628
0xe2c5bc write_global_stream
../../gcc/lto-streamer-out.c:2511
0xe35f7e lto_output_decl_state_streams(output_block*, lto_out_decl_state*)
../../gcc/lto-streamer-out.c:2558
0xe35f7e produce_asm_for_decls()
../../gcc/lto-streamer-out.c:2888
0xe9d0bf write_lto
../../gcc/passes.c:2596
0xea05ee ipa_write_summaries_1
../../gcc/passes.c:2657
0xea05ee ipa_write_summaries()
../../gcc/passes.c:2720
0xb8e132 ipa_passes
../../gcc/cgraphunit.c:2530
0xb8e132 symbol_table::compile()
../../gcc/cgraphunit.c:2618
0xb9016c symbol_table::compile()
../../gcc/cgraphunit.c:2597
0xb9016c symbol_table::finalize_compilation_unit()
../../gcc/cgraphunit.c:2863

Both are free_lang_data related. The gimplifier ICE is caused by type
simplification, so I am marking this as a regression.

[Bug fortran/88047] [9 Regression] ICE in gfc_find_vtab, at fortran/class.c:2843

2019-01-06 Thread dominiq at lps dot ens.fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88047

--- Comment #6 from Dominique d'Humieres  ---
> A related test case, also changed between 20180909 and 20180916 :

Confirmed at r264350, not fixed by the patch in comment 2, but by the patch in
comment 3.

Janus, could you please figure out why class_array_3.f03 is failing with your
patch?

[Bug lto/51765] Testsuite ICEs with -flto

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51765

--- Comment #7 from Jan Hubicka  ---
I get only 2 now.

[Bug c/88720] New: Strange error message about nested function declared but not defined when using inline.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88720

Bug ID: 88720
   Summary: Strange error message about nested function declared
but not defined when using inline.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anders.granlund.0 at gmail dot com
  Target Milestone: ---

Test case (prog.c):

  static void f();

  int main()
  {
    inline void f();
  }

Compilation command line:

  gcc prog.c -Wall -Wextra -std=c11 -pedantic-errors 

Observed behaviour:

  The following error message was output:

    error: nested function 'f' declared but never defined

Expected behaviour:

  No error message is output.

Note:

  Clang accepts the program without any error message.

[Bug lto/88185] LTO merges -fPIC/fpie and -fPIE/-fpie options to nothing - fails to warn when both are specified

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88185

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jan Hubicka  ---
Fixed.

[Bug lto/86517] relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object with LTO

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86517

--- Comment #10 from Jan Hubicka  ---
Author: hubicka
Date: Sun Jan  6 15:51:45 2019
New Revision: 267610

URL: https://gcc.gnu.org/viewcvs?rev=267610&root=gcc&view=rev
Log:

PR lto/86517
PR lto/88185
* lto-opts.c (lto_write_options): Always stream PIC/PIE mode.
* lto-wrapper.c (merge_and_complain): Fix merging of PIC/PIE.

Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/lto-opts.c
branches/gcc-8-branch/gcc/lto-wrapper.c

[Bug lto/88185] LTO merges -fPIC/fpie and -fPIE/-fpie options to nothing - fails to warn when both are specified

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88185

--- Comment #4 from Jan Hubicka  ---
Author: hubicka
Date: Sun Jan  6 15:51:45 2019
New Revision: 267610

URL: https://gcc.gnu.org/viewcvs?rev=267610&root=gcc&view=rev
Log:

PR lto/86517
PR lto/88185
* lto-opts.c (lto_write_options): Always stream PIC/PIE mode.
* lto-wrapper.c (merge_and_complain): Fix merging of PIC/PIE.

Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/lto-opts.c
branches/gcc-8-branch/gcc/lto-wrapper.c

[Bug tree-optimization/88719] New: [9 Regression] wrong code at -O2, -O3, and -Os on x86_64-linux-gnu

2019-01-06 Thread chenjunjie9208 at 163 dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88719

Bug ID: 88719
   Summary: [9 Regression] wrong code at -O2, -O3, and -Os on
x86_64-linux-gnu
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: chenjunjie9208 at 163 dot com
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=./gcc
COLLECT_LTO_WRAPPER=/home/wgc/installs/gcc_trunks/trunk_r267367/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure LDFLAGS=-Wl,--no-as-needed
--prefix=/home/wgc/installs/gcc_trunks/trunk_r267367
--with-gmp=/home/wgc/installs/gmp-6.1.2
--with-mpfr=/home/wgc/installs/mpfr-4.0.1
--with-mpc=/home/wgc/installs/mpc-1.1.0 --with-isl= --with-cloog=
--enable-languages=c,c++ --disable-multilib --disable-bootstrap
Thread model: posix
gcc version 9.0.0 20181223 (experimental) (GCC)

$ gcc -O0 program.c
$ ./a.out
0 
$ gcc -O1 program.c
$ ./a.out
0 
$ gcc -O2 program.c
$ ./a.out
-60 
$ gcc -O3 program.c
$ ./a.out
-60
$ gcc -Os program.c
$ ./a.out
-60

$ cat program.c
struct a {
  unsigned char b;
};
union {
  short int c;
  long int b;
} d = {2};
short int *e = 
long int *f[][1] = {, , , };
long int **g = [0][3];
unsigned int h;
int main() {
  struct a i = {};
  for (; i.b != 60; ++i.b)
    if (**g = 0)
      ;
    else if (*e)
      --h;
  printf("%d\n", h);
  return 0;
}


The used operating system and CPU are as below:
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz

[Bug c++/88694] constexpr isn't captured correctly in lambda

2019-01-06 Thread ensadc at mailnesia dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88694

ensadc at mailnesia dot com changed:

   What|Removed |Added

 CC||ensadc at mailnesia dot com

--- Comment #4 from ensadc at mailnesia dot com ---
Further reduced:

template  struct A {
  static constexpr T e = U;
  constexpr operator int () { return e; }
};
struct D { template  void print (); };

int
main ()
{
  D d;
  [&](auto i) { auto x = [&] { d.print(); }; }(A{});
}

[Bug c/88718] New: Strange inconsistency between old style and new style declarations of iinline functions.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88718

Bug ID: 88718
   Summary: Strange inconsistency between old style and new style
declarations of iinline functions.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anders.granlund.0 at gmail dot com
  Target Milestone: ---

GCC is behaving inconsistently for the following two test cases:

prog1.c:

  static int x;

  inline void g(int a[sizeof(x)])
  {
  }

  int main()
  {
  }

prog2.c:

  static int x;

  inline void g(a)
    int a[sizeof(x)];
  {
  }

  int main()
  {
  }

Compiling the first test case with the following compilation command line

  gcc prog1.c -Wall -Wextra -std=c11 -pedantic-errors 

gives no error message.

Compiling the second test case with the following compilation command line

  gcc prog2.c -Wall -Wextra -std=c11 -pedantic-errors 

gives the following error message:

  error: 'x' is static but used in inline function 'g' which is not static

I think there should be an error message in both cases because of 6.7.4/3. At
least the two test cases should behave consistently.
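
For comparison, here is a minimal sketch of a variant that should be accepted either way. The constraint in 6.7.4/3 concerns inline definitions of functions with external linkage, so giving `g` internal linkage with `static` takes it out of scope. The body of `g`, the initial value of `x`, and the `call_g` wrapper are additions for illustration only and are not part of the reported test cases.

```c
static int x = 3;

/* 'static inline' gives g internal linkage, so the restriction on
   referring to identifiers with internal linkage does not apply. */
static inline int g(int a[sizeof(x)])
{
    return a[0] + x;
}

/* Hypothetical caller so that g is actually used. */
int call_g(void)
{
    int buf[sizeof(x)] = { 7 };   /* sizeof(x) is a constant expression */
    return g(buf);
}
```

Here `call_g()` returns the first array element plus `x`.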

[Bug target/88717] New: Unnecessary vzeroupper

2019-01-06 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88717

Bug ID: 88717
   Summary: Unnecessary vzeroupper
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: ubizjak at gmail dot com, wei3.xiao at intel dot com,
xuepeng.guo at intel dot com
  Target Milestone: ---
Target: i386,x86_64

[hjl@gnu-cfl-1 tmp]$ cat x.i
typedef float __v16sf __attribute__ ((__vector_size__ (64)));
typedef float __m512 __attribute__ ((__vector_size__ (64), __may_alias__));

void
foo (float *p, __m512 x)
{
  *p = ((__v16sf)x)[0];
}
[hjl@gnu-cfl-1 tmp]$ gcc -mavx512f -S x.i -O2
[hjl@gnu-cfl-1 tmp]$ cat x.s
.file   "x.i"
.text
.p2align 4,,15
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
vmovss  %xmm0, (%rdi)
vzeroupper
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 8.2.1 20181215 (Red Hat 8.2.1-6)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 tmp]$ 

Since __m512 is passed to foo, vzeroupper isn't needed.

[Bug tree-optimization/88606] [9 Regression] ICE: verify_type failed (error: type variant differs by TYPE_TRANSPARENT_AGGR)

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88606

Jan Hubicka  changed:

   What|Removed |Added

Summary|ICE: verify_type failed |[9 Regression] ICE:
   |(error: type variant|verify_type failed (error:
   |differs by  |type variant differs by
   |TYPE_TRANSPARENT_AGGR)  |TYPE_TRANSPARENT_AGGR)

--- Comment #3 from Jan Hubicka  ---
Regression then, will take a look.

[Bug tree-optimization/88606] ICE: verify_type failed (error: type variant differs by TYPE_TRANSPARENT_AGGR)

2019-01-06 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88606

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #2 from Segher Boessenkool  ---
Confirmed.

[Bug fortran/36854] [meta-bug] fortran front-end optimization

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36854
Bug 36854 depends on bug 88713, which changed state.

Bug 88713 Summary: _gfortran_internal_pack@PLT prevents vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #5 from Thomas Koenig  ---
An additional point. The called routine has

pure function fpdbacksolve(x, S) result(Uix)

real, dimension(3) ::  Uix

... and so on. This _requires_ Uix to be contiguous in memory,
so we need to call the internal pack routine.

This is something we cannot change easily. So, the workaround:

Use the appropriate memory layout in your Fortran programs.
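
To illustrate what "packing" means here, the following is a conceptual sketch in C, not the libgfortran implementation: when the actual argument is a strided array section, its elements are copied into a contiguous temporary before the call and copied back afterwards. The names `pack_strided` and `unpack_strided` and the element-stride convention are assumptions made for this example.

```c
#include <stddef.h>

/* Gather n elements that lie `stride` elements apart in src into the
   contiguous buffer dst. */
void pack_strided(const float *src, size_t n, size_t stride, float *dst)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i * stride];
}

/* Scatter the contiguous buffer src back to the strided locations in
   dst after the callee has finished with the temporary. */
void unpack_strided(const float *src, size_t n, size_t stride, float *dst)
{
    for (size_t i = 0; i < n; ++i)
        dst[i * stride] = src[i];
}
```

The copy loops are the overhead in question; choosing a layout in which the section is already contiguous makes them unnecessary.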

[Bug lto/87525] [7/8/9 Regression] infinite loop generated for fread() if enabling -flto and -D_FORTIFY_SOURCE=2

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87525

Jan Hubicka  changed:

   What|Removed |Added

Summary|infinite loop generated for |[7/8/9 Regression] infinite
   |fread() if enabling -flto   |loop generated for fread()
   |and -D_FORTIFY_SOURCE=2 |if enabling -flto and
   ||-D_FORTIFY_SOURCE=2

--- Comment #12 from Jan Hubicka  ---
I will take a look - symtab handling with multiple declarations is slippery :(.
I have added the regression marker because it definitely regresses wrt pre-LTO time.

[Bug c/85433] -fdiagnostics-color=auto doesn't work properly with LTO

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85433

Jan Hubicka  changed:

   What|Removed |Added

 CC||jason at redhat dot com
   Assignee|hubicka at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #3 from Jan Hubicka  ---
I looked into this more, and the problem is collect2, which creates the file.
This is done to handle rpo files: for those, compilation is repeated until all
needed templates have been instantiated. In that case error messages from
earlier runs are discarded, which is why we can't output them directly to the
terminal.

Obviously this is silly to do with LTO, but at the time we create the temporary
file we do not know whether we will use LTO, because the linker is called later.

Jason, I wonder whether RPO files are still relevant and whether there is
something we can do about this. I am unassigning myself since I do not know
what to do. However, in my tree I simply have a hack that disables the RPO
path, and this gets me nice colorful warnings (and they come when they are
found, not several minutes later).
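
As background on why a temporary file defeats -fdiagnostics-color=auto: the documented behaviour of `auto` is to use color only when standard error is a terminal. A minimal sketch of that kind of check on a POSIX system follows; `should_colorize` is a hypothetical name and this is not GCC's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Return nonzero only if the stream is attached to a terminal.  A
   stream redirected to a regular file or a pipe yields 0, so "auto"
   coloring would be switched off for it. */
int should_colorize(FILE *stream)
{
    return isatty(fileno(stream)) != 0;
}
```

When diagnostics are written to a temporary file first and only later copied to the terminal, a check like this sees the file rather than the terminal, so no color is emitted.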

[Bug lto/84044] Spurious -Wodr warning with -flto

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84044

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Jan Hubicka  ---
Fixed some time ago.

[Bug c++/81668] LTO ODR warnings are not helpful

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81668

--- Comment #13 from Jan Hubicka  ---
Warnings from comment #8 are fixed now. I would love to know if there are any
issues with what GCC 9 outputs.  We still can't track locations to the original
.o files though.

[Bug lto/66229] LTO fails with -fauto-profile on mcf

2019-01-06 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66229

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2019-01-06
   Assignee|hubicka at gcc dot gnu.org |unassigned at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Jan Hubicka  ---
Does this still fail with Bin's fixes?

[Bug c/88584] GCC thinks that the type is complete dispite shaddowing.

2019-01-06 Thread anders.granlund.0 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88584

--- Comment #7 from Anders Granlund  ---
(In reply to jos...@codesourcery.com from comment #6)
> This looks like a case that was missed in, or broken by, my fix for bug 
> 13801, which was supposed to address such cases of entities with different 
> (compatible) types in different scopes.  It seems GCC handled this 
> correctly (i.e. produced an error) in the 3.4 release series only.

Does this mean that we can change the status for this bug report from
UNCONFIRMED to NEW?

[Bug fortran/88658] [9 Regression] Intrinsic MAX1 returns a REAL result, should be INTEGER.

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88658

Thomas Koenig  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Thomas Koenig  ---
Fixed on trunk, closing.

[Bug fortran/88658] [9 Regression] Intrinsic MAX1 returns a REAL result, should be INTEGER.

2019-01-06 Thread tkoenig at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88658

--- Comment #1 from Thomas Koenig  ---
Author: tkoenig
Date: Sun Jan  6 12:48:58 2019
New Revision: 267609

URL: https://gcc.gnu.org/viewcvs?rev=267609&root=gcc&view=rev
Log:
2019-01-06  Thomas Koenig  

PR fortran/88658
* gfortran.h: Add macro gfc_real_4_kind
* simplify.c (simplify_min_max): Special case for the types of
AMAX0, AMIN0, MAX1 and MIN1, which actually change the types of
their arguments.

2019-01-06  Thomas Koenig  

PR fortran/88658
* gfortran.dg/min_max_type_2.f90: New test.


Added:
trunk/gcc/testsuite/gfortran.dg/min_max_type_2.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/gfortran.h
trunk/gcc/fortran/simplify.c
trunk/gcc/testsuite/ChangeLog
