[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-05 Thread elrodc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #3 from Chris Elrod  ---
Created attachment 45353
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45353&action=edit
g++ assembly output

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-05 Thread elrodc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #2 from Chris Elrod  ---
Created attachment 45352
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45352&action=edit
gfortran assembly output

[Bug fortran/88713] New: _gfortran_internal_pack@PLT prevents vectorization

2019-01-05 Thread elrodc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

Bug ID: 88713
   Summary: _gfortran_internal_pack@PLT prevents vectorization
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: elrodc at gmail dot com
  Target Milestone: ---

Created attachment 45350
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45350&action=edit
Fortran version of vectorization test.

I am attaching Fortran and C++ translations of a simple working example.

The C++ version is vectorized, while the Fortran version is not.

The code consists of two functions. One simply runs a for loop, calling the
other function.
The function is vectorizable across loop iterations. g++ does this
successfully.

However, gfortran does not, because it repacks data with
call _gfortran_internal_pack@PLT
so that it can no longer be vectorized across iterations.


I compiled with:

gfortran -Ofast -march=skylake-avx512 -mprefer-vector-width=512
-fno-semantic-interposition -shared -fPIC -S vectorization_test.cpp -o
gfortvectorization_test.s

g++ -Ofast -march=skylake-avx512 -mprefer-vector-width=512 -shared -fPIC -S
vectorization_test.cpp -o gppvectorization_test.s


LLVM (via flang and clang) successfully vectorizes both versions.
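
The attachments themselves are not reproduced in this archive. As a rough
illustration of the shape described above (one function that only loops,
calling a second function whose work is independent per iteration), here is a
minimal C sketch. The function names, the separate input/output arrays, and
the arithmetic are all assumptions made for illustration; this is not the
attached test case.

```c
#include <stddef.h>

/* Hypothetical per-iteration kernel: reads element i of each input
   array and writes element i of each output array. */
static inline void kernel(double *out0, double *out1,
                          const double *in0, const double *in1, size_t i)
{
    out0[i] = in0[i] * in0[i] + in1[i];
    out1[i] = in0[i] - in1[i] * in1[i];
}

/* Hypothetical driver: the iterations do not depend on each other, so
   once kernel() is inlined a compiler may vectorize across i. */
void process(double *restrict out0, double *restrict out1,
             const double *restrict in0, const double *restrict in1,
             size_t n)
{
    for (size_t i = 0; i < n; ++i)
        kernel(out0, out1, in0, in1, i);
}
```

Because the data stays in place in contiguous arrays, nothing here needs to
be copied or repacked before the loop body runs.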

[Bug fortran/88713] _gfortran_internal_pack@PLT prevents vectorization

2019-01-05 Thread elrodc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #1 from Chris Elrod  ---
Created attachment 45351
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45351&action=edit
C++ version of the vectorization test case.

[Bug c/81980] Spurious -Wmissing-format-attribute warning in 32-bit mode

2019-01-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81980

Eric Gallager  changed:

   What|Removed |Added

 CC||dmalcolm at gcc dot gnu.org,
   ||dodji at gcc dot gnu.org

--- Comment #3 from Eric Gallager  ---
cc-ing diagnostics maintainers

[Bug c++/80789] Better error for passing lambda with capture as function pointer

2019-01-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80789

--- Comment #2 from Eric Gallager  ---
not sure whether to cc the C++ FE maintainers or the diagnostics maintainers on
this...

[Bug c++/78502] Analyze 'final'/'override' even for uninstantiated class templates

2019-01-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78502

Eric Gallager  changed:

   What|Removed |Added

 CC||jason at redhat dot com,
   ||nathan at gcc dot gnu.org

--- Comment #2 from Eric Gallager  ---
since this might be an accepts-invalid for gcc (or a rejects-valid for clang)
I'm cc-ing the C++ FE maintainers for their interpretation of the standard.

[Bug rtl-optimization/63156] web can't handle AUTOINC correctly

2019-01-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63156

Eric Gallager  changed:

   What|Removed |Added

 CC||steven at gcc dot gnu.org
   Assignee|steven at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #12 from Eric Gallager  ---
(In reply to Eric Gallager from comment #11)
> (In reply to Steven Bosscher from comment #7)
> > (In reply to Carrot from comment #6)
> > > Since it is intentionally to remove flag DF_REF_READ_WRITE on use,
> > 
> > Ah, but I don't think that was the correct fix. The DEF and USE refs should
> > both have the flag set.
> 
> Are you still working on this?

Guess not; unassigning and moving to cc

[Bug libstdc++/88607] forward_list.h contains utf-8 charactor

2019-01-05 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88607

--- Comment #10 from Jonathan Wakely  ---
Author: redi
Date: Sun Jan  6 00:49:11 2019
New Revision: 267607

URL: https://gcc.gnu.org/viewcvs?rev=267607&root=gcc&view=rev
Log:
PR libstdc++/88607 add tests using -finput-charset=ascii

This verifies that the  header can be compiled with ASCII
as the input character set.

PR libstdc++/88607
* testsuite/17_intro/headers/c++1998/charset.cc: New test.
* testsuite/17_intro/headers/c++2011/charset.cc: New test.
* testsuite/17_intro/headers/c++2014/charset.cc: New test.
* testsuite/17_intro/headers/c++2017/charset.cc: New test.
* testsuite/17_intro/headers/c++2020/charset.cc: New test.

Added:
trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/charset.cc
trunk/libstdc++-v3/testsuite/17_intro/headers/c++2011/charset.cc
trunk/libstdc++-v3/testsuite/17_intro/headers/c++2014/charset.cc
trunk/libstdc++-v3/testsuite/17_intro/headers/c++2017/charset.cc
trunk/libstdc++-v3/testsuite/17_intro/headers/c++2020/charset.cc
Modified:
trunk/libstdc++-v3/ChangeLog

[Bug target/85048] [missed optimization] vector conversions

2019-01-05 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85048

Devin Hussey  changed:

   What|Removed |Added

 CC||husseydevin at gmail dot com

--- Comment #5 from Devin Hussey  ---
ARM/AArch64 NEON use these:

From            To             Intrinsic       ARMv7-a           AArch64
intXxY_t     -> int2XxY_t      vmovl_sX        vmovl.sX          sshll #0?
uintXxY_t    -> uint2XxY_t     vmovl_uX        vmovl.uX          ushll #0?
[u]int2XxY_t -> [u]intXxY_t    vmovn_[us]X     vmovn.iX          xtn
floatXxY_t   -> intXxY_t       vcvt[q]_sX_fX   vcvt.sX.fX        fcvtzs
floatXxY_t   -> uintXxY_t      vcvt[q]_uX_fX   vcvt.uX.fX        fcvtzu
intXxY_t     -> floatXxY_t     vcvt[q]_fX_sX   vcvt.fX.sX        scvtf
uintXxY_t    -> floatXxY_t     vcvt[q]_fX_uX   vcvt.fX.uX        ucvtf
float32x2_t  -> float64x2_t    vcvt_f32_f64    2x vcvt.f64.f32   fcvtl
float64x2_t  -> float32x2_t    vcvt_f64_f32    2x vcvt.f32.f64   fcvtn

Clang optimizes vmovl to vshll by zero for some reason. 

float32x2_t <-> float64x2_t requires 2 VFP instructions on ARMv7-a.

[Bug c/81871] bogus attribute alloc_align accepted

2019-01-05 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81871

Martin Sebor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0
  Known to fail||8.2.0

--- Comment #5 from Martin Sebor  ---
Looks like r266195 fixed it.

$ cat t.c && gcc -S t.c
void __attribute__ ((alloc_align (1))) f (int);

void* __attribute__ ((alloc_align (1))) g (void*);

t.c:1:1: warning: ‘alloc_align’ attribute ignored on a function returning
‘void’ [-Wattributes]
1 | void __attribute__ ((alloc_align (1))) f (int);
  | ^~~~
t.c:3:1: warning: ‘alloc_align’ attribute argument value ‘1’ refers to
parameter type ‘void *’ [-Wattributes]
3 | void* __attribute__ ((alloc_align (1))) g (void*);
  | ^~~~


It's being tested by gcc.dg/attr-alloc_align-4.c so the bug can be resolved. 
Thanks for the reminder!
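
For contrast with the two rejected declarations, here is a sketch of a
declaration the attribute is meant for: the function returns a pointer, and
the referenced parameter is an integer giving the alignment. The wrapper name
alloc_with_alignment is hypothetical, and the rounding step assumes the
alignment is a power of two.

```c
#include <stdlib.h>

/* alloc_align (1): parameter 1 is an integer that gives the alignment
   of the returned pointer, so the attribute is meaningful here. */
void *alloc_with_alignment(size_t alignment, size_t size)
    __attribute__ ((alloc_align (1)));

void *alloc_with_alignment(size_t alignment, size_t size)
{
    /* C11 aligned_alloc expects size to be a multiple of alignment;
       round up, assuming alignment is a power of two. */
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded);
}
```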

[Bug c/81871] bogus attribute alloc_align accepted

2019-01-05 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81871

--- Comment #4 from Eric Gallager  ---
(In reply to Martin Sebor from comment #3)
> Let me fix this.

Any progress?

[Bug libstdc++/77776] C++17 std::hypot implementation is poor

2019-01-05 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776

--- Comment #4 from Marc Glisse  ---
(In reply to Matthias Kretz from comment #3)
> Did you consider the error introduced by scaling with __amax? I made sure
> that the division is without error by zeroing the mantissa bits. Here's a
> motivating example that shows an error of 1 ulp otherwise:
> https://godbolt.org/z/_U2K7e

Your "reference" number seems strange. Why not do the computation with double
(or long double or mpfr) or use __builtin_hypotf? Note that it changes the
value.

How precise is hypot supposed to be? I know it is supposed to try and avoid
spurious overflow/underflow, but I am not convinced that it should aim for
correct rounding.

(I see that you are using clang in that godbolt link, with gcc I need to mark
the global variables with "extern const" to get a similar asm)

> About std::fma, how bad is the performance hit if there's no instruction for
> it?

FMA doesn't seem particularly relevant here.

[Bug target/88712] Optimization: mov edx, 0 not replaced with xor edx, edx in this case

2019-01-05 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88712

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64

--- Comment #1 from Andrew Pinski  ---
This is normally controlled by TARGET_USE_MOV0 but that seems like it is only
enabled for k6 and maybe size.

[Bug rtl-optimization/88712] New: Optimization: mov edx, 0 not replaced with xor edx, edx in this case

2019-01-05 Thread matt at godbolt dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88712

Bug ID: 88712
   Summary: Optimization: mov edx, 0 not replaced with xor edx,
edx in this case
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: matt at godbolt dot org
  Target Milestone: ---

The code: 

---snip
int func(int val, const int *ptr)
{
  int res = val + 1234;
  if (res == *ptr)
  {
    res = 0;
  }
  return res;
}
---

generates the following ASM on all versions of GCC back to 4.9.x:

---
func(int, int const*):
lea eax, [rdi+1234]
mov edx, 0
cmp DWORD PTR [rsi], eax
cmove   eax, edx
ret
---

The `mov edx, 0` is surprising to me. All the other compilers I tested (see
https://godbolt.org/z/Nt9pKp for more details) use the common `xor edx, edx`
(or `xor eax, eax`) idiom for zeroing edx.

Is this a missed optimization in the case of a cmov being generated, or am I
missing something subtle?

[Bug fortran/88710] [F08] Sourced allocation of array fails, yielding wrong bounds and result

2019-01-05 Thread c...@mnet-mail.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88710

--- Comment #4 from c...@mnet-mail.de ---
Thanks, this caught the bounds violation with the following output:

lbound/ubound(a):    -1    -1     1     2     1     1
lbound/ubound(b):    -1    -1     1     2     1     1
lbound/ubound(c):    -1    -1     1     2     1     1
lbound/ubound(t):     0     0     0     3     2     0
 a, b, c, t: 
At line 28 of file test_alloc.F90
Fortran runtime error: Index '1' of dimension 3 of array 't' above upper bound
of 0

Error termination. Backtrace:
#0  0x2b3018cf341a
#1  0x2b3018cf3f75
#2  0x2b3018cf4347
#3  0x403e97
#4  0x40400f
#5  0x2b301917182f
#6  0x4008d8
#7  0x

[Bug fortran/88710] [F08] Sourced allocation of array fails, yielding wrong bounds and result

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88710

--- Comment #3 from Dominique d'Humieres  ---
> For what it's worth, I have compiled the code also with '-Wall'
> and '-Warray-bounds' but both these options didn't give any warning.

The relevant option is -fcheck=bounds.

[Bug c++/85052] Implement support for clang's __builtin_convertvector

2019-01-05 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052

--- Comment #9 from Matthias Kretz  ---
(In reply to Devin Hussey from comment #7)
> Wait, silly me, this isn't about optimizations, this is about patterns.

Regarding optimizations, PR85048 is a first step (it lists all x86
single-instruction SIMD conversions). I also linked my library implementation
in #5, which provides optimizations for all cases on x86.

[Bug fortran/88710] [F08] Sourced allocation of array fails, yielding wrong bounds and result

2019-01-05 Thread c...@mnet-mail.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88710

--- Comment #2 from c...@mnet-mail.de ---
Yes, the said block accesses 't' outside its bounds (because
the returned bounds are wrong).

Thanks for mentioning this.

For what it's worth, I have compiled the code also with '-Wall'
and '-Warray-bounds' but both these options didn't give any warning.

[Bug libstdc++/77776] C++17 std::hypot implementation is poor

2019-01-05 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776

--- Comment #3 from Matthias Kretz  ---
Did you consider the error introduced by scaling with __amax? I made sure that
the division is without error by zeroing the mantissa bits. Here's a motivating
example that shows an error of 1 ulp otherwise: https://godbolt.org/z/_U2K7e

About std::fma, how bad is the performance hit if there's no instruction for
it?
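
The point about zeroing the mantissa bits can be illustrated in isolation:
clearing the mantissa of a finite, nonzero value leaves a power of two, and
dividing by a power of two only changes the exponent, so it is exact as long
as the result neither overflows nor underflows. The sketch below gets the
same power of two through frexp/ldexp rather than bit manipulation. The
function name pow2_scale is hypothetical, and this is not the code from the
godbolt link or from libstdc++.

```c
#include <math.h>

/* Largest power of two not exceeding |x|, for finite nonzero x.
   This is the value obtained by keeping the exponent of x and
   clearing its mantissa bits. */
double pow2_scale(double x)
{
    int e;
    frexp(x, &e);             /* |x| = m * 2^e with 0.5 <= m < 1 */
    return ldexp(1.0, e - 1); /* 2^(e-1) <= |x| < 2^e */
}
```

With such a scale s, a quotient like x / s can be multiplied back by s to
recover x bit for bit, which is not guaranteed when dividing by the unrounded
maximum.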

[Bug middle-end/87836] ICE in cc1 for gcc-6.5.0 with SPARC hardware

2019-01-05 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87836

--- Comment #27 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #26 from Gary Mills  ---
> I have no concerns about removal of gcc support for Solaris 10:  That is an

I've only mentioned it to make clear that the oldest version of Solaris as
(the native assembler) that's going to be tested will change quite a bit
once S10 support is gone, certainly to something much newer than the
snv_121 as in Illumos.

> obsolete operating system, after all.  illumos is equivalent to Solaris 11.

No, it's not: while it's certainly closer to S11 than S10, it has still
been quite a way from snv_147 (the last OpenSolaris build) to snv_175
(aka Solaris 11.0).  No need to tell me about OpenSolaris/Illumos, btw.:
I've been in the OpenSolaris Pilot from day one.

> gas is used for illumos compilers on x86.  It works on SPARC too, and avoids
> the ICE.  Unfortunately, gcc with gas can't be used to compile the SPARC
> kernel.  That's because some SPARC kernel files are written in assembler
> language.  These won't compile with gas, only with the native assembler.  It

It shouldn't be too hard to introduce make rules (or rather change cw)
to build them with as directly, even if gcc on SPARC starts using gas.
Hasn't this already been done for Illumos on x86?

Alternatively, you can always rewrite them to use gas syntax, and I
doubt that there are many as-specific constructs or directives in there:
it's low-level kernel code, after all.

> would be difficult, but not impossible, to use gcc with gas on SPARC hardware.

[Bug middle-end/87836] ICE in cc1 for gcc-6.5.0 with SPARC hardware

2019-01-05 Thread gary_mills at fastmail dot fm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87836

--- Comment #26 from Gary Mills  ---
I have no concerns about removal of gcc support for Solaris 10:  That is an
obsolete operating system, after all.  illumos is equivalent to Solaris 11.

gas is used for illumos compilers on x86.  It works on SPARC too, and avoids
the ICE.  Unfortunately, gcc with gas can't be used to compile the SPARC
kernel.  That's because some SPARC kernel files are written in assembler
language.  These won't compile with gas, only with the native assembler.  It
would be difficult, but not impossible, to use gcc with gas on SPARC hardware.

I've just attempted to build gcc-7.3.0 on SPARC with an even more restricted
configuration:

  $
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/configure
--without-gnu-ld --with-ld=/usr/bin/ld --without-gnu-as --with-as=/usr/bin/as

The compilers are not specified on the command line but they are in the
environment.  The compilers were identified correctly.

The build got considerably farther, but ended with this error:

libtool: compile: 
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/./gcc/xgcc
-shared-libgcc
-B/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/./gcc
-nostdinc++
-L/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/src
-L/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/src/.libs
-L/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/libsupc++/.libs
-B/usr/local/sparc-sun-solaris2.11/bin/ -B/usr/local/sparc-sun-solaris2.11/lib/
-isystem /usr/local/sparc-sun-solaris2.11/include -isystem
/usr/local/sparc-sun-solaris2.11/sys-include
-I/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/../libgcc
-I/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/include/sparc-sun-solaris2.11
-I/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/include
-I/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings
-Wcast-qual -Wabi -fdiagnostics-show-location=once -ffunction-sections
-fdata-sections -frandom-seed=new_opa.lo -g -O2 -std=gnu++1z -c
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++/new_opa.cc
 -fPIC -DPIC -D_GLIBCXX_SHARED -o new_opa.o
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++/new_opa.cc:
In function 'void* operator new(std::size_t, std::align_val_t)':
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++/new_opa.cc:103:33:
error: 'aligned_alloc' was not declared in this scope
   while (__builtin_expect ((p = aligned_alloc (align, sz)) == 0, false))
 ^
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++/new_opa.cc:103:33:
note: suggested alternative:
In file included from /usr/include/stdlib.h:39:0,
 from
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/include/cstdlib:75,
 from
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/include/stdlib.h:36,
 from
/export/home/mills/Downloads/code/oi-userland/components/developer/gcc-7/gcc-7.3.0/libstdc++-v3/libsupc++/new_opa.cc:27:
/usr/include/iso/stdlib_c11.h:60:14: note:   'std::aligned_alloc'
 extern void *aligned_alloc(size_t, size_t);
  ^
Makefile:936: recipe for target 'new_opa.lo' failed
make[6]: *** [new_opa.lo] Error 1
make[6]: Leaving directory
'/dpool/export/home/mills/Downloads/code/oi-userland-apr/components/developer/gcc-7/build/sparcv7/sparc-sun-solaris2.11/libstdc++-v3/libsupc++'

There is a patch which seems to fix this error:

--- gcc-7.1.0.orig/libstdc++-v3/libsupc++/new_opa.cc 2017-01-26 15:30:45.0 +0100
+++ gcc-7.1.0/libstdc++-v3/libsupc++/new_opa.cc 2017-05-04 17:16:25.920300456 +0200
@@ -31,7 +31,6 @@
 using std::new_handler;
 using std::bad_alloc;

-#if !_GLIBCXX_HAVE_ALIGNED_ALLOC
 #if _GLIBCXX_HAVE__ALIGNED_MALLOC
 #define aligned_alloc(al,sz) _aligned_malloc(sz,al)
 #elif _GLIBCXX_HAVE_POSIX_MEMALIGN
@@ -82,7 +81,6 @@
   return aligned_ptr;
 }
 #endif
-#endif

 _GLIBCXX_WEAK_DEFINITION void *
 operator new (std::size_t sz, std::align_val_t al)

I can't be certain that this patch does not have unwanted side 
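
The kind of fallback that the preprocessor branches in the patch context
choose between can be sketched in plain C. This is a minimal illustration
that assumes a POSIX system providing posix_memalign; the function name
fallback_aligned_alloc is hypothetical, and this is not the libsupc++ code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for aligned_alloc on a system that only offers
   posix_memalign. Returns NULL on failure. */
void *fallback_aligned_alloc(size_t al, size_t sz)
{
    void *ptr = NULL;

    /* posix_memalign requires the alignment to be a power of two and a
       multiple of sizeof(void *); raise small alignments to that minimum. */
    if (al < sizeof(void *))
        al = sizeof(void *);

    if (posix_memalign(&ptr, al, sz) != 0)
        return NULL;
    return ptr;
}
```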

[Bug ipa/88711] [regression 9.0] scan-ipa-dump inline "Inlined tp_sum/

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88711

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-05
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres  ---
Confirmed on darwin.

[Bug fortran/88710] [F08] Sourced allocation of array fails, yielding wrong bounds and result

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88710

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-05
 CC||burnus at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
The behavior has changed between revisions r265171 (2018-10-15)

lbound/ubound(a):    -1    -1     1     2     1     1
lbound/ubound(b):    -1    -1     1     2     1     1
lbound/ubound(c):    -1    -1     1     2     1     1
lbound/ubound(t):     0     0     0     3     2     0

and r265310 (2018-10-19)

lbound/ubound(a):    -1    -1     1     2     1     1
lbound/ubound(b):    -1    -1     1     2     1     1
lbound/ubound(c):    -1    -1     1     2     1     1
lbound/ubound(t):     1     1     1     4     3     1

likely r265212 (pr67125).

Note that the block

   do k = lbound(a,3), ubound(a,3)
      do j = lbound(a,2), ubound(a,2)
         do i = lbound(a,1), ubound(a,1)
            write(*,'(1p,4(e23.16,1x))') &
                 &   a(i,j,k), b(i,j,k), c(i,j,k), t(i,j,k)
         end do
      end do
   end do

accesses 't' outside its bounds in both cases.

[Bug ipa/88711] [regression 9.0] scan-ipa-dump inline "Inlined tp_sum/

2019-01-05 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88711

--- Comment #1 from kargl at gcc dot gnu.org ---

> The likely cause of this regression is
> 
> 
> r267600 | hubicka | 2019-01-05 09:47:34 -0800 (Sat, 05 Jan 2019) | 2 lines
> 
> * ipa-fnsummary.c (analyze_function_body): Fix accounting of time.

Definitely caused by r267600.  Verified by 'svn merge -r267600:267599 .'
to remove offending patch.

Perhaps, the scan line in the testcase needs to be adjusted?

[Bug c++/85052] Implement support for clang's __builtin_convertvector

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052

--- Comment #8 from Jakub Jelinek  ---
Note, I've posted in the meantime a newer version of the patch that should
handle the 2x narrowing or 2x widening cases better, see
https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00129.html

[Bug c++/85052] Implement support for clang's __builtin_convertvector

2019-01-05 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052

--- Comment #7 from Devin Hussey  ---
Wait, silly me, this isn't about optimizations, this is about patterns.

It does the same thing it was doing for this code:

typedef unsigned u32x2 __attribute__((vector_size(8)));
typedef unsigned long long u64x2 __attribute__((vector_size(16)));

u64x2 cvt(u32x2 in)
{
    return (u64x2) { (unsigned long long)in[0], (unsigned long long)in[1] };
}
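
The same element-by-element pattern can also be written with lane indexing in
a loop, which GCC's vector extensions allow in C. This is a sketch only; the
function name cvt_lanes is hypothetical, and nothing is claimed here about
which instructions a given GCC version selects for it.

```c
typedef unsigned u32x2 __attribute__((vector_size(8)));
typedef unsigned long long u64x2 __attribute__((vector_size(16)));

/* Widen each 32-bit lane to 64 bits, one lane at a time. */
u64x2 cvt_lanes(u32x2 in)
{
    u64x2 out;
    for (int i = 0; i < 2; ++i)
        out[i] = in[i];   /* zero-extends: the source lanes are unsigned */
    return out;
}
```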

[Bug c++/85052] Implement support for clang's __builtin_convertvector

2019-01-05 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052

--- Comment #6 from Devin Hussey  ---
The patch seems to be working.

typedef unsigned u32x2 __attribute__((vector_size(8)));
typedef unsigned long long u64x2 __attribute__((vector_size(16)));

u64x2 cvt(u32x2 in)
{
    return __builtin_convertvector(in, u64x2);
}

It doesn't generate the best code, but it isn't bad.

x86_64, SSE4.1:

cvt:
        movq    %xmm0, %rax
        movd    %eax, %xmm0
        shrq    $32, %rax
        pinsrq  $1, %rax, %xmm0
        ret

x86_64, SSE2:

cvt:
        movq    %xmm0, %rax
        movd    %eax, %xmm0
        shrq    $32, %rax
        movq    %rax, %xmm1
        punpcklqdq      %xmm1, %xmm0
        ret

ARMv7a NEON:

cvt:
        sub     sp, sp, #16
        mov     r3, #0
        str     r3, [sp, #4]
        str     r3, [sp, #12]
        add     r3, sp, #8
        vst1.32 {d0[0]}, [sp]
        vst1.32 {d0[1]}, [r3]
        vld1.64 {d0-d1}, [sp:64]
        add     sp, sp, #16
        bx      lr

I haven't built the others yet.

The correct code would be this ([signed|unsigned]):

cvt:
        vmovl.[s|u]32   q0, d0
        bx      lr

I am testing other targets now. 

For the reference, this is what clang generates for other targets:

aarch64:

cvt:
        [s|u]shll       v0.2d, v0.2s, #0
        ret

sse4.1/avx:

cvt:
        [v]pmov[s|z]xdq xmm0, xmm0
        ret

sse2:

signed_cvt:
        pxor    xmm1, xmm1
        pcmpgtd xmm1, xmm0
        punpckldq       xmm0, xmm1  # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
        ret

unsigned_cvt:
        xorps   xmm1, xmm1
        unpcklps        xmm0, xmm1  # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
        ret

[Bug ipa/88711] New: [regression 9.0] scan-ipa-dump inline "Inlined tp_sum/

2019-01-05 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88711

Bug ID: 88711
   Summary: [regression 9.0]  scan-ipa-dump inline "Inlined
tp_sum/
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kargl at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

A recent change (as in the last 12 hours) has introduced this regression
on x86_64-*-freebsd.

FAIL: gfortran.dg/pr79966.f90   -O   scan-ipa-dump inline "Inlined
tp_sum/[0-9]+ into runtptests/[0-9]+"

The likely cause of this regression is


r267600 | hubicka | 2019-01-05 09:47:34 -0800 (Sat, 05 Jan 2019) | 2 lines

* ipa-fnsummary.c (analyze_function_body): Fix accounting of time.

[Bug fortran/88710] New: [F08] Sourced allocation of array fails, yielding wrong bounds and result

2019-01-05 Thread c...@mnet-mail.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88710

Bug ID: 88710
   Summary: [F08] Sourced allocation of array fails, yielding
wrong bounds and result
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: c...@mnet-mail.de
  Target Milestone: ---

The following code shows that sourced allocation of an allocatable
array with gfortran 8.1.0 leads to wrong lower and upper bounds
that do not correspond to those of the source expression. 

Moreover, the initialized array therefore does not yield the correct
result expected from the value of the source expression.

$ cat test_alloc.F90 
program test_alloc

   implicit none

   integer(4) :: i, j, k
   real(8), dimension(:,:,:), allocatable :: a, b, c, t

   allocate( a(-1:2,-1:1,1:1) )
   allocate( b(-1:2,-1:1,1:1) )
   allocate( c(-1:2,-1:1,1:1) )

   a = 1.d0
   b = 2.d0
   c = 0.d0

   allocate(t, source = (a + (c - b)) )

   write(*,'(a,6(i5,1x))') 'lbound/ubound(a): ', lbound(a),  ubound(a)
   write(*,'(a,6(i5,1x))') 'lbound/ubound(b): ', lbound(b),  ubound(b)
   write(*,'(a,6(i5,1x))') 'lbound/ubound(c): ', lbound(c),  ubound(c)
   write(*,'(a,6(i5,1x))') 'lbound/ubound(t): ', lbound(t),  ubound(t)

   write(*,*) 'a, b, c, t: '
   do k = lbound(a,3), ubound(a,3)
      do j = lbound(a,2), ubound(a,2)
         do i = lbound(a,1), ubound(a,1)
            write(*,'(1p,4(e23.16,1x))') &
                 &   a(i,j,k), b(i,j,k), c(i,j,k), t(i,j,k)
         end do
      end do
   end do

end program test_alloc


Running this code with gfortran 8.1.0 gives the following output.
$ gfortran-8 test_alloc.F90 -o test.gfort; ./test.gfort 
lbound/ubound(a):    -1    -1     1     2     1     1
lbound/ubound(b):    -1    -1     1     2     1     1
lbound/ubound(c):    -1    -1     1     2     1     1
lbound/ubound(t):     0     0     0     3     2     0
 a, b, c, t: 
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00 
0.E+00
 1.E+00  2.E+00  0.E+00 
1.6304166312761136-322
 1.E+00  2.E+00  0.E+00 
1.0023829485142537E-95
 1.E+00  2.E+00  0.E+00 
3.4119363283543871-315
 1.E+00  2.E+00  0.E+00 
0.E+00
 1.E+00  2.E+00  0.E+00 
2.0716172530123468-320
 1.E+00  2.E+00  0.E+00 
9.6317959318370178-317


Both flang 6.0 and pgfortran 18.4-0 yield the following (correct) output 
(notice the different bounds for t, and its values printed in the last column):
$ flang test_alloc.F90 -o test.flang; ./test.flang
lbound/ubound(a):    -1    -1     1     2     1     1
lbound/ubound(b):    -1    -1     1     2     1     1
lbound/ubound(c):    -1    -1     1     2     1     1
lbound/ubound(t):    -1    -1     1     2     1     1
 a, b, c, t: 
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00
 1.E+00  2.E+00  0.E+00
-1.E+00

Gfortran version used is:
$ gfortran-8 -v
Using built-in specs.
COLLECT_GCC=gfortran-8
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1

[Bug fortran/88653] Is this a compiler bug?

2019-01-05 Thread mtekeev at yandex dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88653

--- Comment #14 from Murat Tekeev  ---
I will reinstall Cygwin and try the compilation again.
When I used version 7.3, everything worked.
After all, there are other compilers besides gfortran.

[Bug driver/88708] help-dummy.o file left behind

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88708

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
No idea why the documentation suggests the -c there, without it it works just
fine.
With -S too, it actually calls cc1 with -o help-dummy.s but doesn't actually
emit there anything into that file (nor, if it exists previously, removes it or
modifies it).
With -E it actually fails:
./xgcc -B ./ -E -Q -O --help=optimizers
cc1: fatal error: help-dummy: No such file or directory
compilation terminated.
I wonder if we shouldn't treat -E as -S, and -c as no -E/-S/-c, with these help
options, which is IMHO the best thing.  Without -E/-S/-c, cc1 is executed with,
say, -o /tmp/cc7Z9tXX.s but doesn't write that file, the assembler (as) is
executed with -o /tmp/cc4DJDCT.o /tmp/cc7Z9tXX.s, and all the temporary files
are removed afterwards.

[Bug target/88706] [og8, nvptx, openacc] Inconsistencies when vector length set using vector_length clause or fopenacc-dim

2019-01-05 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88706

--- Comment #1 from Tom de Vries  ---
(In reply to Tom de Vries from comment #0)
> I think the same problem exists for the other work around in
> nvptx_adjust_parallelism, this one:
> ...
>   /* FIXME: This is overly conservative; worker and vector loop will
> 
>  eventually be combined.  */
>   if (wv)
> return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
> ...
> It's just harder to spot because the workaround doesn't affect vector length.

Confirmed.

With this additional patch:
...
@@ -5695,7 +5696,10 @@ nvptx_adjust_parallelism (unsigned inner_mask, unsigned outer_mask)
   /* FIXME: This is overly conservative; worker and vector loop will
      eventually be combined.  */
   if (wv)
-    return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
+    {
+      fprintf (stderr, "worker-vector loop workaround applied in %s\n", current_function_name ());
+      return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
+    }

   /* It's difficult to guarantee that warps in large vector_lengths
  will remain convergent when a vector loop is nested inside a
...

we see for the first case (vector_length set on parallel directive, no
-fopenacc-dim=):
...
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
worker-vector loop workaround applied in test2._omp_fn.1
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...

and for the second case (no vector_length set on parallel directive, using
-fopenacc-dim=):
...
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...
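
For readers unfamiliar with the mask arithmetic in the workaround discussed in
this comment: the return statement clears the worker bit from inner_mask. A
minimal standalone sketch follows. The DIM_* enum and the DIM_MASK macro are
local stand-ins written for this illustration (assumed to follow the
gang/worker/vector ordering of GCC's GOMP_DIM_* constants), and drop_worker is
a hypothetical helper, not a function in nvptx.c.

```c
#include <assert.h>

/* Stand-ins for illustration only; assumed to mirror the gang/worker/vector
   ordering used by GCC's GOMP_DIM_* constants.  */
enum dim { DIM_GANG = 0, DIM_WORKER = 1, DIM_VECTOR = 2, DIM_MAX = 3 };
#define DIM_MASK(X) (1u << (X))

/* Hypothetical helper: remove worker partitioning from a loop's mask,
   leaving every other axis untouched.  */
unsigned
drop_worker (unsigned inner_mask)
{
  assert (inner_mask < DIM_MASK (DIM_MAX));
  return inner_mask & ~DIM_MASK (DIM_WORKER);
}
```

Under these assumptions a worker-vector loop (mask 0x6) becomes a vector-only
loop (mask 0x4), while a mask without the worker bit is returned unchanged.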

[Bug fortran/85855] [7/8/9 Regression] (Maybe) uninitialized descriptor fields of an allocatable array component of a function result

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85855

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #7 from Dominique d'Humieres  ---
> I'm seeing the same behavior on GCC 7.3; this looks to be a duplicate
> of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77504 .

I agree.

*** This bug has been marked as a duplicate of bug 77504 ***

[Bug fortran/77504] "is used uninitialized" with allocatable string and array constructors

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77504

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vladimir.fuka at gmail dot com

--- Comment #8 from Dominique d'Humieres  ---
*** Bug 85855 has been marked as a duplicate of this bug. ***

[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
Bug 24639 depends on bug 85855, which changed state.

Bug 85855 Summary: [7/8/9 Regression] (Maybe) uninitialized descriptor fields 
of an allocatable array component of a function result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85855

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

[Bug fortran/85855] [7/8/9 Regression] (Maybe) uninitialized descriptor fields of an allocatable array component of a function result

2019-01-05 Thread johnsonsr at ornl dot gov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85855

Seth Johnson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |johnsonsr at ornl dot gov

--- Comment #6 from Seth Johnson  ---
I'm seeing the same behavior on GCC 7.3; this looks to be a duplicate of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77504 .

[Bug tree-optimization/88709] Improve store-merging

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |redi at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Compared to the first testcase, we do handle
struct S { char buf[8]; };
void bar (struct S *);

void
foo (void)
{
  struct S s;
  int a = 0;
  __builtin_memcpy (&s.buf[4], &a, sizeof (int));
  s.buf[0] = 5;
  s.buf[1] = 2;
  s.buf[2] = 3;
  s.buf[3] = 2;
  s.buf[5] = 7;
  bar (&s);
}

though, because the store is in that case MEM[ + 4B] = {} and thus valid for
lhs.

[Bug fortran/88009] [9 Regression] ICE in find_intrinsic_vtab, at fortran/class.c:2761

2019-01-05 Thread janus at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88009

janus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from janus at gcc dot gnu.org ---
Fixed with r267598. Closing.

[Bug tree-optimization/88709] New: Improve store-merging

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

Bug ID: 88709
   Summary: Improve store-merging
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

As shown in:
struct S { char buf[8]; };
void bar (struct S *);

void
foo (void)
{
  struct S s = {};
  s.buf[1] = 1;
  s.buf[3] = 2;
  bar (&s);
}

or

struct val_t
{
  char data[16];
};

void optimize_me (val_t);
void optimize_me3 (val_t, val_t, val_t);

void
good ()
{
  optimize_me ({ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 });
}

void
bad ()
{
  optimize_me ({ 1, 2, 3, 4, 5 });
}

void
why ()
{
  optimize_me ({ 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 });
}

void
srsly ()
{
  optimize_me3 ({ 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
                { 11, 12, 13, 14, 15, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 },
                { 21, 22, 23, 24, 25, 20, 20, 20, 10, 20, 20, 20, 20, 20, 20 });
}

void
srsly_not_one_missing ()
{
  optimize_me3 ({ 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
                { 11, 12, 13, 14, 15, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 },
                { 21, 22, 23, 24, 25, 20, 20, 20, 10, 20, 20, 20, 20, 20, 20, 11 });
}

there is room for improvement in store-merging.  In the first testcase, we
ignore the clearing because of !lhs_valid_for_store_merging_p: the lhs is in
that case the whole VAR_DECL rather than a component of it.  In the second
testcase, we sometimes punt for the same reason and sometimes because
rhs_valid_for_store_merging_p is false.  These = {} storage clearings
(or perhaps even __builtin_memset calls) are something we could handle, though
with extra care: we don't want to take apart those clears if doing so doesn't
reduce the number of stores needed.
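
To make the first testcase concrete, here is a hand-written sketch of what a
merged store could look like. It assumes a little-endian target with 8-bit
chars; fill_separately and fill_merged are names invented for this
illustration, and nothing here claims to match the code GCC actually emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S { char buf[8]; };

/* The stores as written in the testcase: clear, then two byte stores.  */
void
fill_separately (struct S *s)
{
  assert (s != NULL);
  memset (s, 0, sizeof *s);
  s->buf[1] = 1;
  s->buf[3] = 2;
}

/* The same bytes written as one 8-byte store.  Little-endian assumption:
   byte 1 holds 0x01 and byte 3 holds 0x02, all other bytes are zero.  */
void
fill_merged (struct S *s)
{
  assert (s != NULL);
  const uint64_t merged = UINT64_C (0x0000000002000100);
  memcpy (s->buf, &merged, sizeof merged);
}
```

Both functions leave the same eight bytes in the struct; the second does so
with a single store instead of a clear followed by two byte writes.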

[Bug fortran/88009] [9 Regression] ICE in find_intrinsic_vtab, at fortran/class.c:2761

2019-01-05 Thread janus at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88009

--- Comment #4 from janus at gcc dot gnu.org ---
Author: janus
Date: Sat Jan  5 14:32:12 2019
New Revision: 267598

URL: https://gcc.gnu.org/viewcvs?rev=267598&root=gcc&view=rev
Log:
2019-01-05  Janus Weil  

PR fortran/88009
* class.c (gfc_find_derived_vtab): Mark the _final component as
artificial.
(find_intrinsic_vtab): Ditto. Also add an extra check to avoid
dereferencing a null pointer and adjust indentation.
* resolve.c (resolve_fl_variable): Add extra check to avoid
dereferencing a null pointer. Move variable declarations to local
scope.
(resolve_fl_procedure): Add extra check to avoid dereferencing a null
pointer.
* symbol.c (check_conflict): Suppress errors for artificial symbols.

2019-01-05  Janus Weil  

PR fortran/88009
* gfortran.dg/blockdata_10.f90: New test case.

Added:
trunk/gcc/testsuite/gfortran.dg/blockdata_10.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/class.c
trunk/gcc/fortran/resolve.c
trunk/gcc/fortran/symbol.c
trunk/gcc/testsuite/ChangeLog

[Bug c/88698] Relax generic vector conversions

2019-01-05 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698

--- Comment #10 from Devin Hussey  ---
Well what about a special type attribute or some kind of transparent_union like
thing for Intel's types? It seems that Intel's intrinsics are the main (only)
platform that uses generic types.

[Bug fortran/88653] Is this a compiler bug?

2019-01-05 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88653

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #13 from Thomas Koenig  ---
I checked this with the exact same version on Cygwin, no
errors detected.

So, this looks like an installation or hardware problem. Could
you maybe re-install the compiler?

[Bug driver/88708] New: help-dummy.o file left behind

2019-01-05 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88708

Bug ID: 88708
   Summary: help-dummy.o file left behind
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

When using

  gcc -c -Q -O --help=optimizers

the driver leaves behind the help-dummy.o file.  This happens with gcc trunk
and all prior versions I was able to test.

[Bug fortran/88632] [F08] function contained in module invisible to submodule unless declared public

2019-01-05 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88632

Paul Thomas  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #2 from Paul Thomas  ---
Created attachment 45349
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45349&action=edit
A provisional patch that fixes the problem

The attached fixes this but causes regressions:

FAIL: gfortran.dg/module_private_2.f90   -O   scan-tree-dump-times optimized "priv" 0
FAIL: gfortran.dg/public_private_module_7.f90   -O   scan-assembler-not __m_common_attrs_MOD_other
FAIL: gfortran.dg/public_private_module_8.f90   -O   scan-assembler-not __m_MOD_myotherlen
FAIL: gfortran.dg/public_private_module_2.f90   -O   scan-assembler-not two
FAIL: gfortran.dg/public_private_module_2.f90   -O   scan-assembler-not six
FAIL: gfortran.dg/warn_unused_function_2.f90   -O   (test for warnings, line 16)

I think that this is best dealt with by extending the patch by flagging the
module as having a module function/subroutine, which implies that there is a
submodule somewhere, and making all the module procedures TREE_PUBLIC. That
will suppress the above regressions.

Otherwise, I will have to find some way of persuading the linker to find the
symbol from the submodule.

First I must get the C-interop patch out of the way and then I will come back
to this PR.

Paul

[Bug libgomp/88707] Random failures of libgomp.c++/task-reduction-(8|10).C on x86_64-apple-darwin18

2019-01-05 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88707

--- Comment #2 from Iain Sandoe  ---
(on Darwin17 I had a recent build)

I find that a built exe fails quite often; here's a sample of the hung program
(it appears deadlocked, not consuming any CPU).

The correct libraries are being loaded.

Sampling process 23844 for 3 seconds with 1 millisecond of run time between
samples
Sampling completed, processing symbols...
Analysis of sampling task-reduction-10.exe (pid 23844) every 1 millisecond
Process: task-reduction-10.exe [23844]
Path:            /Volumes/scratch/10-13-his/gcc-trunk-gcc/x86_64-apple-darwin17/libgomp/testsuite/task-reduction-10.exe
Load Address:0x10835
Identifier:  task-reduction-10.exe
Version: 0
Code Type:   X86-64
Parent Process:  bash [34246]

Date/Time:   2019-01-05 12:53:30.784 +
Launch Time: 2019-01-05 12:52:40.943 +
OS Version:  Mac OS X 10.13.6 (17G4015)
Report Version:  7
Analysis Tool:   /usr/bin/sample

Physical footprint: 568K
Physical footprint (peak):  576K


Call graph:
2799 Thread_58184392   DispatchQueue_1: com.apple.main-thread  (serial)
+ 2799 ???  (in )  [0x7f9679c02718]
+   2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+     2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+       2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+         2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184395
+ 2799 ???  (in )  [0x2060]
+   2799 ???  (in )  [0x7f9679c02cb8]
+     2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+       2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+         2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+           2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184397
+ 2799 ???  (in )  [0x2060]
+   2799 ???  (in )  [0x7f9679c030b8]
+     2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+       2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+         2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+           2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184398
+ 2799 ???  (in )  [0x2060]
+   2799 ???  (in )  [0x7f9679c032b8]
+     2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+       2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+         2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+           2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184399
+ 2799 ???  (in )  [0x2060]
+   2799 ???  (in )  [0x7f9679c034b8]
+     2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+       2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+         2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+           2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184400
+ 2799 ???  (in )  [0x2060]
+   2799 ???  (in )  [0x7f9679d00118]
+     2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
+       2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
+         2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
+           2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]
2799 Thread_58184401
  2799 ???  (in )  [0x2060]
    2799 ???  (in )  [0x7f9679d00318]
      2799 gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
        2799 gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
          2799 _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
            2799 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff534aea16]

Total number in stack (recursive counted multiple, when >=5):
7   __psynch_cvwait  (in libsystem_kernel.dylib) + 0  [0x7fff534aea0c]
7   _pthread_cond_wait  (in libsystem_pthread.dylib) + 732  [0x7fff53677589]
7   gomp_barrier_wait_end  (in libgomp.1.dylib) + 86  [0x10862d606]  bar.c:92
7   gomp_sem_wait  (in libgomp.1.dylib) + 40  [0x10862d488]  sem.c:71
6   ???  (in )  [0x2060]

Sort by top of stack, same collapsed (when >= 5):
__psynch_cvwait  (in libsystem_kernel.dylib)        19593

[Bug target/60563] FAIL: g++.dg/ext/sync-4.C on *-apple-darwin*

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60563

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #20 from Dominique d'Humieres  ---
Silenced on trunk and release branches, closing.

The test will XPASS once the ld problem is fixed, and the darwin hack can
then be removed.

[Bug libgomp/88707] Random failures of libgomp.c++/task-reduction-(8|10).C on x86_64-apple-darwin18

2019-01-05 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88707

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-01-05
     Ever confirmed|0                           |1

--- Comment #1 from Iain Sandoe  ---
Looking through my last set of results, the first occurrence I see is for
Darwin16 (OSX 10.12), but since this is a random failure, that might be
inconclusive.

[Bug target/60563] FAIL: g++.dg/ext/sync-4.C on *-apple-darwin*

2019-01-05 Thread dominiq at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60563

--- Comment #19 from dominiq at gcc dot gnu.org ---
Author: dominiq
Date: Sat Jan  5 12:44:12 2019
New Revision: 267597

URL: https://gcc.gnu.org/viewcvs?rev=267597&root=gcc&view=rev
Log:
2019-01-05  Dominique d'Humieres  

PR target/60563
* g++.dg/ext/sync-4.C: Add dg-xfail-run-if for darwin.


Modified:
branches/gcc-7-branch/gcc/testsuite/ChangeLog
branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/sync-4.C

[Bug libgomp/88707] New: Random failures of libgomp.c++/task-reduction-(8|10).C on x86_64-apple-darwin18

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88707

Bug ID: 88707
   Summary: Random failures of libgomp.c++/task-reduction-(8|10).C
on x86_64-apple-darwin18
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dominiq at lps dot ens.fr
CC: iains at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-apple-darwin18
Target: x86_64-apple-darwin18
 Build: x86_64-apple-darwin18

On x86_64-apple-darwin18 I see

WARNING: program timed out.
FAIL: libgomp.c++/task-reduction-10.C execution test
WARNING: program timed out.
FAIL: libgomp.c++/task-reduction-8.C execution test

since they were introduced at revision r265930.

Not only do the tests randomly time out for -m32 or -m64, but I also have to
kill the executable manually. I don't see the problem on darwin 10.

[Bug middle-end/82564] ICE at -O1 and above: in assign_stack_temp_for_type, at function.c:783

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82564

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #4 from Jakub Jelinek  ---
Fixed on the trunk.

[Bug target/88620] [7/8 Regression] ICE in assign_stack_temp_for_type, at function.c:837

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88620

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[7/8/9 Regression] ICE in   |[7/8 Regression] ICE in
                   |assign_stack_temp_for_type, |assign_stack_temp_for_type,
                   |at function.c:837           |at function.c:837

--- Comment #5 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug fortran/88653] Is this a compiler bug?

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88653

--- Comment #12 from Dominique d'Humieres  ---
It seems that the problem comes from your installation.

Did you build gfortran yourself or did you get it from some binary
distribution? If the latter, from where? Did you report the problem to them?

Is this the first time you have used gfortran? If not, what was the last
working version?

> A list of files that failed to compile.

Does this mean that the other files compile and run?

[Bug target/60563] FAIL: g++.dg/ext/sync-4.C on *-apple-darwin*

2019-01-05 Thread dominiq at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60563

--- Comment #18 from dominiq at gcc dot gnu.org ---
Author: dominiq
Date: Sat Jan  5 11:17:40 2019
New Revision: 267596

URL: https://gcc.gnu.org/viewcvs?rev=267596&root=gcc&view=rev
Log:
2019-01-05  Dominique d'Humieres  

PR target/60563
* g++.dg/ext/sync-4.C: Add dg-xfail-run-if for darwin.


Modified:
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/sync-4.C

[Bug target/88706] New: [og8, nvptx, openacc] Inconsistencies when vector length set using vector_length clause or fopenacc-dim

2019-01-05 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88706

Bug ID: 88706
   Summary: [og8, nvptx, openacc] Inconsistencies when vector
length set using vector_length clause or fopenacc-dim
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Consider libgomp testcase vred2d-128.c (posted partially here):
...
gentest (test1, "acc parallel loop gang vector_length (128)",
 "acc loop vector reduction(+:t1) reduction(-:t2)")

gentest (test2, "acc parallel loop gang vector_length (128)",
 "acc loop worker vector reduction(+:t1) reduction(-:t2)")

gentest (test3, "acc parallel loop gang worker vector_length (128)",
 "acc loop vector reduction(+:t1) reduction(-:t2)")

gentest (test4, "acc parallel loop",
 "acc loop reduction(+:t1) reduction(-:t2)")
...

The resulting front-end attributes are:
...
$ grep -A1 __attribute__ vred2d-128.c.088t.fixup_cfg4
__attribute__((oacc function (, , 128), omp target entrypoint))
test1._omp_fn.0 (long int * t2, long int * t1, int[1] * a2, int[1] * a1)
--
__attribute__((oacc function (, , 128), omp target entrypoint))
test2._omp_fn.1 (long int * t2, long int * t1, int[1] * a2, int[1] * a1)
--
__attribute__((oacc function (, , 128), omp target entrypoint))
test3._omp_fn.2 (long int * t2, long int * t1, int[1] * a2, int[1] * a1)
--
__attribute__((oacc function (, , ), omp target entrypoint))
test4._omp_fn.3 (long int * t2, long int * t1, int[1] * a2, int[1] * a1)
...

When we compile at -O2 and grep for the resulting dimensions, we have:
...
$ grep FUNC_MAP vred2d-128.s
//:FUNC_MAP "test1$_omp_fn$0", 0, 0x1, 0x80
//:FUNC_MAP "test2$_omp_fn$1", 0, 0x1, 0x80
//:FUNC_MAP "test3$_omp_fn$2", 0, 0, 0x20
//:FUNC_MAP "test4$_omp_fn$3", 0, 0, 0x20
...

Note that the vector length for test3 has been downgraded by the
-mno-long-vector-in-workers workaround.

Now if we remove the hardcoded vector_length (128) from test1, test2 and test3,
and add -fopenacc-dim=::128, we have instead:
...
//:FUNC_MAP "test1$_omp_fn$0", 0, 0x1, 0x80
//:FUNC_MAP "test2$_omp_fn$1", 0, 0, 0x80
//:FUNC_MAP "test3$_omp_fn$2", 0, 0, 0x80
//:FUNC_MAP "test4$_omp_fn$3", 0, 0, 0x80
...

The change on test4 is expected.

But the change on test3 is unexpected. It should not matter whether we set the
vector length on the parallel directive, or using -fopenacc-dim, the effect of
-mno-long-vector-in-workers should be the same.
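
The expectation stated above can be written down as a small model. This is a
sketch of the expected behaviour only, not GCC code: effective_vector_length
and its parameters are hypothetical, and it assumes that the workaround simply
falls back to the warp size (32, which is the 0x20 seen in the FUNC_MAP lines)
whenever a longer vector would be combined with worker partitioning.

```c
#include <assert.h>

enum { WARP_SIZE = 32 };

/* Hypothetical model.  REQUESTED is the vector length asked for, whether by
   the vector_length clause or by -fopenacc-dim; USES_WORKERS says whether
   the offloaded region also uses worker partitioning.  Where REQUESTED came
   from is deliberately not a parameter: it should not affect the result.  */
int
effective_vector_length (int requested, int uses_workers)
{
  assert (requested > 0);
  if (uses_workers && requested > WARP_SIZE)
    return WARP_SIZE;
  return requested;
}
```

In this model a gang-worker region with an inner vector loop, like test3, ends
up with a vector length of 32 in both scenarios.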

The cause for this can be seen by adding this print statement:
...
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 110dbffe0d0..5aab6db169f 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5688,6 +5688,7 @@ nvptx_adjust_parallelism (unsigned inner_mask, unsigned outer_mask)
   offload_attrs oa;

   populate_offload_attrs (&oa);
+  fprintf (stderr, "oa.vector_length in nvptx_adjust_parallelism: %d\n", oa.vector_length);

   if (oa.vector_length == PTX_WARP_SIZE)
 return inner_mask;
...

If we have the first case (vector_length set on parallel directive, no
-fopenacc-dim=), we have:
...
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...

But in the second case (no vector_length set on parallel directive, using
-fopenacc-dim=), we have:
...
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...

I think the same problem exists for the other work around in
nvptx_adjust_parallelism, this 

[Bug target/88620] [7/8/9 Regression] ICE in assign_stack_temp_for_type, at function.c:837

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88620

--- Comment #4 from Jakub Jelinek  ---
Author: jakub
Date: Sat Jan  5 11:14:12 2019
New Revision: 267595

URL: https://gcc.gnu.org/viewcvs?rev=267595&root=gcc&view=rev
Log:
PR middle-end/82564
PR target/88620
* expr.c (expand_assignment): For calls returning VLA structures
if to_rtx is not a MEM, force it into a stack temporary.

* gcc.dg/nested-func-12.c: New test.
* gcc.c-torture/compile/pr82564.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr82564.c
trunk/gcc/testsuite/gcc.dg/nested-func-12.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/82564] ICE at -O1 and above: in assign_stack_temp_for_type, at function.c:783

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82564

--- Comment #3 from Jakub Jelinek  ---
Author: jakub
Date: Sat Jan  5 11:14:12 2019
New Revision: 267595

URL: https://gcc.gnu.org/viewcvs?rev=267595&root=gcc&view=rev
Log:
PR middle-end/82564
PR target/88620
* expr.c (expand_assignment): For calls returning VLA structures
if to_rtx is not a MEM, force it into a stack temporary.

* gcc.dg/nested-func-12.c: New test.
* gcc.c-torture/compile/pr82564.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr82564.c
trunk/gcc/testsuite/gcc.dg/nested-func-12.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/testsuite/ChangeLog

[Bug c/88698] Relax generic vector conversions

2019-01-05 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698

--- Comment #9 from Marc Glisse  ---
(In reply to Devin Hussey from comment #2)
> What I am saying is that I think -flax-vector-conversions should be default,
> or we should only have minimal warnings instead of errors.
> 
> That will make generic vectors much easier to use.

And more confusing / error-prone, there is a compromise.

> typedef uint32_t u32x4 __attribute__((vector_size(16)));
> 
> u32x4 shift(u32x4 val)
> {
> return _mm_srli_epi32(val, 15);
> }

Indeed, when calling an intrinsic, it could make sense to allow other vector
types of the same size. Or would you expect the same behavior if you were
calling your own function instead of _mm_srli_epi32?

> 3. Cast. Good lord, if you thought intrinsics were ugly, this will change
> your mind:
> 
> return (u32x4)_mm_srli_epi32((__m128i)val, 15);

It isn't that bad. First, if you only use intrinsics, you shouldn't define
u32x4, then you only have __m128i, __m128 and __m128d, fewer conversions are
needed. Then, if you do define u32x4, you can rewrite that as

  return val >> 15;
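
For context, the rewrite suggested here relies on the generic vector
extension, where shift and arithmetic operators apply element-wise. A minimal
sketch, assuming a compiler that supports the vector_size attribute (GCC or
Clang); the typedef and the function name shift come from the quoted example:

```c
#include <stdint.h>

typedef uint32_t u32x4 __attribute__((vector_size(16)));

/* Logical right shift of each 32-bit lane by 15.  No intrinsic is called
   and no cast to or from __m128i is needed.  */
u32x4
shift (u32x4 val)
{
  return val >> 15;
}
```

Because the element type is unsigned, the shift is a logical one, which is
what _mm_srli_epi32 computes as well.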

> This is the second issue: unsigned long and unsigned int are the same size
> and should have no issues converting between each other.

We could special case this. But note that in C/C++, we don't consider int and
long as the same type just because they have the same size, and reinterpreting
int* as long* violates strict aliasing.
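
That int and long stay distinct types even where their sizes agree can be
checked with C11 _Generic. In this sketch TYPE_KIND and the KIND_* values are
made up for the illustration:

```c
/* Hypothetical tags for the illustration.  */
enum kind { KIND_OTHER = 0, KIND_INT = 1, KIND_LONG = 2 };

/* _Generic selects on the type of the expression, never on its size.  */
#define TYPE_KIND(X) \
  _Generic ((X), int: KIND_INT, long: KIND_LONG, default: KIND_OTHER)

/* Yields KIND_LONG on every target, including those where
   sizeof (long) == sizeof (int).  */
int
kind_of_long_zero (void)
{
  return TYPE_KIND (0L);
}
```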

> typedef unsigned u32x4 __attribute__((vector_size(16)));
> typedef unsigned long long u64x2 __attribute__((vector_size(16)));
> 
> u64x2 cast(u32x4 val)
> {
> return val;
> }
> 
> 
> This should emit a warning without a cast. I would recommend an error, but
> Clang without -Wvector-conversion accepts this without any complaining.

At some point it isn't easy to have a different behavior for an implicit
conversion in different contexts. Should the intrinsics be marked with some
magic flag that asks to be lax about their arguments?
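
For reference, an explicit cast between generic vector types of the same size
reinterprets the bits rather than converting element values. A short sketch
using the typedefs from the quoted example, assuming a 32-bit unsigned and a
compiler with the vector_size attribute (the function name cast_bits is
invented here):

```c
typedef unsigned u32x4 __attribute__((vector_size(16)));
typedef unsigned long long u64x2 __attribute__((vector_size(16)));

_Static_assert (sizeof (u32x4) == sizeof (u64x2),
                "a vector cast needs source and target of equal size");

/* Same 16 bytes, viewed as two 64-bit lanes instead of four 32-bit lanes.  */
u64x2
cast_bits (u32x4 val)
{
  return (u64x2) val;
}
```

With the explicit cast the intent is visible at the call site, which is the
property the implicit conversion would give up.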


(In reply to Devin Hussey from comment #5)
> Clang even allows this:
> 
> #include <arm_neon.h>
> 
> uint32x4_t mult(uint16x8_t top, uint32x4_t bot)
> {
> return top * bot;
> }

We clearly don't want that...

[Bug debug/88635] [8 Regression] Assembler error when building with "-g -O2 -m32"

2019-01-05 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88635

--- Comment #5 from Jakub Jelinek  ---
Author: jakub
Date: Sat Jan  5 11:12:35 2019
New Revision: 267594

URL: https://gcc.gnu.org/viewcvs?rev=267594&root=gcc&view=rev
Log:
PR debug/88635
* dwarf2out.c (const_ok_for_output_1): Reject MINUS that contains
SYMBOL_REF, CODE_LABEL or UNSPEC in subexpressions of second argument.
Reject PLUS that contains SYMBOL_REF, CODE_LABEL or UNSPEC in
subexpressions of both operands.
(mem_loc_descriptor): Handle UNSPEC if target hook acks it and all the
subrtxes are CONSTANT_P.
* config/i386/i386.c (ix86_const_not_ok_for_debug_p): Revert
2018-11-09 changes.

* gcc.dg/debug/dwarf2/pr88635.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/debug/dwarf2/pr88635.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/dwarf2out.c
trunk/gcc/testsuite/ChangeLog

[Bug target/60563] FAIL: g++.dg/ext/sync-4.C on *-apple-darwin*

2019-01-05 Thread dominiq at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60563

--- Comment #17 from dominiq at gcc dot gnu.org ---
Author: dominiq
Date: Sat Jan  5 11:09:11 2019
New Revision: 267593

URL: https://gcc.gnu.org/viewcvs?rev=267593&root=gcc&view=rev
Log:
2019-01-05  Dominique d'Humieres  

PR target/60563
Missing PR entry in the previous commit.


Modified:
trunk/gcc/testsuite/ChangeLog

[Bug target/88638] [9 Regression] FAIL: *string-format-1.* on darwin

2019-01-05 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88638

--- Comment #3 from Dominique d'Humieres  ---
> I submitted the patch below for review.  Dominique, if you have
> an opportunity to test it on Darwin and let me know if there are
> any outstanding problems that would be great.
> https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00181.html

The patch for c-family/c-attribs.c no longer applies due to revision r267591: I
used

--- ../_clean/gcc/c-family/c-attribs.c  2019-01-05 05:45:01.0 +0100
+++ gcc/c-family/c-attribs.c2019-01-05 06:04:49.0 +0100
@@ -632,16 +632,12 @@ positional_argument (const_tree fntype, 
}

   bool type_match;
-  if (code == STRING_CST && POINTER_TYPE_P (argtype))
-   {
- /* Where the expected code is STRING_CST accept any pointer
-to a narrow character type, qualified or otherwise.  */
- tree type = TREE_TYPE (argtype);
- type = TYPE_MAIN_VARIANT (type);
- type_match = (type == char_type_node
-   || type == signed_char_type_node
-   || type == unsigned_char_type_node);
-   }
+  if (code == STRING_CST)
+   /* Where the expected code is STRING_CST accept any pointer
+  expected by attribute format (this includes possibly qualified
+  char pointers and, for targets like Darwin, also pointers to
+  struct CFString).  */
+   type_match = valid_format_string_type_p (argtype);
   else if (code == INTEGER_TYPE)
/* For integers, accept enums, wide characters and other types
   that match INTEGRAL_TYPE_P except for bool.  */
@@ -652,6 +648,21 @@ positional_argument (const_tree fntype, 

   if (!type_match)
{
+ if (code == STRING_CST)
+   {
+ /* Reject invalid format strings with an error.  */
+ if (argno < 1)
+   error ("%qE attribute argument value %qE refers to "
+  "parameter type %qT",
+  atname, pos, argtype);
+ else
+   error ("%qE attribute argument %i value %qE refers to "
+  "parameter type %qT",
+  atname, argno, pos, argtype);
+
+ return NULL_TREE;
+   }
+
  if (argno < 1)
warning (OPT_Wattributes,
 "%qE attribute argument value %qE refers to "

A quick test showed that it fixed the reported failures.
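
For context, the check being patched validates the positional arguments of attributes such as format. Below is a minimal sketch of a declaration that the check accepts, because the position refers to a char pointer parameter. The function name format_into and its signature are hypothetical, chosen only for illustration; they do not come from the patch or the testsuite.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: parameter 3 is the format string and the
   variadic arguments start at position 4.  The attribute's position 3
   must refer to a valid format string type (here: const char *).  */
int format_into(char *buf, size_t n, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int format_into(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int len = vsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return len;
}

/* Small self-check of the helper above.  */
void check_format_into(void)
{
    char buf[32];
    int len = format_into(buf, sizeof buf, "%d-%s", 7, "x");
    assert(len == 3);
    assert(buf[0] == '7' && buf[1] == '-' && buf[2] == 'x' && buf[3] == '\0');
}
```

If the position pointed at a parameter of some other type (for example the size_t parameter), that is the case the patch above turns into a hard error for STRING_CST.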

[Bug c/88698] Relax generic vector conversions

2019-01-05 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698

--- Comment #8 from Andrew Pinski  ---
(In reply to Devin Hussey from comment #7)
> I mean, sure, but how about this?
> 
> What about meeting in the middle?

The problem is how do you implement the rules that are required by both the
Altivec and Neon programming manuals?  Do you treat those types differently? 
And then what about the generic vector types to/from the Altivec/Neon types? 
How do you want to have those handled?

Basically GCC was trying to follow what the Altivec (VMX) PEM says with respect
to the types and their casting.

For reference of the Altivec PEM:
https://www.nxp.com/docs/en/reference-manual/ALTIVECPEM.pdf

Have you read the Altivec PEM?  The GCC vector extension is/was modeled mostly
after the Altivec PEM, with a few additions afterwards (like operators and
conditionals).

Here is the patch which added vector_size:
https://gcc.gnu.org/ml/gcc-patches/2001-12/msg00379.html

Here is the patch that made it in which added the operators:
https://gcc.gnu.org/ml/gcc/2002-05/msg02234.html

Notice that this patch has the following test:
+ v4si a, b;
..
+ uv4si f;
...
+   f = a; /* { dg-error "incompatible types in assignment" } */

As mentioned in the thread which added -flax-vector-conversions, it was an
accident that some versions of GCC accepted the assignment without the cast.
Somehow the testcase got lost.
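
A minimal sketch of the rule that test encodes, reusing the v4si and uv4si names from the test above. The typedefs and the helper names are assumptions written out for illustration; this is not the original testcase.

```c
#include <assert.h>

typedef int          v4si  __attribute__((vector_size(16)));
typedef unsigned int uv4si __attribute__((vector_size(16)));

/* The explicit cast reinterprets the 128 bits; lane count and lane
   width stay the same.  Writing "return a;" without the cast is the
   kind of assignment the dg-error line above expects to be rejected.  */
uv4si as_unsigned(v4si a)
{
    return (uv4si)a;
}

/* Returns lane i of the reinterpreted vector.  */
unsigned int lane_as_unsigned(v4si a, int i)
{
    uv4si f = as_unsigned(a);
    return f[i];
}

/* Small self-check: -1 keeps its bit pattern, so it reads back as
   0xFFFFFFFF in the unsigned lane.  */
void check_vector_cast(void)
{
    v4si a = {-1, 0, 1, 2};
    assert(lane_as_unsigned(a, 0) == 0xFFFFFFFFu);
    assert(lane_as_unsigned(a, 3) == 2u);
}
```

The cast makes the change of lane type visible at the point where it happens, which is the behavior the Altivec PEM rules ask for.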

[Bug c/88698] Relax generic vector conversions

2019-01-05 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698

--- Comment #7 from Devin Hussey  ---
I mean, sure, but how about this?

What about meeting in the middle?

-fno-lax-vector-conversions generates errors like it does now.
-flax-vector-conversions shuts GCC up.
No flag causes warnings on -Wpedantic or -Wvector-conversion.

If we really want to enforce the standard, we should also add a pedantic
warning for when we use overloads on intrinsic types without -std=gnu*.
-Wgnu-vector-extensions or something:

warning:
{
   arithmetic operators |
   logical operators |
   array subscripts |
   initializer lists
}
on vector types are a GNU extension

I feel that the weird promotion rules Clang uses should be an error, and
assignment to different types should warn without a cast.
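
For illustration, here is a short sketch that uses each of the four constructs listed in the proposed warning. The type name u32x4 and the function name are assumptions, and -Wgnu-vector-extensions is only the flag name suggested in this comment, not an existing option as far as this thread shows.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32x4 __attribute__((vector_size(16)));

/* Uses every construct named in the proposed warning.  */
uint32_t demo_vector_ops(void)
{
    u32x4 a = {1, 2, 3, 4};        /* initializer list */
    u32x4 b = {10, 20, 30, 40};
    u32x4 sum = a + b;             /* arithmetic operator, lane-wise */
    u32x4 low = sum & (u32x4){0xF, 0xF, 0xF, 0xF};  /* logical operator */
    return low[0] + low[1] + low[2] + low[3];       /* array subscripts */
}

/* Small self-check: the lanes are 11, 6, 1 and 12 after masking.  */
void check_vector_ops(void)
{
    assert(demo_vector_ops() == 30u);
}
```

None of these lines needs an intrinsic header, which is why the proposal treats them as GNU extensions on the vector types themselves.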

[Bug ipa/88702] [6/7/8 regression] We do terrible job optimizing IsHTMLWhitespace from Firefox

2019-01-05 Thread hubicka at ucw dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88702

--- Comment #4 from Jan Hubicka  ---
> The only pass that can do anything about this (at least right now) is reassoc
> (both 1 and 2), which is too late for inlining.  So, either teach fnsplit not to
> separate multiple if comparisons of the same variable against constants, or
> schedule reassoc or just the maybe_optimize_range_tests part thereof in some
> early pass.

Yep, I also found out about reassoc.
Teaching fnsplit to pattern match this is just a partial solution - we
would still miscalculate size of function body for functions like this
(which indeed look quite common). I will experiment with early reassoc.

I partly debugged what happens later. Because the code is compiled with -O2,
the estimated growth is positive for both inlines, and the functions are not
declared inline, we won't inline them.
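
To make the pattern concrete, here is a sketch of a chain of comparisons of one variable against constants, next to a hand-written bit-test form of roughly the kind a range-test optimization aims for. The constants are the HTML whitespace code points (tab, LF, FF, CR, space); that choice and the function names are assumptions for illustration, and the second function is not GCC's literal output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Chain of comparisons of the same variable against constants.  */
bool is_ws_chain(uint32_t c)
{
    return c == 0x09 || c == 0x0A || c == 0x0C || c == 0x0D || c == 0x20;
}

/* Hand-written equivalent: one range check plus one bit test.  */
bool is_ws_mask(uint32_t c)
{
    const uint64_t mask = (1ULL << 0x09) | (1ULL << 0x0A) | (1ULL << 0x0C)
                        | (1ULL << 0x0D) | (1ULL << 0x20);
    /* The range check keeps the shift count below 64.  */
    return c <= 0x20 && ((mask >> c) & 1);
}

/* Small self-check: both forms agree on every 16-bit input.  */
void check_ws_equivalence(void)
{
    for (uint32_t c = 0; c < 0x10000; c++)
        assert(is_ws_chain(c) == is_ws_mask(c));
}
```

The second form is much smaller, which is why estimating the body size before such a rewrite overstates the cost of inlining.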

[Bug c/88698] Relax generic vector conversions

2019-01-05 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88698

Alexander Monakov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #6 from Alexander Monakov  ---
My recommendation is to use a union like below; this allows writing code using
both generic vectors and intrinsics without casts, and having each operation
show exactly what lane types it operates on:

typedef unsigned char  u8v  __attribute__((vector_size(16)));
typedef unsigned short u16v __attribute__((vector_size(16)));
typedef unsigned int   u32v __attribute__((vector_size(16)));

typedef union {
u8v   u8;
u16v  u16;
u32v  u32;
__m128i m;
} uv;

Example use:

uv x, t, lo_nib, hi_nib;

memcpy(&x, ptr, sizeof x);
t.u32 = x.u32 >> 4;
lo_nib.u8 = x.u8 & 15;
hi_nib.u8 = t.u8 & 15;
lo_nib.m  = _mm_shuffle_epi8(lut.m, lo_nib.m);
hi_nib.m  = _mm_shuffle_epi8(lut.m, hi_nib.m);

This also allows writing 256-bit and 128-bit versions together when appropriate
(with help of extra macros for using the right intrinsic function).

Would you like to see the documentation mention this pattern?
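
A self-contained sketch of the nibble-splitting part of the example, under some assumptions: the union is reduced to generic vector members only (the __m128i member and the _mm_shuffle_epi8 lookups are left out so that it builds without x86 headers), and the function name split_nibbles and its output arrays are hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char u8v  __attribute__((vector_size(16)));
typedef unsigned int  u32v __attribute__((vector_size(16)));

/* Reduced version of the union: generic vector views only.  */
typedef union {
    u8v  u8;
    u32v u32;
} uv;

/* Splits each of the 16 input bytes into its low and high nibble.  */
void split_nibbles(const unsigned char *ptr,
                   unsigned char lo[16], unsigned char hi[16])
{
    uv x, t, lo_nib, hi_nib;

    memcpy(&x, ptr, sizeof x);
    t.u32 = x.u32 >> 4;      /* 32-bit lane shift; bits cross byte borders */
    lo_nib.u8 = x.u8 & 15;
    hi_nib.u8 = t.u8 & 15;   /* the mask drops the bits that crossed over */
    memcpy(lo, &lo_nib, sizeof lo_nib);
    memcpy(hi, &hi_nib, sizeof hi_nib);
}

/* Small self-check: byte i is built from high nibble i, low nibble 15 - i.  */
void check_split_nibbles(void)
{
    unsigned char in[16], lo[16], hi[16];
    for (int i = 0; i < 16; i++)
        in[i] = (unsigned char)(i * 16 + (15 - i));
    split_nibbles(in, lo, hi);
    for (int i = 0; i < 16; i++) {
        assert(lo[i] == 15 - i);
        assert(hi[i] == i);
    }
}
```

Each statement names the view it operates on (.u32 for the shift, .u8 for the masks), so the lane type of every operation is visible without any cast.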