[Bug target/92950] Wrong load instructions emitted for movv1qi

2019-12-15 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92950

Andreas Krebbel  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||s390x-ibm-linux
   Priority|P3  |P2
   Host||s390x-ibm-linux
  Build||s390x-ibm-linux

[Bug target/92950] New: Wrong load instructions emitted for movv1qi

2019-12-15 Thread krebbel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92950

Bug ID: 92950
   Summary: Wrong load instructions emitted for movv1qi
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

The following testcase aborts when compiled with -O3 -march=z13 on IBM Z:

struct a {
  int b;
  char c;
};
struct a d = {1, 16};
struct a *e = &d;

int f = 0;

int main() {
  struct a g = {0, 0 };
  f = 0;

  for (; f <= 1; f++) {
    g = d;
    *e = g;
  }

  if (d.c != 16)
    __builtin_abort();
}

The movv1qi pattern emits halfword load instructions instead of character
loads.
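
To illustrate why the load width matters on a big-endian target such as s390x, here is a small model in plain C. It is only a sketch: the helper names are made up for this example, and it makes no claim about what the s390 backend actually emits beyond what the report states. It only shows that a 16-bit load at a byte's address also pulls in the neighbouring byte, and that on a big-endian machine the wanted byte ends up in the high half.

```c
#include <assert.h>
#include <stdint.h>

/* What a character (8-bit) load at p would produce. */
static uint8_t load_byte(const uint8_t *p)
{
    return p[0];
}

/* Model of a big-endian halfword (16-bit) load at p:
   p[0] becomes the high byte, p[1] the low byte. */
static uint16_t load_halfword_be(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Hypothetical demo: the wanted byte is 16, its neighbour is 0. */
static void demo(void)
{
    const uint8_t mem[2] = { 16, 0 };

    assert(load_byte(mem) == 16);
    assert(load_halfword_be(mem) == 0x1000);
    /* Using only the low 8 bits of the halfword yields the neighbour. */
    assert((uint8_t)load_halfword_be(mem) == 0);
}
```

Calling `demo()` exercises all three cases; none of the assertions should fire.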

[Bug c/77992] Provide feature to initialize padding bytes to avoid information leaks

2019-12-15 Thread mine260309 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

Lei YU  changed:

   What|Removed |Added

 CC||mine260309 at gmail dot com

--- Comment #18 from Lei YU  ---
I hit the exact same problem with GCC 7.4.0 and 9.2.1, and found that:
* When all fields are initialized, the padding bytes are not initialized;
* When at least one field is not given an initial value, the padding bytes are
initialized to 0.

See below code snippet:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct struct_with_padding {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};
int main()
{
    struct struct_with_padding s;
    memset(&s, 0xff, sizeof(s));
    s = (struct struct_with_padding) {
        .a = 0xaaaaaaaa,
        .b = 0xbbbbbbbbbbbbbbbb,
#ifdef SHOW_GCC_BUG
        .c = 0xdddddddd,
#else
        //.c = 0xdddddddd,
#endif
    };

    uint8_t* p8 = (uint8_t*)(&s);

    printf("data: ");
    for (size_t i = 0; i < sizeof(s); ++i)
    {
        printf("0x%02x ", p8[i]);
    }
    printf("\n");
    return 0;
}


With `.c = 0xdddddddd`, the example output is:

data: 0xaa 0xaa 0xaa 0xaa 0xff 0xff 0xff 0xff 0xbb 0xbb 0xbb 0xbb 0xbb 0xbb
0xbb 0xbb 0xdd 0xdd 0xdd 0xdd 0xff 0xff 0xff 0xff

Without that, the example output:

data: 0xaa 0xaa 0xaa 0xaa 0x00 0x00 0x00 0x00 0xbb 0xbb 0xbb 0xbb 0xbb 0xbb
0xbb 0xbb 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00


So it looks like GCC could be improved by initializing the padding bytes in
both cases?
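
Until the compiler offers such a feature, a common workaround is to clear the whole object and then store each member individually instead of assigning a compound literal. The sketch below assumes the same struct as above; `init_zero_padded` and `padding_bytes` are hypothetical helper names. Note that the C standard still leaves padding bytes unspecified after a store to a member, so this relies on typical compiler behaviour rather than on a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct struct_with_padding {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};

/* Number of bytes in the struct that belong to no named member. */
static size_t padding_bytes(void)
{
    return sizeof(struct struct_with_padding)
         - (sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t));
}

/* Hypothetical helper: zero the whole object, padding included, then
   store the members one by one rather than via a compound literal. */
static void init_zero_padded(struct struct_with_padding *s,
                             uint32_t a, uint64_t b, uint32_t c)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    s->a = a;
    s->b = b;
    s->c = c;
}
```

For example, `init_zero_padded(&s, 0xaaaaaaaa, 0xbbbbbbbbbbbbbbbb, 0xdddddddd)` fills the three members, and `padding_bytes()` reports how many bytes of `s` are padding on the current ABI.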

[Bug tree-optimization/92949] bswap/store merging does not handle BIT_INSERT_EXPR/BIT_FIELD_REF

2019-12-15 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92949

--- Comment #4 from Andrew Pinski  ---
Created attachment 47502
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47502&action=edit
one testcase

This is just one example expanded; note scan-tree-dump-times needed to be
updated as I added a few functions to it.

[Bug tree-optimization/92949] bswap/store merging does not handle BIT_INSERT_EXPR/BIT_FIELD_REF

2019-12-15 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92949

Andrew Pinski  changed:

   What|Removed |Added

Summary|bswap/store merging does not handle BIT_INSERT_EXPR |bswap/store merging does not handle BIT_INSERT_EXPR/BIT_FIELD_REF

--- Comment #3 from Andrew Pinski  ---
It does not handle BIT_FIELD_REF either, which can be seen if you make
maybe_expand_lhs in my patch just return false.  That is, we then only lower
loads and not stores.

[Bug lto/48200] Implement function attribute for symbol versioning (.symver)

2019-12-15 Thread xry111 at mengyan1223 dot wang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48200

--- Comment #41 from Xi Ruoyao  ---
(In reply to Jan Hubicka from comment #40)
> I posted initial patch here
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01334.html

I applied it to gcc-9.2.0 and it works.  But, unfortunately, the problem with
LTO and symver is not fixed.

A simple testcase fails:

$ cat foo.c
__attribute__ ((__symver__ ("foo@VERS_1")))
int foo_v1 (void)
{
return 1;
}

__attribute__ ((__symver__ ("foo@@VERS_2")))
int foo_v2 (void)
{
return 2;
}
$ cat version.map
VERS_1 {
global:
foo;
local:
*;
};

VERS_2 {
} VERS_1;
$ gcc foo.c -flto -Wl,--version-script -Wl,version.map -shared -Wl,--as-needed
--save-temp
$ cat foo.res
1
foo.o 4
211 93dd820662070d19 PREVAILING_DEF_IRONLY foo_v1
213 93dd820662070d19 PREVAILING_DEF_IRONLY foo@VERS_1
222 93dd820662070d19 PREVAILING_DEF_IRONLY foo_v2
224 93dd820662070d19 PREVAILING_DEF_IRONLY foo@@VERS_2
$ grep symver cc*.s || echo "no symver"
no symver

[Bug tree-optimization/92949] bswap/store merging does not handle BIT_INSERT_EXPR

2019-12-15 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92949

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> Created attachment 47501 [details]
> The current bit-field lowering patch

I attached the current bit-field lowering patch so that anyone who wants to
work on this can use it for testing.
NOTE on x86_64 and/or aarch64, you need to change SLOW_BYTE_ACCESS to be 1. 
The documentation and implementation for SLOW_BYTE_ACCESS only deals with
bit-fields and actually makes worse code when defined to be 0 :).  I think
there is another thread about that already.

[Bug tree-optimization/92949] bswap/store merging does not handle BIT_INSERT_EXPR

2019-12-15 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92949

--- Comment #1 from Andrew Pinski  ---
Created attachment 47501
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47501&action=edit
The current bit-field lowering patch

[Bug libfortran/90374] Fortran 2018: Support d0.d, e0.d, es0.d, en0.d, g0.d and ew.d e0 edit descriptors for output

2019-12-15 Thread jvdelisle at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90374

--- Comment #20 from Jerry DeLisle  ---
While working on this I found another issue:

program test
  implicit none
  real(8) :: rn
  character(32) :: afmt, aresult
  rn = 0.000314e8_8
  write (*,fmt="(E0.8e0, a3)") rn, "<<<"
end

$ gfc c10.f90 
$ ./a.out 
At line 6 of file c10.f90 (unit = 6, file = 'stdout')
Fortran runtime error: Expected REAL for item 2 in formatted transfer, got
CHARACTER
(E0.8e0, a3)
 ^

This is a more serious error since I am parsing wrong here.

[Bug rtl-optimization/92591] ICE in optimize_sc, at modulo-sched.c:1063

2019-12-15 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92591

--- Comment #8 from Arseny Solokha  ---
Is there a backport pending, or can this PR be closed?

[Bug ipa/92794] [10 Regression] ICE in decide_about_value, at ipa-cp.c:5186

2019-12-15 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92794

--- Comment #4 from Arseny Solokha  ---
% powerpc-e300c3-linux-gnu-gcc-10.0.0-alpha20191208 -v
Using built-in specs.
COLLECT_GCC=powerpc-e300c3-linux-gnu-gcc-10.0.0-alpha20191208
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/lto-wrapper
Target: powerpc-e300c3-linux-gnu
Configured with:
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-10.0.0_alpha20191208/work/gcc-10-20191208/configure
--host=x86_64-pc-linux-gnu --target=powerpc-e300c3-linux-gnu
--build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/powerpc-e300c3-linux-gnu/gcc-bin/10.0.0-alpha20191208
--includedir=/usr/lib/gcc/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/include
--datadir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208
--mandir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/man
--infodir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/include/g++-v10
--with-python-dir=/share/gcc-data/powerpc-e300c3-linux-gnu/10.0.0-alpha20191208/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --disable-nls --enable-checking=yes
--disable-esp --enable-libstdcxx-time --enable-poison-system-directories
--with-sysroot=/usr/powerpc-e300c3-linux-gnu --disable-bootstrap
--enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-altivec
--disable-fixed-point --enable-targets=all --enable-libgomp
--disable-libmudflap --disable-libssp --disable-systemtap
--disable-vtable-verify --disable-libvtv --enable-lto --with-isl
--disable-isl-version-check --disable-libsanitizer
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.0-alpha20191208 20191208 (experimental) (GCC)

[Bug ipa/92794] [10 Regression] ICE in decide_about_value, at ipa-cp.c:5186

2019-12-15 Thread fxue at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92794

--- Comment #3 from fxue at gcc dot gnu.org ---
What's the configure option for 32-bit big-endian powerpc?

[Bug c++/92947] Parenthesized aggregate initialization doesn't work with the library types it's supposed to work with

2019-12-15 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92947

--- Comment #1 from Ville Voutilainen  ---
Based on some debugging work, build_functional_cast_1 looks like a plausible
place where we might need to add understanding of parenthesized aggregates. The
previous bug report has an incomplete (because it's not c++2a-conditionalized)
patch for constructible_expr that looks like it takes care of
__is_constructible.

I entertained just making build_special_member_call do all this. That's not
quite straightforward because it wants a vector, and the build_aggr_init and
its friends want a tree.

[Bug target/92946] [9 Regression] [SH] Native GCC crashes when invoking with -m4 -m4-nofpu -pipe

2019-12-15 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92946

--- Comment #2 from John Paul Adrian Glaubitz  ---
Filed as: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946792

[Bug target/92946] [9 Regression] [SH] Native GCC crashes when invoking with -m4 -m4-nofpu -pipe

2019-12-15 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92946

--- Comment #1 from John Paul Adrian Glaubitz  ---
Michael Karcher has figured out that this might be a bug in Debian's gcc-9, in
particular the patch
https://sources.debian.org/src/gcc-9/9.2.1-21/debian/patches/gcc-search-prefixed-as-ld.diff.

CC'ing Matthias Klose.

[Bug tree-optimization/92949] New: bswap/store merging does not handle BIT_INSERT_EXPR

2019-12-15 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92949

Bug ID: 92949
   Summary: bswap/store merging does not handle BIT_INSERT_EXPR
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

While working on lowering bit-field accesses (to allow better optimizations on
the tree level rather than just on the RTL level), I found that the bswap/store
merging passes don't handle BIT_INSERT_EXPR, so some transformations are
currently missed.
I could not understand how symbolic_number works, so I am filing this bug.
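
For readers unfamiliar with what the bswap pass looks for, here is the plain-integer form of the idiom: a byte swap spelled out with shifts and masks, next to GCC's `__builtin_bswap32`. This is only a sketch and not the attached testcase; the function names are made up. The report is about the same kind of pattern no longer being recognized once bit-field accesses are lowered to BIT_INSERT_EXPR.

```c
#include <assert.h>
#include <stdint.h>

/* Byte swap written out by hand with shifts and masks. */
static uint32_t swap32_manual(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* The same operation via the GCC builtin. */
static uint32_t swap32_builtin(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Self-check: both forms must agree for any input. */
static void check_swap(uint32_t x)
{
    assert(swap32_manual(x) == swap32_builtin(x));
}
```

With a value such as `0x11223344`, both functions return `0x44332211`.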

[Bug c++/92947] Parenthesized aggregate initialization doesn't work with the library types it's supposed to work with

2019-12-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92947

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-12-15
 CC||mpolacek at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug c++/92948] New: internal compiler error: in tsubst_copy, at cp/pt.c:15788

2019-12-15 Thread piotrsiupa at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92948

Bug ID: 92948
   Summary: internal compiler error: in tsubst_copy, at
cp/pt.c:15788
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: piotrsiupa at gmail dot com
  Target Milestone: ---

Created attachment 47500
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47500&action=edit
Minimal and complete example

There is a bug in the experimental feature "non-type template parameters of
class type" in c++2a.

class Aaa
{
public:
constexpr Aaa(const int) {}
};
template
class Bbb_
{
public:
using ZZZ = unsigned;
};
template
using Bbb = Bbb_;
template::ZZZ>
int foo()
{
return 0;
}

The error is:

./crash-the-compiler.cpp: In substitution of 'template using Bbb =
Bbb_<((const Aaa)AAA)> [with Aaa AAA = ((Aaa*)(void)0)->Aaa::Aaa(XXX)]':
./crash-the-compiler.cpp:17:50:   required from here
./crash-the-compiler.cpp:15:18: internal compiler error: in tsubst_copy, at
cp/pt.c:15788
   15 | using Bbb = Bbb_;
  |  ^~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.

The bug seems to be pretty consistent on different versions of GCC.
I've found it in 9.2.0 but I can reproduce it in 9.1.0 and every 10.0.0 version
that I've found on https://godbolt.org/.
Even avr-g++ has the same exact problem.

[Bug c++/92947] New: Parenthesized aggregate initialization doesn't work with the library types it's supposed to work with

2019-12-15 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92947

Bug ID: 92947
   Summary: Parenthesized aggregate initialization doesn't work
with the library types it's supposed to work with
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ville.voutilainen at gmail dot com
  Target Milestone: ---

struct aggressive_aggregate
{
int a;
int b;
};

int main()
{
static_assert(__is_constructible(aggressive_aggregate, int, int)); // fails
decltype(aggressive_aggregate(1,2)) foo; // ill-formed
bool b = noexcept(aggressive_aggregate(1,2)); // ill-formed
}

All of those things should work. The __is_constructible should be true,
and the decltype and noexcept should be well-formed.

[Bug target/92946] New: [9 Regression] [SH] Native GCC crashes when invoking with -m4 -m4-nofpu -pipe

2019-12-15 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92946

Bug ID: 92946
   Summary: [9 Regression] [SH] Native GCC crashes when invoking
with -m4 -m4-nofpu -pipe
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: glaubitz at physik dot fu-berlin.de
CC: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de,
jrtc27 at jrtc27 dot com, oleg.e...@t-online.de
  Target Milestone: ---
Target: sh*-*-*

In Debian, kernel builds have been failing for a while now on sh4 due to gcc-9
crashing when run natively on sh4:

root@tirpitz:~> gcc-9 -m4 -m4-nofpu -pipe -c -x c /dev/null
malloc(): corrupted top size
Aborted
root@tirpitz:~>

See
https://buildd.debian.org/status/fetch.php?pkg=linux&arch=sh4&ver=5.3.15-1&stamp=1575738446&raw=0
for a log file.

The issue does not reproduce on earlier GCC versions:

root@tirpitz:~> gcc-5 -m4 -m4-nofpu -pipe -c -x c /dev/null
root@tirpitz:~> gcc-6 -m4 -m4-nofpu -pipe -c -x c /dev/null
root@tirpitz:~> gcc-7 -m4 -m4-nofpu -pipe -c -x c /dev/null
root@tirpitz:~> gcc-8 -m4 -m4-nofpu -pipe -c -x c /dev/null
root@tirpitz:~>

[Bug fortran/91651] [F03] Implement KIND argument for INDEX

2019-12-15 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91651

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-12-15
 CC||tkoenig at gcc dot gnu.org
Summary|ICE in gfc_trans_assignment_1, at fortran/trans-expr.c:11010 |[F03] Implement KIND argument for INDEX
 Ever confirmed|0   |1

--- Comment #2 from Thomas Koenig  ---
INDEX is actually an elemental function (I had to look that up
to make sure), and it has had a KIND argument since F2003.

[Bug fortran/87103] [OOP] ICE in gfc_new_symbol() due to overlong symbol name

2019-12-15 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87103

Thomas Koenig  changed:

   What|Removed |Added

 CC||andreas at skeidsvoll dot no

--- Comment #5 from Thomas Koenig  ---
*** Bug 91773 has been marked as a duplicate of this bug. ***

[Bug fortran/91773] Buffer overflow for long module/submodule names

2019-12-15 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91773

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||tkoenig at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #2 from Thomas Koenig  ---
Let's mark it as such, then.

*** This bug has been marked as a duplicate of bug 87103 ***

[Bug target/91534] some defined builtins are not usable

2019-12-15 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91534

Bill Schmidt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||wschmidt at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #2 from Bill Schmidt  ---
For clarity, many of these interfaces are only used internally as part of
mappings from overloaded builtins to builtins for a specific set of vector type
arguments.  Ultimately the interface that the user sees will be something like
vec_madd.  These internal tables are not intended to be a source of all
possible interfaces that users can access.

Accepted vector interfaces are defined in Appendix A of the Power ELF v2 ABI. 
Better documentation of them is in progress and should become available in
1H2020.  Overhauling the whole Power-specific builtin system is on my list for
GCC 11 if I can make the time.

[Bug fortran/92913] Add argument-mismatch check for INTERFACE for non-module procedures in the same file

2019-12-15 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92913

Thomas Koenig  changed:

   What|Removed |Added

  Attachment #47483 is obsolete|0   |1

--- Comment #3 from Thomas Koenig  ---
Created attachment 47499
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47499&action=edit
Patch that tries to use gfc_compare_interfaces

... but there seem to be too many special cases, and there
is also double reporting of errors, leading to regressions.

Maybe this is not the way.

[Bug c++/92878] Parenthesized aggregate initialization doesn't work in a new-expression

2019-12-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92878

--- Comment #10 from Marek Polacek  ---
(In reply to Ville Voutilainen from comment #7)
> __is_constructible is incorrectly false for such an aggregate:
> 
> struct aggressive_aggregate
> {
> int a;
> int b;
> };
> 
> int main()
> {
> static_assert(__is_constructible(aggressive_aggregate, int, int));
> }
> 
> Do you want a new bug report, or should this be handled as a follow-up on
> this one?

Please open a new PR.

[Bug middle-end/92945] -O2 -floop-nest-optimize crashes gcc in isl_basic_map_underlying_set ()

2019-12-15 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92945

Arseny Solokha  changed:

   What|Removed |Added

 CC||asolokha at gmx dot com

--- Comment #3 from Arseny Solokha  ---
It looks like a duplicate of PR90004, or at least PR90004 comment 2.

[Bug target/89597] Inconsistent vector calling convention on windows with Clang and MSVC

2019-12-15 Thread agner at agner dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89597

Agner Fog  changed:

   What|Removed |Added

 CC||agner at agner dot org

--- Comment #1 from Agner Fog  ---
I can confirm this. 

When compiling for a Win64 target, gcc version 9.2.0 (and earlier) returns
128-bit intrinsic vectors in XMM0, while 256-bit and 512-bit intrinsic vectors
are returned through a pointer. Clang, MS and Intel compilers return all these
vectors in registers.

The Microsoft Windows documentation for x64 calling convention says:

"Non-scalar types including floats, doubles, and vector types such as __m128,
__m128i, __m128d are returned in XMM0."
(https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#return-values)

Obviously, this document needs to be updated, but the only logical
interpretation is that the wording "vector types such as __m128" includes
larger intrinsic vectors, which must necessarily be returned in YMM0 or ZMM0.

Test case:
```
__m128 square_x (__m128 x) {
return _mm_mul_ps( x , x);
}

__m256 square_y (__m256 y) {
return _mm256_mul_ps( y , y);
}

__m512 square_z (__m512 z) {
return _mm512_mul_ps( z , z);
}
```

Disassembly (Intel syntax):
```
_Z8square_xDv4_f:; Function begin
vmovaps xmm0, oword [rcx]
vmulps  xmm0, xmm0, xmm0 
ret  
; _Z8square_xDv4_f End of function


_Z8square_yDv8_f:; Function begin
vmovaps ymm0, yword [rdx]
vmulps  ymm0, ymm0, ymm0 
mov rax, rcx 
vmovaps yword [rcx], ymm0
vzeroupper   
ret  
; _Z8square_yDv8_f End of function


_Z8square_zDv16_f:; Function begin
vmovaps zmm0, zword [rdx]
vmulps  zmm0, zmm0, zmm0 
mov rax, rcx 
vmovaps zword [rcx], zmm0
vzeroupper   
ret  
; _Z8square_zDv16_f End of function

```

Same, compiled with Clang, MS or Intel compilers:

```
_Z8square_yDv8_f:; Function begin
vmovaps ymm0, yword [rcx]
vmulps  ymm0, ymm0, ymm0 
ret  
; _Z8square_yDv8_f End of function


_Z8square_zDv16_f:; Function begin
vmovaps zmm0, zword [rcx]
vmulps  zmm0, zmm0, zmm0 
ret  
; _Z8square_zDv16_f End of function
```

... And while we are at it: It would be nice if you could support __vectorcall
for win64 targets (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89485)

[Bug middle-end/92945] -O2 -floop-nest-optimize crashes gcc in isl_basic_map_underlying_set ()

2019-12-15 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92945

--- Comment #2 from Sergei Trofimovich  ---
Rebuilt isl with debugging symbols. gdb says 'bmap' is NULL:

Thread 2.1 "f951" received signal SIGSEGV, Segmentation fault.
[Switching to process 1944963]
isl_basic_map_underlying_set (bmap=0x0) at ../isl-0.22/isl_map.c:5515
5515space = isl_space_underlying(space, bmap->n_div);
(gdb) bt
#0  isl_basic_map_underlying_set (bmap=0x0) at ../isl-0.22/isl_map.c:5515
#1  0x77e56431 in isl_basic_map_is_empty (bmap=0x28185f0) at
../isl-0.22/isl_map.c:8988
#2  isl_basic_map_is_empty (bmap=0x28185f0) at ../isl-0.22/isl_map.c:8958
#3  0x77e578da in map_product (map1=0x27eb7b0, map2=0x27f4a90,
space_product=<optimized out>,
basic_map_product=0x77e4dfa0 ,
remove_duplicates=1) at ../isl-0.22/isl_map.c:10468
#4  0x77e2ede3 in coscheduled_source (acc=acc@entry=0x27f8c20,
old_map=0x27f4a90, pos=pos@entry=8, depth=<optimized out>) at
../isl-0.22/isl_flow.c:941
#5  0x77e3115c in handle_coscheduled (flow=0x28455d0,
may_rel=0x2845370, must_rel=0x2845330, acc=0x27f8c20) at
../isl-0.22/isl_flow.c:1034
#6  compute_val_based_dependences (acc=<optimized out>) at
../isl-0.22/isl_flow.c:1238
#7  access_info_compute_flow_core (acc=<optimized out>, acc@entry=0x27f8c20) at
../isl-0.22/isl_flow.c:1338
#8  0x77e326ac in compute_single_flow (data=0x7fffcfc0,
sink=<optimized out>, uf=0x27d6410) at ../isl-0.22/isl_flow.c:3082
#9  compute_flow_schedule (access=0x28ab6e0) at ../isl-0.22/isl_flow.c:3166
#10 isl_union_access_info_compute_flow (access=0x28ab6e0) at
../isl-0.22/isl_flow.c:3217
#11 0x017b5b9e in scop_get_dependences(scop*) ()
#12 0x017b603b in apply_poly_transforms(scop*) ()
#13 0x017b02d2 in graphite_transform_loops() ()
#14 0x017b0809 in (anonymous
namespace)::pass_graphite_transforms::execute(function*) ()
#15 0x00c0caed in execute_one_pass(opt_pass*) ()
#16 0x00c0d50f in execute_pass_list_1(opt_pass*) ()
#17 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#18 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#19 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#20 0x00c0d53a in execute_pass_list(function*, opt_pass*) ()
#21 0x0080d779 in cgraph_node::expand() ()
#22 0x0080f2c2 in symbol_table::compile() ()
#23 0x008117ba in symbol_table::finalize_compilation_unit() ()
#24 0x00cfab26 in compile_file() ()
#25 0x00cfd784 in toplev::main(int, char**) ()
#26 0x018f083c in main ()
(gdb) list
5510!isl_space_is_named_or_nested(bmap->dim, isl_dim_in) &&
5511!isl_space_is_named_or_nested(bmap->dim, isl_dim_out))
5512return bset_from_bmap(bmap);
5513bmap = isl_basic_map_cow(bmap);
5514space = isl_basic_map_take_space(bmap);
5515space = isl_space_underlying(space, bmap->n_div);
5516bmap = isl_basic_map_restore_space(bmap, space);
5517if (!bmap)
5518return NULL;
5519bmap->extra -= bmap->n_div;
(gdb) print bmap
$1 = (isl_basic_map *) 0x0
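
In the listing above, bmap is reassigned at line 5513 and dereferenced at line 5515, while the NULL check only comes at line 5517. The general pattern of checking a possibly failing call before touching the result can be sketched as follows. The types and function names here are toy stand-ins invented for this example; this is not the isl API and not a proposed isl patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a map object; not the real isl type. */
struct toy_map {
    int n_div;
    int extra;
};

/* Hypothetical copy-on-write step that may fail and return NULL. */
static struct toy_map *toy_cow(struct toy_map *m, int fail)
{
    if (m == NULL || fail) {
        return NULL;
    }
    return m;
}

/* Check the result before reading any field; -1 signals failure. */
static int toy_underlying_extra(struct toy_map *m, int fail)
{
    m = toy_cow(m, fail);
    if (m == NULL) {
        return -1;
    }
    return m->extra - m->n_div;
}

/* Usage: the failing paths are reported instead of crashing. */
static void demo(void)
{
    struct toy_map m = { 2, 5 };

    assert(toy_underlying_extra(&m, 0) == 3);
    assert(toy_underlying_extra(&m, 1) == -1);
    assert(toy_underlying_extra(NULL, 0) == -1);
}
```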

[Bug middle-end/92945] -O2 -floop-nest-optimize crashes gcc in isl_basic_map_underlying_set ()

2019-12-15 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92945

--- Comment #1 from Sergei Trofimovich  ---
The crash happens somewhere in internals of isl-0.22:

Thread 2.1 "f951" received signal SIGSEGV, Segmentation fault.
[Switching to process 1919154]
0x77e50a74 in isl_basic_map_underlying_set () from
/usr/lib64/libisl.so.22
(gdb) bt
#0  0x77e50a74 in isl_basic_map_underlying_set () from
/usr/lib64/libisl.so.22
#1  0x77e56431 in isl_basic_map_is_empty () from
/usr/lib64/libisl.so.22
#2  0x77e578da in ?? () from /usr/lib64/libisl.so.22
#3  0x77e2ede3 in ?? () from /usr/lib64/libisl.so.22
#4  0x77e3115c in ?? () from /usr/lib64/libisl.so.22
#5  0x77e326ac in isl_union_access_info_compute_flow () from
/usr/lib64/libisl.so.22
#6  0x017b5b9e in scop_get_dependences(scop*) ()
#7  0x017b603b in apply_poly_transforms(scop*) ()
#8  0x017b02d2 in graphite_transform_loops() ()
#9  0x017b0809 in (anonymous
namespace)::pass_graphite_transforms::execute(function*) ()
#10 0x00c0caed in execute_one_pass(opt_pass*) ()
#11 0x00c0d50f in execute_pass_list_1(opt_pass*) ()
#12 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#13 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#14 0x00c0d521 in execute_pass_list_1(opt_pass*) ()
#15 0x00c0d53a in execute_pass_list(function*, opt_pass*) ()
#16 0x0080d779 in cgraph_node::expand() ()
#17 0x0080f2c2 in symbol_table::compile() ()
#18 0x008117ba in symbol_table::finalize_compilation_unit() ()
#19 0x00cfab26 in compile_file() ()
#20 0x00cfd784 in toplev::main(int, char**) ()
#21 0x018f083c in main ()

[Bug middle-end/92945] New: -O2 -floop-nest-optimize crashes gcc in isl_basic_map_underlying_set ()

2019-12-15 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92945

Bug ID: 92945
   Summary: -O2 -floop-nest-optimize crashes gcc in
isl_basic_map_underlying_set ()
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at inbox dot ru
  Target Milestone: ---

Created attachment 47498
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47498&action=edit
scipy-graphite-ice.f

Initially reported as https://bugs.gentoo.org/702968 by wolfwood.
There, Fortran code (from scipy-1.1.0) and C++ code (from dav1d-0.5.2) crash
the compiler with -floop-nest-optimize.

The crash happens with both gcc-9.2.0 and current gcc trunk.

Here is the reduced fortran example:

$ cat scipy-graphite-ice.f

      SUBROUTINE CERZO(NT,ZO)
      IMPLICIT DOUBLE PRECISION (E,P,W)
      IMPLICIT COMPLEX *16 (C,Z)
      DIMENSION ZO(NT)
      DO 35 NR=1,NT
         PX=0.5*PU-0.5*DLOG(PV)/PU
         PY=0.5*PU+0.5*DLOG(PV)/PU
15       IT=IT+1
         CALL CERF(Z,ZF,ZD)
         DO 30 I=1,NR-1
            ZW=(1.0D0,0.0D0)
            DO 25 J=1,NR-1
               IF (J.EQ.I) GO TO 25
               ZW=ZW*(Z-ZO(J))
25          CONTINUE
30          ZQ=ZQ+ZW
         ZGD=(ZD-ZQ*ZFD)/ZP
         Z=Z-ZFD/ZGD
         IF (IT.LE.50.AND.DABS((W-W0)/W).GT.1.0D-11) GO TO 15
35    ZO(NR)=Z
      IF (B.NE.INT(B)) THEN
      ENDIF
      END

$ LANG=C ./gfortran -B . -O2 -floop-nest-optimize -c scipy-graphite-ice.f
scipy-graphite-ice.f:16:72:

   16 | 30ZQ=ZQ+ZW
  |   
1
Warning: Fortran 2018 deleted feature: DO termination statement which is not
END DO or CONTINUE with label 30 at (1)
scipy-graphite-ice.f:20:72:

   20 | 35 ZO(NR)=Z
  |   
1
Warning: Fortran 2018 deleted feature: DO termination statement which is not
END DO or CONTINUE with label 35 at (1)
during GIMPLE pass: graphite
scipy-graphite-ice.f:1:0:

1 | SUBROUTINE CERZO(NT,ZO)
  |
internal compiler error: Segmentation fault
0x7fda773f51cf ???
   
/usr/src/debug/sys-libs/glibc-2.30-r3/glibc-2.30/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x7fda773def1a __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ LANG=C ./gfortran -B . -v
Reading specs from ./specs
COLLECT_GCC=./gfortran
COLLECT_LTO_WRAPPER=./lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c,c++ --disable-bootstrap
--with-multilib-list=m64
--prefix=/home/slyfox/dev/git/gcc-fortran-and-isl/../gcc-native-quick-installed
--disable-nls --without-isl --disable-libsanitizer --disable-libvtv
--disable-libgomp --disable-libstdcxx-pch --disable-libunwind-exceptions
CFLAGS='-O1 ' CXXFLAGS='-O1 ' --with-sysroot=/usr/x86_64-HEAD-linux-gnu
--enable-languages=c,c++,fortran --with-isl
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20191215 (experimental) (GCC)

[Bug c++/92944] New: [concepts] redefinition error when using constrained structure template inside namespace

2019-12-15 Thread akaraevz at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92944

Bug ID: 92944
   Summary: [concepts] redefinition error when using constrained
structure template inside namespace
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: akaraevz at mail dot ru
  Target Milestone: ---

The following piece of code compiles fine:

namespace N { template  struct Q; }
namespace N { template  requires false struct Q {}; }
namespace N { template  requires true struct Q {}; }


But this one does not:

namespace N { template  struct Q; }
template  requires false struct N::Q {};
template  requires true struct N::Q {}; // error: redefinition of
'struct N::Q'

Tested on https://godbolt.org/z/4kQMn9 (8.1, 8.2, 8.3, 9.1, 9.2, trunk)

[Bug target/83464] [SH] ICE: in final_scan_insn, at final.c:3025 with -mlra

2019-12-15 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83464

--- Comment #6 from John Paul Adrian Glaubitz  ---
(In reply to Oleg Endo from comment #5)
> I'm not sure this is an acceptable solution.  It disables various other
> optimizations and results in worse code than normally should be.  When you
> rebuild all of debian with LRA enabled, please make sure to take out any
> such hacks.

It's not a solution, it was a work-around.

FWIW, I have tried to build the package with the gcc-10 package from Debian
(which is a recent snapshot) and -mlra, but so far cmake doesn't want to accept
gcc-10 as the build compiler when set with export CC/CXX, so I have to poke
around to find out what the problem is.

I will report back once I have a result.

[Bug target/83464] [SH] ICE: in final_scan_insn, at final.c:3025 with -mlra

2019-12-15 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83464

--- Comment #5 from Oleg Endo  ---
(In reply to John Paul Adrian Glaubitz from comment #4)
> 
> I have to try. I'll run a testbuild. Currently the package has the following
> workaround for PR/81426:
> 
> # See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81426
> ifneq (,$(filter $(DEB_HOST_ARCH_CPU),sh3 sh4))
>   export DEB_CXXFLAGS_MAINT_STRIP += -O2
>   export DEB_CXXFLAGS_MAINT_APPEND += -O1
> endif
> 

I'm not sure this is an acceptable solution.  It disables various other
optimizations and results in worse code than normally should be.  When you
rebuild all of debian with LRA enabled, please make sure to take out any such
hacks.