[Bug tree-optimization/102494] Failure to optimize vector reduction properly especially when using OpenMP

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #5 from Hongtao.liu  ---
(In reply to Hongtao.liu from comment #4)
> > 
> > But for the case in PR, it's v8qi -> 2 v4hi, and no vector reduction for
> > v4hi.
> 
> We need to add (define_expand "reduc_plus_scal_v4hi" just like (define_expand
> "reduc_plus_scal_v8qi" in mmx.md.

Also for reduc_{umax,umin,smax,smin}_scal_v4hi

[Bug other/102495] New: optimize some consecutive byte load pattern to word load

2021-09-26 Thread mytbk920423 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102495

Bug ID: 102495
   Summary: optimize some consecutive byte load pattern to word load
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mytbk920423 at gmail dot com
  Target Milestone: ---

I use the following code to get a 32-bit word from a byte array by loading each
byte and shifting it into place, but GCC doesn't optimize the code to a single
word load when I put the byte loads in a loop.

Clang trunk can optimize all of the following:
https://gcc.godbolt.org/z/KfWE67K5c


```
#include <stddef.h>
#include <stdint.h>

#define SHL(a,b) ((uint32_t)(a) << (b))

// both GCC and Clang optimize to *(uint32_t*)(vv)
uint32_t getword_b(const uint8_t *vv)
{
    return SHL(vv[3], 24) | SHL(vv[2], 16) | SHL(vv[1], 8) | SHL(vv[0], 0);
}

// GCC cannot optimize this, Clang can
uint32_t getword_forloop(const uint8_t *vv)
{
    uint32_t res = 0;
    for (size_t i = 0; i < 4; i++) {
        res |= SHL(vv[i], (i * 8));
    }
    return res;
}

// both GCC and Clang optimize to ((uint32_t*)(vec))[word_idx]
uint32_t getword_from_vec(const uint8_t *vec, size_t word_idx)
{
    size_t byte_idx = word_idx * 4;
    const uint8_t *vv = vec + byte_idx;
    return SHL(vv[3], 24) | SHL(vv[2], 16) | SHL(vv[1], 8) | SHL(vv[0], 0);
}

// neither GCC nor Clang 12.0.1 can optimize this, Clang trunk can
uint32_t getword_from_vec_forloop(const uint8_t *vec, size_t word_idx)
{
    size_t byte_idx = word_idx * 4;
    uint32_t res = 0;
    for (size_t i = 0; i < 4; i++) {
        res |= SHL(vec[byte_idx + i], (i * 8));
    }
    return res;
}
```
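
For reference, the loop and the unrolled expression compute the same little-endian word, which is why folding both into a single load is valid. A minimal sketch follows; `load_le32_unrolled` and `load_le32_loop` are illustrative names, not part of the report:

```c
#include <stddef.h>
#include <stdint.h>

/* Unrolled form: the shape the report says GCC already turns into one load. */
uint32_t load_le32_unrolled(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Loop form: same value, but built up one byte per iteration. */
uint32_t load_le32_loop(const uint8_t *p)
{
    uint32_t res = 0;
    for (size_t i = 0; i < 4; i++)
        res |= (uint32_t)p[i] << (i * 8);
    return res;
}
```

Both functions assemble the bytes explicitly, so the result does not depend on host endianness; only the single-load replacement is specific to little-endian targets.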

[Bug tree-optimization/102494] Failure to optimize vector reduction properly especially when using OpenMP

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #4 from Hongtao.liu  ---

> 
> But for the case in PR, it's v8qi -> 2 v4hi, and no vector reduction for
> v4hi.

We need to add (define_expand "reduc_plus_scal_v4hi" just like (define_expand
"reduc_plus_scal_v8qi" in mmx.md.

[Bug tree-optimization/102494] Failure to optimize vector reduction properly especially when using OpenMP

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #3 from Hongtao.liu  ---
(In reply to Hongtao.liu from comment #2)
> It seems x86 doesn't support optab reduc_plus_scal_v8hi yet.
The vectorizer does the work for the backend.

typedef short v8hi __attribute__((vector_size(16)));
short
foo1 (v8hi p, int n)
{
  short sum = 0;
  for (int i = 0; i != 8; i++)
    sum += p[i];
  return sum;
}

  # sum_21 = PHI 
  # vect_sum_9.26_5 = PHI 
  _22 = (vector(8) unsigned short) vect_sum_9.26_5;
  _23 = VEC_PERM_EXPR <_22, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 4, 5, 6, 7, 8, 9, 10, 11 }>;
  _24 = _23 + _22;
  _25 = VEC_PERM_EXPR <_24, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 2, 3, 4, 5, 6, 7, 8, 9 }>;
  _26 = _25 + _24;
  _27 = VEC_PERM_EXPR <_26, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 2, 3, 4, 5, 6, 7, 8 }>;
  _28 = _27 + _26;
  stmp_sum_9.27_29 = BIT_FIELD_REF <_28, 16, 0>;


But for the case in PR, it's v8qi -> 2 v4hi, and no vector reduction for v4hi.
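
The GIMPLE above is a log-step reduction: shift the vector down by 4, 2 and 1 lanes (filling with zeros), add after each shift, then read lane 0. A scalar sketch of the same steps, using plain arrays instead of vector types; `shift_lanes_down` and `reduce_plus_v8hi` are hypothetical helper names:

```c
#include <stdint.h>

#define LANES 8

/* Mirrors VEC_PERM_EXPR <v, {0,...}, {k, k+1, ...}>: lane i takes v[i + k],
   or 0 once the index runs past the end of v. */
static void shift_lanes_down(const uint16_t *v, int k, uint16_t *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (i + k < LANES) ? v[i + k] : 0;
}

/* Three shift-and-add steps (by 4, 2, 1 lanes); lane 0 ends up with the sum. */
uint16_t reduce_plus_v8hi(const uint16_t v[LANES])
{
    uint16_t acc[LANES], shifted[LANES];

    for (int i = 0; i < LANES; i++)
        acc[i] = v[i];

    for (int k = LANES / 2; k >= 1; k /= 2) {
        shift_lanes_down(acc, k, shifted);
        for (int i = 0; i < LANES; i++)
            acc[i] = (uint16_t)(acc[i] + shifted[i]);
    }
    return acc[0];  /* corresponds to BIT_FIELD_REF <_28, 16, 0> */
}
```

The arithmetic is done on unsigned 16-bit lanes, as in the dump, so wraparound is well defined.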

[Bug tree-optimization/80570] auto-vectorizing int->double conversion should use half-width memory operands to avoid shuffles, instead of load+extract

2021-09-26 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80570

--- Comment #3 from Peter Cordes  ---
(In reply to Andrew Pinski from comment #2)
> Even on aarch64:
> 
> .L2:
> ldr q0, [x1], 16
> sxtl    v1.2d, v0.2s
> sxtl2   v0.2d, v0.4s
> scvtf   v1.2d, v1.2d
> scvtf   v0.2d, v0.2d
> stp q1, q0, [x0]
>
> But the above is decent really.

More than decent; that's what we *should* be doing, I think.

AArch64 has versions of most instructions that read the top of a vector, unlike
x86-64 where VPMOVZX / SX can only read from the bottom half.  That's the key
difference, and what makes this strategy good on ARM, bad on x86-64.

(On 32-bit ARM, you load a q register, then read the two halves separately as
64-bit d<0..31> registers.  AArch64 changed that so there are 32x 128-bit
vector regs, and no partial regs aliasing the high half.  But they provide OP,
OP2 versions of some instructions that widen or things like that, with the "2"
version accessing a high half.  Presumably part of the motivation is to make it
easier to port ARM NEON code that depended on accessing halves of a 128-bit q
vector using its d regs.  But it's a generally reasonable design and could also
be motivated by seeing how inconvenient things get in SSE and AVX for
pmovsx/zx.) 

Anyway, AArch64 SIMD is specifically designed to make it fully efficient to do
wide loads and then unpack both halves, as is possible on ARM but not on
x86-64.

It's also using a store (of a pair of regs) that's twice the width of the load.
 But even if it was using a max-width load of a pair of 128-bit vectors (and
having to store two pairs) that would be good, just effectively unrolling.  But
GCC sees it as one load and two separate stores, that it just happens to be
able to combine as a pair.
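
To make the "wide load, then unpack both halves" shape concrete, here is a scalar sketch. The function names are hypothetical and the chunking only imitates what the AArch64 loop does with `sxtl` (low pair) and `sxtl2` (high pair); it is not the bug's actual test case:

```c
#include <stddef.h>

/* Plain int -> double conversion loop of the kind being vectorized. */
void int_to_double(double *dp, const int *ip, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dp[i] = (double)ip[i];
}

/* Same work arranged as one 4-int chunk per iteration, with the low pair
   and the high pair converted separately, followed by a scalar tail. */
void int_to_double_by_halves(double *dp, const int *ip, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const int *chunk = ip + i;      /* the single wide load */
        dp[i + 0] = (double)chunk[0];   /* low half  */
        dp[i + 1] = (double)chunk[1];
        dp[i + 2] = (double)chunk[2];   /* high half */
        dp[i + 3] = (double)chunk[3];
    }
    for (; i < n; i++)
        dp[i] = (double)ip[i];
}
```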

[Bug tree-optimization/102494] Failure to optimize out vector reduction properly especially when using OpenMP

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #2 from Hongtao.liu  ---
It seems x86 doesn't support optab reduc_plus_scal_v8hi yet.

[Bug tree-optimization/80570] auto-vectorizing int->double conversion should use half-width memory operands to avoid shuffles, instead of load+extract

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80570

Andrew Pinski  changed:

   What|Removed |Added

  Component|target  |tree-optimization

--- Comment #2 from Andrew Pinski  ---
  vect__4.5_24 = MEM  [(int *)ip_12 + ivtmp.15_28 * 1];
  vect_tmp_14.6_23 = [vec_unpack_float_lo_expr] vect__4.5_24;
  vect_tmp_14.6_22 = [vec_unpack_float_hi_expr] vect__4.5_24;
  MEM  [(double *)dp_10 + ivtmp.15_28 * 2] =
vect_tmp_14.6_23;
  MEM  [(double *)dp_10 + 32B + ivtmp.15_28 * 2] =
vect_tmp_14.6_22;

Even on aarch64:

.L2:
ldr q0, [x1], 16
sxtl    v1.2d, v0.2s
sxtl2   v0.2d, v0.4s
scvtf   v1.2d, v1.2d
scvtf   v0.2d, v0.2d
stp q1, q0, [x0]
add x0, x0, 32
cmp x2, x1
bne .L2

But the above is decent really.

[Bug c++/100583] [modules] ICE when importing

2021-09-26 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100583

--- Comment #1 from Johel Ernesto Guerrero Peña  ---
Please, add Bug 99227 to **Blocks:** for visibility.

[Bug target/102473] [12 Regression] 521.wrf_r 5% slower at -Ofast and generic x86_64 tuning after r12-3426-g8f323c712ea76c

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102473

--- Comment #7 from Hongtao.liu  ---
> retired and clockticks after my commit. And the regression comes from
> libc-2.31.so, which should be the same.

The difference in libc-2.31.so comes from front-end bandwidth MITE, with very
low DSB coverage.

[Bug target/102473] [12 Regression] 521.wrf_r 5% slower at -Ofast and generic x86_64 tuning after r12-3426-g8f323c712ea76c

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102473

--- Comment #6 from Hongtao.liu  ---

> 
> | Symbol                                  | sys lib | Before | After |  diff |     % |
> |-----------------------------------------+---------+--------+-------+-------+-------|
> | __logf_fma                              | yes     |  68882 | 68940 |   +58 | +0.08 |
> | __atanf                                 | yes     |      4 | 66196 |  -468 | -0.70 |
> | __module_advect_em_MOD_advect_scalar_pd | no      |  62286 | 62348 |   +62 | +0.10 |
> | __powf_fma                              | yes     |  56213 | 56127 |   -86 | -0.15 |
> | __module_mp_wsm5_MOD_nislfv_rain_plm    | no      |  46990 | 48340 | +1350 | +2.87 |

Does it mean cycles?
VTune data show that __module_mp_wsm5_MOD_nislfv_rain_plm has fewer instructions
retired and clockticks after my commit. And the regression comes from
libc-2.31.so, which should be the same.

[Bug target/64960] [5 Regression] Inefficient address pre-computation in PIC mode

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64960

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
Summary|Inefficient address |[5 Regression] Inefficient
   |pre-computation in PIC mode |address pre-computation in
   ||PIC mode
 Resolution|--- |FIXED
   Target Milestone|--- |5.0

--- Comment #1 from Andrew Pinski  ---
This was fixed during the development of GCC 5; that is, the GCC 5.1.0 release
did not have this issue.

[Bug target/60996] Bad code (I.e. needless insns) with option momit-leaf-frame-pointer; side-effect on non-leaf functions

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60996

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-09-27
 Status|UNCONFIRMED |WAITING

--- Comment #1 from Andrew Pinski  ---
Can you attach the testcase?

[Bug tree-optimization/102494] Failure to optimize out vector reduction properly especially when using OpenMP

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64-*-*
   Keywords||missed-optimization

--- Comment #1 from Andrew Pinski  ---
Both with and without -fopenmp-simd, this works on aarch64-linux-gnu, which has
a reduction addition.

This just looks like a matter of how reduction addition is handled for x86_64.

Also we have:
  MEM  [(short int *)] = vect__21.35_111;
  MEM  [(short int *) + 8B] = vect__21.35_112;
  vect__24.24_88 = MEM  [(short int *)];

[Bug tree-optimization/102494] New: Failure to optimize out vector reduction properly especially when using OpenMP

2021-09-26 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

Bug ID: 102494
   Summary: Failure to optimize out vector reduction properly especially when using OpenMP
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stddef.h>
#include <stdint.h>

typedef int8_t simde_int8x8_t __attribute__((__vector_size__(8)));

int16_t
simde_vaddlv_s8(simde_int8x8_t a) {
    int16_t r = 0;

    #pragma omp simd reduction(+:r)
    for (size_t i = 0 ; i < (sizeof(a) / sizeof(a[0])) ; i++) {
        r += a[i];
    }

    return r;
}

Compiled with -O3 -fopenmp-simd, this is the output on AMD64:

simde_vaddlv_s8(signed char __vector(8)):
pxor    xmm1, xmm1
movdqa  xmm2, xmm0
pcmpgtb xmm1, xmm0
punpcklbw   xmm0, xmm1
punpcklbw   xmm2, xmm1
pshufd  xmm0, xmm0, 78
movq    QWORD PTR [rsp-24], xmm2
movq    QWORD PTR [rsp-16], xmm0
movdqa  xmm0, XMMWORD PTR [rsp-24]
psrldq  xmm0, 8
paddw   xmm0, XMMWORD PTR [rsp-24]
movdqa  xmm1, xmm0
psrldq  xmm1, 4
paddw   xmm0, xmm1
movdqa  xmm1, xmm0
psrldq  xmm1, 2
paddw   xmm0, xmm1
pextrw  eax, xmm0, 0
ret
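
The `pxor` / `pcmpgtb` / `punpcklbw` prologue in the GCC output above is a sign extension from 8 to 16 bits: it builds a per-byte mask that is all ones for negative inputs and interleaves it as the high byte. A scalar sketch of that trick for one lane, with `widen_s8_via_mask` as a hypothetical name:

```c
#include <stdint.h>

/* One lane of the pcmpgtb + punpcklbw sign extension. The result is returned
   as the 16-bit pattern, i.e. the value of (uint16_t)(int16_t)x. */
uint16_t widen_s8_via_mask(int8_t x)
{
    uint8_t mask = (0 > x) ? 0xFF : 0x00;       /* pcmpgtb against zero */
    return (uint16_t)((uint16_t)(uint8_t)x |    /* low byte: the data   */
                      ((uint16_t)mask << 8));   /* high byte: the mask  */
}
```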

This is what Clang manages:

simde_vaddlv_s8(signed char __vector(8)):
punpcklbw   xmm0, xmm0  # xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
psraw   xmm0, 8
pshufd  xmm1, xmm0, 238 # xmm1 = xmm0[2,3,2,3]
paddw   xmm1, xmm0
pshufd  xmm0, xmm1, 85  # xmm0 = xmm1[1,1,1,1]
paddw   xmm0, xmm1
movdqa  xmm1, xmm0
psrld   xmm1, 16
paddw   xmm1, xmm0
movd    eax, xmm1
ret

Weirdly enough, removing the `#pragma omp simd reduction(+:r)` slightly improves
GCC's output to this:

simde_vaddlv_s8(signed char __vector(8)):
pxor    xmm1, xmm1
movdqa  xmm2, xmm0
pcmpgtb xmm1, xmm0
punpcklbw   xmm0, xmm1
punpcklbw   xmm2, xmm1
pshufd  xmm0, xmm0, 78
paddw   xmm0, xmm2
pextrw  edx, xmm0, 1
pextrw  eax, xmm0, 0
add eax, edx
pextrw  edx, xmm0, 2
add eax, edx
pextrw  edx, xmm0, 3
add eax, edx
ret

[Bug target/60889] -Os generate much bigger code

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60889

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Known to fail||6.1.0
  Known to work||7.1.0
   Target Milestone|--- |7.0
 Resolution|--- |FIXED

--- Comment #2 from Andrew Pinski  ---
In GCC 7+, we use TI to do the struct move.

So all fixed in GCC 7+.

[Bug target/40680] extra register move

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40680

--- Comment #3 from Andrew Pinski  ---
Looks like this was fixed in GCC 5+.

[Bug target/44883] Combine separate shift and add instructions into a single one

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44883

--- Comment #5 from Andrew Pinski  ---
In GCC 9+ (due to 2->2 combine) we get:
.L2:
cmp r4, r5
blt .L3
pop {r4, r5, r6, r7, r8, pc}
.L3:
ldr r3, [r6]
lsls    r2, r4, #5
add r8, r3, r4, lsl #5
adds    r4, r4, #1
ldr r0, [r3, r2]
bl  foo
str r7, [r8, #4]
b   .L2

[Bug target/62166] Poor code generation (x86-64)

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62166

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-26

--- Comment #1 from Andrew Pinski  ---
With -O2  -fschedule-insns in GCC 9+, I get decent code:

movq    %rdx, %r8
movzbl  %ch, %eax
movsbq  %cl, %rdx
shrq    $16, %rcx
addq    %r8, %rdx
jmp *dispatch(,%rax,8)

Confirmed.

[Bug target/60778] shift not folded into shift on x86-64

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60778

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2014-04-08 00:00:00 |2021-9-26

--- Comment #2 from Andrew Pinski  ---
Trying 7 -> 8:
7: {r87:DI=r89:DI>>0x3;clobber flags:CC;}
  REG_DEAD r89:DI
  REG_UNUSED flags:CC
8: r88:DF=[r87:DI*0x8+`mem']
  REG_DEAD r87:DI
Failed to match this instruction:
(set (reg:DF 88 [ mem[_1] ])
(mem:DF (plus:DI (and:DI (reg:DI 89)
(const_int -8 [0xfff8]))
(symbol_ref:DI ("mem") [flags 0x2]  ))
[1 mem[_1]+0 S8 A64]))

We have 2->2 combine now but it looks like we don't try to split inside a mem
...
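
The failed pattern shows what combine computed for the address: shifting right by 3 and then scaling by 8 is the same as clearing the low three bits, hence the `(and:DI (reg:DI 89) (const_int -8))`. A small sketch of that identity on unsigned 64-bit values; the function names are illustrative only:

```c
#include <stdint.h>

/* Address offset as written: index = x >> 3, scaled by the element size 8. */
uint64_t scaled_index(uint64_t x)
{
    return (x >> 3) * 8;
}

/* Equivalent form produced by combine: x with its low three bits cleared. */
uint64_t masked_offset(uint64_t x)
{
    return x & ~(uint64_t)7;
}
```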

[Bug fortran/101334] gfortran fails to enforce C838 on disallowed uses of assumed-rank variable names + bogus errors

2021-09-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101334

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:fe2771b291c2c7c0ac37b75ec5b160937524b60c

commit r12-3890-gfe2771b291c2c7c0ac37b75ec5b160937524b60c
Author: Tobias Burnus 
Date:   Sun Sep 26 19:26:01 2021 +0200

Fortran: Fix associated intrinsic with assumed rank [PR101334]

ASSOCIATE (ptr, tgt) takes as first argument also an assumed-rank array;
however, using it together with a tgt (required to be non assumed rank)
had issues for both scalar and nonscalar tgt.

PR fortran/101334
gcc/fortran/ChangeLog:

* trans-intrinsic.c (gfc_conv_associated): Support assumed-rank
'pointer' with scalar/array 'target' argument.

libgfortran/ChangeLog:

* intrinsics/associated.c (associated): Also check for same rank.

gcc/testsuite/ChangeLog:

* gfortran.dg/associated_assumed_rank.f90: New test.

[Bug c++/102493] New: non-type template specialization for member pointer to field and function reports leads to unexpected conflict

2021-09-26 Thread marekr22 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102493

Bug ID: 102493
   Summary: non-type template specialization for member pointer to field and function reports leads to unexpected conflict
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marekr22 at wp dot pl
  Target Milestone: ---

Here is a minimal, complete, verifiable example (C++17):
```cpp
#include <iostream>
#include <string>
#include <type_traits>

template <auto FP>
struct field_type;

template<typename R, typename T, R T::*FP>
struct field_type<FP>
{
    using type = R;
};

template<typename R, typename T, typename... Args, R (T::*FP)(Args...)>
struct field_type<FP>
{
    using type = R;
};

template<auto FP>
using field_type_t = typename field_type<FP>::type;


class Foo {
public:
    int x = 0;
    double y = 0;
    std::string s;
    const int cx = 0;

    Foo() = default;

    void bar() {
        std::cout << "bar\n";
    }

    int par(int z) {
        std::cout << "bar\n";
        return z;
    }
};

template<auto FP, typename T>
constexpr bool test = std::is_same_v<field_type_t<FP>, T>;

static_assert(test<&Foo::x,    int>,         "");
static_assert(test<&Foo::cx,   const int>,   "");
static_assert(test<&Foo::s,    std::string>, "");
static_assert(test<&Foo::y,    double>,      "");
#ifndef HIDE_PROBLEM_ON_GCC_11
static_assert(test<&Foo::bar,  void>,        "");
static_assert(test<&Foo::par,  int>,         "");
#endif
```

This compiles on all compilers (https://godbolt.org/z/31svobz3z)
except GCC 11.1 and 11.2 (GCC 10.3 works).

Reported error is:
```
: In substitution of 'template using field_type_t = typename
field_type::type [with auto FP = ::bar]':
:44:28:   required from 'constexpr const bool test<::bar, void>'
:51:15:   required from here
:21:7: error: ambiguous template instantiation for 'struct
field_type<::bar>'
   21 | using field_type_t = typename field_type::type;
  |   ^~~~
:9:8: note: candidates are: 'template
struct field_type [with R = void(); T = Foo; R T::* FP = ::bar]'
9 | struct field_type
  |^~
:15:8: note: 'template struct field_type [with R = void; T = Foo;
Args = {}; R (T::* FP)(Args ...) = ::bar]'
   15 | struct field_type
  |^~
:21:7: error: invalid use of incomplete type 'struct
field_type<::bar>'
   21 | using field_type_t = typename field_type::type;
  |   ^~~~
:6:8: note: declaration of 'struct field_type<::bar>'
6 | struct field_type;
  |^~
```

[Bug fortran/102079] Misleading -Wlto-type-mismatch warning on wrong float type to C function

2021-09-26 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102079

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #5 from kargl at gcc dot gnu.org ---
(In reply to Thomas Koenig from comment #4)
> (In reply to Jan Hubicka from comment #3)
> > I think the problem here is that Fortran uses "long int" while size_t
> > interoperates with "unsigned long int".
> > Does the Fortran standard promise that the value should interoperate with
> > size_t as well?
> 
> This is a matter for the compiler.
> 
> In this case, the documentation states
> 
> For arguments of @code{CHARACTER} type, the character length is passed
> as a hidden argument at the end of the argument list.  For
> deferred-length strings, the value is passed by reference, otherwise
> by value.  The character length has the C type @code{size_t} (or
> @code{INTEGER(kind=C_SIZE_T)} in Fortran).
> 
> so the documentation is actually inconsistent (but then, interoperability
> between Fortran integers and C unsigned types is sort of assumed).
> The library uses gfc_charlen_type, which is a typedef to size_t.
> 
> So, we can fix this either way.  It would probably be best to have
> size_t in the character lengths in the front end as well.

The Fortran 2018 standard specifically discusses unsigned integers in
Section 18.  Table 18.2 shows that CHARACTER(KIND=C_SIGNED_CHAR)
interoperates with C's signed char or unsigned char.  It further
shows that C_SIZE_T interoperates with C's size_t.  The Fortran
standard does not have a C_SSIZE_T kind type parameter.  There is,
however, a non-normative note.  From 18-007r1.pdf (aka Fortran 2021),
page 475:

   NOTE 1

   ISO/IEC 9899:2011 specifies that the representations for nonnegative
   signed integers are the same as the corresponding values of unsigned
   integers.  Because Fortran does not provide direct support for unsigned
   kinds of integers, the ISO_C_BINDING module does not make accessible
   named constants for their kind type parameter values. A user can use
   the signed kinds of integers to interoperate with the unsigned types
   and all their qualified versions as well. This has the potentially
   surprising side effect that the C type unsigned char is interoperable
   with the type integer with a kind type parameter of C_SIGNED_CHAR.

Now, for the problem at hand, the hidden argument could be passed as
a c_ssize_t.  A Fortran string can never have a negative length.  The
only thing that might be of concern is breaking the libgfortran ABI
and causing some heartburn for mixed-language programmers who do not
use the features of Section 18.
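
The point of NOTE 1 can be checked from the C side: a nonnegative value stored in a signed type has the same object representation as the same value in the corresponding unsigned type. A minimal sketch, with `same_representation` as a hypothetical helper that assumes a nonnegative argument and a platform without padding bits in `long`:

```c
#include <string.h>

/* Returns 1 if a nonnegative long and the equal unsigned long value are
   byte-for-byte identical in memory. The caller must pass value >= 0. */
int same_representation(long value)
{
    unsigned long as_unsigned = (unsigned long)value;
    return memcmp(&value, &as_unsigned, sizeof value) == 0;
}
```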

[Bug target/102491] [12 Regression] Assembler messages: Error: no such instruction: `vmovw %xmm0,%eax'

2021-09-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102491

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from H.J. Lu  ---
Binutils 2.38 is needed for AVX512FP16.

[Bug target/102491] [12 Regression] Assembler messages: Error: no such instruction: `vmovw %xmm0,%eax'

2021-09-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102491

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
I don't see a bug here.  If you enable AVX512FP16 or -march which implies it,
you need binutils that support such ISA.  At least on x86 GCC has behaved that
way for years with new ISAs, there intentionally is not a strict detection on
which ISAs the assembler can handle, because the binutils used at compile time
could be newer than binutils used at compiler configure time.
I think AVX512FP16 support went into binutils in early August this year, while
2.37 is from mid July.

[Bug tree-optimization/102492] New: [12 Regression] ICE in scan_sharing_clauses, at omp-low.c:1205

2021-09-26 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102492

Bug ID: 102492
   Summary: [12 Regression] ICE in scan_sharing_clauses, at omp-low.c:1205
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openmp
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

Created attachment 51510
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51510&action=edit
Testcase

g++-12.0.0-alpha20210919 snapshot (g:32731fa5b0abf092029b8e2be64319b978bda514)
ICEs when compiling the attached testcase, partially reduced from
libstdc++-v3/testsuite/26_numerics/pstl/numeric_ops/reduce.cc, w/ -fopenmp:

% g++-12.0.0 -fopenmp -c vksbmhow.cc
vksbmhow.cc: In function 'void __simd_transform_reduce(_Size, _Tp,
_BinaryOperation) [with _Size = int; _Tp = Number; _BinaryOperation = int]':
vksbmhow.cc:25:9: error: no matching function for call to 'Number::Number()'
   25 | #pragma omp simd
  | ^~~
vksbmhow.cc:2:3: note: candidate: 'Number::Number(int)'
2 |   Number(int);
  |   ^~
vksbmhow.cc:2:3: note:   candidate expects 1 argument, 0 provided
vksbmhow.cc:1:8: note: candidate: 'constexpr Number::Number(const Number&)'
1 | struct Number {
  |^~
vksbmhow.cc:1:8: note:   candidate expects 1 argument, 0 provided
vksbmhow.cc:1:8: note: candidate: 'constexpr Number::Number(Number&&)'
vksbmhow.cc:1:8: note:   candidate expects 1 argument, 0 provided
during GIMPLE pass: omplower
vksbmhow.cc:25:9: internal compiler error: in scan_sharing_clauses, at
omp-low.c:1205
   25 | #pragma omp simd
  | ^~~
0x7c90d0 scan_sharing_clauses
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:1205
0x1014514 scan_omp_for
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:2820
0x1015c00 scan_omp_1_stmt
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:4137
0xe7deea walk_gimple_stmt(gimple_stmt_iterator*, tree_node*
(*)(gimple_stmt_iterator*, bool*, walk_stmt_info*), tree_node* (*)(tree_node**,
int*, void*), walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:602
0xe7e120 walk_gimple_seq_mod(gimple**, tree_node* (*)(gimple_stmt_iterator*,
bool*, walk_stmt_info*), tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:51
0xe7dfd5 walk_gimple_stmt(gimple_stmt_iterator*, tree_node*
(*)(gimple_stmt_iterator*, bool*, walk_stmt_info*), tree_node* (*)(tree_node**,
int*, void*), walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:711
0xe7e120 walk_gimple_seq_mod(gimple**, tree_node* (*)(gimple_stmt_iterator*,
bool*, walk_stmt_info*), tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:51
0xe7dfd5 walk_gimple_stmt(gimple_stmt_iterator*, tree_node*
(*)(gimple_stmt_iterator*, bool*, walk_stmt_info*), tree_node* (*)(tree_node**,
int*, void*), walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:711
0xe7e120 walk_gimple_seq_mod(gimple**, tree_node* (*)(gimple_stmt_iterator*,
bool*, walk_stmt_info*), tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/gimple-walk.c:51
0x10213e5 scan_omp
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:4241
0x10213e5 execute_lower_omp
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:14292
0x10213e5 execute
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/omp-low.c:14350

[Bug target/102491] New: [12 Regression] Assembler messages: Error: no such instruction: `vmovw %xmm0,%eax'

2021-09-26 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102491

Bug ID: 102491
   Summary: [12 Regression] Assembler messages: Error: no such instruction: `vmovw %xmm0,%eax'
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: assemble-failure, wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: x86_64-unknown-linux-gnu

GNU as 2.37 emits the following error:

/tmp/ccp9YvMM.s: Assembler messages:
/tmp/ccp9YvMM.s:279: Error: no such instruction: `vmovw %xmm0,%eax'

when compiling the following testcase w/ gcc-12.0.0-alpha20210919 snapshot
(g:32731fa5b0abf092029b8e2be64319b978bda514) w/ -march=sapphirerapids -O1
-funroll-loops:

int x;

__attribute__ ((simd))
short int
foo (void)
{
  x = 0;

  return 0;
}

[Bug fortran/102079] Misleading -Wlto-type-mismatch warning on wrong float type to C function

2021-09-26 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102079

Thomas Koenig  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-26
 Ever confirmed|0   |1

--- Comment #4 from Thomas Koenig  ---
(In reply to Jan Hubicka from comment #3)
> I think the problem here is that Fortran uses "long int" while size_t
> interoperates with "unsigned long int".
> Does the Fortran standard promise that the value should interoperate with
> size_t as well?

This is a matter for the compiler.

In this case, the documentation states

For arguments of @code{CHARACTER} type, the character length is passed
as a hidden argument at the end of the argument list.  For
deferred-length strings, the value is passed by reference, otherwise
by value.  The character length has the C type @code{size_t} (or
@code{INTEGER(kind=C_SIZE_T)} in Fortran).

so the documentation is actually inconsistent (but then, interoperability
between Fortran integers and C unsigned types is sort of assumed).
The library uses gfc_charlen_type, which is a typedef to size_t.

So, we can fix this either way.  It would probably be best to have
size_t in the character lengths in the front end as well.
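
For illustration, this is roughly what the C side of the documented convention looks like: the character data arrives as a pointer that is not NUL-terminated, and the length arrives as a trailing hidden `size_t` passed by value. The function `trimmed_length` is a hypothetical example, not a libgfortran routine:

```c
#include <stddef.h>

/* Hypothetical C callee for a Fortran CHARACTER argument: 's' points at the
   blank-padded character data and 'hidden_len' is the hidden length argument.
   Returns the length with trailing blanks removed. */
size_t trimmed_length(const char *s, size_t hidden_len)
{
    size_t n = hidden_len;
    while (n > 0 && s[n - 1] == ' ')
        n--;
    return n;
}
```

If the front end declares the hidden argument with a signed type while the callee uses `size_t`, the two declarations disagree, which is the kind of mismatch -Wlto-type-mismatch reports.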

[Bug c++/102490] New: Erroneous optimization of default constexpr operator== of struct with bitfields

2021-09-26 Thread luc.briand35 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102490

Bug ID: 102490
   Summary: Erroneous optimization of default constexpr operator== of struct with bitfields
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: luc.briand35 at gmail dot com
  Target Milestone: ---

Hello,

For the following code, gcc version 10 and up wrongly optimizes the default
operator==().
This occurs for O1 and up.
Removing the 'constexpr' qualifier fixes everything.
The size of the bitfields doesn't matter.
No warnings appear with "-Wall -Wextra".


Godbolt link: https://gcc.godbolt.org/z/j4fG3sKze


struct A
{
    unsigned char foo : 1;
    unsigned char bar : 1;

    constexpr bool operator==(const A&) const = default;
};


int main()
{
    A a{}, b{};

    a.bar = 0b1;

    return a == b;
}



With the options "-std=c++2a -O1", the assembly generated is simply:
main:
mov eax, 1
ret
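
For comparison, a defaulted `operator==` compares the members one by one, so the expected result here is 0 because `a.bar` differs from `b.bar`. A C analogue of that member-wise comparison follows; `a_equal` is a hypothetical name and the bit-fields use `unsigned int` so the sketch stays within standard C:

```c
/* Same layout idea as the report's struct A, written in plain C. */
struct A {
    unsigned int foo : 1;
    unsigned int bar : 1;
};

/* Member-wise equality: what a defaulted operator== is expected to compute. */
int a_equal(struct A lhs, struct A rhs)
{
    return lhs.foo == rhs.foo && lhs.bar == rhs.bar;
}
```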



In this similar example, we can see that the generated assembly for the
equality operator ignores the 'bar' bitfield (Godbolt link:
https://gcc.godbolt.org/z/3K75xx1on) :


struct A
{
    unsigned char foo : 3;
    unsigned char bar : 1;

    constexpr bool operator==(const A&) const = default;
};

void change(A& a);

int main()
{
    A a{}, b{};

    change(a);

    return a == b;
}


The assembly generated by GCC 10.x and 11.x differs slightly, but both have the
same problem.

[Bug tree-optimization/102486] __builtin_popcount(y&-y) is not optimized to y!=0

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102486

--- Comment #2 from Andrew Pinski  ---
(In reply to Luc Van Oostenryck from comment #1)
> when y != 0

Right. So it should be optimized to y!=0 then.
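
The identity is easy to check: `y & -y` isolates the lowest set bit, so its population count is 1 for any nonzero `y` and 0 for `y == 0`. A small sketch, assuming a compiler that provides `__builtin_popcount` (GCC or Clang); the wrapper names are illustrative and `unsigned` is used so that the negation is well defined:

```c
/* popcount of the isolated lowest set bit. */
int popcount_lowbit(unsigned y)
{
    return __builtin_popcount(y & -y);
}

/* The simplified form the bug asks for. */
int is_nonzero(unsigned y)
{
    return y != 0;
}
```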

[Bug fortran/102079] Misleading -Wlto-type-mismatch warning on wrong float type to C function

2021-09-26 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102079

--- Comment #3 from Jan Hubicka  ---
I think the problem here is that Fortran uses "long int" while size_t
interoperates with "unsigned long int".
Does the Fortran standard promise that the value should interoperate with size_t
as well?

[Bug tree-optimization/102486] __builtin_popcount(y&-y) is not optimized to 1

2021-09-26 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102486

Luc Van Oostenryck  changed:

   What|Removed |Added

 CC||luc.vanoostenryck at gmail dot com

--- Comment #1 from Luc Van Oostenryck  ---
when y != 0

[Bug c++/102482] Winit-list-lifetime false positive for temporaries with std::initializer_list

2021-09-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102482

--- Comment #3 from Jonathan Wakely  ---
And the fabulous manual:

> warn about uses of "std::initializer_list" that are likely to result in
> dangling pointers

This is behaving exactly as documented:

> *   When a list constructor stores the "begin" pointer from the
> "initializer_list" argument, this doesn't extend the lifetime of
> the array, so if a class variable is constructed from a temporary
> "initializer_list", the pointer is left dangling by the end of the
> variable declaration statement.

[Bug c++/102482] Winit-list-lifetime false positive for temporaries with std::initializer_list

2021-09-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102482

--- Comment #2 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #1)
> > so I assume the warning is std::initializer_list specific.
> 
> Yes. std::initializer_list is a magic language type that the compiler knows
> all about. std::vector is just opaque C++ code that the compiler knows
> nothing about.

The clue is in the name: -Winit-list-lifetime

[Bug c++/102482] Winit-list-lifetime false positive for temporaries with std::initializer_list

2021-09-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102482

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #1 from Jonathan Wakely  ---
(In reply to Federico Kircheis from comment #0)
> While the warning is true, the code is completely safe, but the warnings
> seems to imply that foo will access a dangling pointer.

In this case it's safe because the span object has the same lifetime as the
initializer_list, but in general that's not true. The warning comes from the
constructor, and is independent of the context in which it's being constructed.

> so I assume the warning is std::initializer_list specific.

Yes. std::initializer_list is a magic language type that the compiler knows all
about. std::vector is just opaque C++ code that the compiler knows nothing
about.
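
A minimal sketch of the situation described above, assuming a hypothetical
span-like type int_view (not the reporter's actual code). The constructor
stores the "begin" pointer, so GCC may emit -Winit-list-lifetime on it
regardless of how the object is later used. Passing a braced list directly as
a function argument is safe because the backing array lives until the end of
the full expression containing the call.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical non-owning view over a run of ints.
struct int_view {
    const int* data;
    std::size_t size;

    // List constructor that stores the begin pointer; this is the pattern
    // the warning is documented to flag.
    int_view(std::initializer_list<int> il)
        : data(il.begin()), size(il.size()) {}
};

// Reads the elements while the view (and the backing array) are still alive.
inline int sum(int_view v)
{
    int total = 0;
    for (std::size_t i = 0; i < v.size; ++i)
        total += v.data[i];
    return total;
}

// Safe: the temporary array outlives the call to sum().
inline int safe_use()
{
    return sum({1, 2, 3});
}

// Unsafe by contrast (left as a comment on purpose):
//   int_view dangling = {1, 2, 3};  // array is gone after this declaration
//   sum(dangling);                  // would read a dangling pointer
```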

[Bug c++/102489] New: [12 Regression] ICE in is_this_parameter, at cp/semantics.c:11273

2021-09-26 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102489

Bug ID: 102489
   Summary: [12 Regression] ICE in is_this_parameter, at
cp/semantics.c:11273
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-12.0.0-alpha20210919 snapshot (g:32731fa5b0abf092029b8e2be64319b978bda514)
ICEs when compiling the following testcase, reduced from
gcc/testsuite/g++.dg/coroutines/pr95736.C, w/ -O1 -fcoroutines:

#include <coroutine>

struct footask {
  struct promise_type {
    std::suspend_never initial_suspend();
    std::suspend_never final_suspend() noexcept;
    void unhandled_exception();
    void get_return_object();
  };

  std::suspend_always foo;

  footask taskfun() { co_await foo; }
};

% g++-12.0.0 -O1 -fcoroutines -c c3iikget.C
c3iikget.C: In function 'void
footask::taskfun(footask::taskfun()::_ZN7footask7taskfunEv.Frame*)':
c3iikget.C:13:37: internal compiler error: in is_this_parameter, at
cp/semantics.c:11273
   13 |   footask taskfun() { co_await foo; }
  | ^
0x710dd0 is_this_parameter(tree_node*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/semantics.c:11273
0x96f9ec potential_constant_expression_1
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:8393
0x970c87 potential_constant_expression_1
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:8200
0x971aa6 potential_constant_expression_1(tree_node*, bool, bool, bool, int)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:9062
0x971aa6 is_constant_expression(tree_node*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:9119
0x971aa6 is_nondependent_constant_expression(tree_node*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:9156
0x9729ff maybe_constant_value(tree_node*, tree_node*, bool)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/constexpr.c:7628
0x9a042d cp_fold
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2656
0x9a0d0c cp_fold_maybe_rvalue(tree_node*, bool)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2127
0x99ee00 cp_fold
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2361
0x9a0d0c cp_fold_maybe_rvalue(tree_node*, bool)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2127
0x99f6b0 cp_fold_rvalue(tree_node*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2150
0x99f6b0 cp_fold
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2253
0x9a0d0c cp_fold_maybe_rvalue(tree_node*, bool)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2127
0x99f1ed cp_fold_rvalue(tree_node*)
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2150
0x99f1ed cp_fold
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:2509
0x9a06de cp_fold_r
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/cp/cp-gimplify.c:847
0x146728a walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/tree.c:11016
0x1467ab8 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/tree.c:11132
0x1467ab8 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
   
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20210919/work/gcc-12-20210919/gcc/tree.c:11132

[Bug c++/102488] New: ICE with default constexpr operator== on class with bitfield

2021-09-26 Thread luc.briand35 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102488

Bug ID: 102488
   Summary: ICE with default constexpr operator== on class with
bitfield
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: luc.briand35 at gmail dot com
  Target Milestone: ---

Hello,

The following code causes an ICE, the culprit being the constexpr operator==.
This only happens on the trunk version (12.0) of gcc.
When the bitfield is smaller than 8 bits, there is no crash.

Godbolt link: https://gcc.godbolt.org/z/eh57da3cE


struct A 
{
unsigned int a : 8; /* no crash for sizes from 1 to 7 and 32 */

constexpr bool operator==(const A&) const = default;
};


int main()
{
A a{}, b{};

return a == b;
}


Full error message (command: "gcc -c ./test_bug.cpp -std=c++2a"):


./test_bug.cpp: In member function ‘constexpr bool A::operator==(const A&)
const’:
./test_bug.cpp:7:20: error: type mismatch in ‘component_ref’
7 | constexpr bool operator==(const A&) const = default;
  |^~~~
unsigned int

unsigned char

_1 = this->a;
./test_bug.cpp:7:20: error: type mismatch in ‘component_ref’
unsigned int

unsigned char

_2 = D.2080->a;
./test_bug.cpp:7:20: internal compiler error: ‘verify_gimple’ failed
0x100bd2d verify_gimple_in_seq(gimple*)
../../gcc-trunk/gcc/tree-cfg.c:5097
0xd40216 gimplify_body(tree_node*, bool)
../../gcc-trunk/gcc/gimplify.c:14876
0xd403f3 gimplify_function_tree(tree_node*)
../../gcc-trunk/gcc/gimplify.c:14966
0xb88907 cgraph_node::analyze()
../../gcc-trunk/gcc/cgraphunit.c:669
0xb8b44f analyze_functions
../../gcc-trunk/gcc/cgraphunit.c:1210
0xb8c032 symbol_table::finalize_compilation_unit()
../../gcc-trunk/gcc/cgraphunit.c:2956
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug c/94726] [10/11/12 Regression] ICE in uniform_vector_p, at tree.c:11214 since r10-2089-g21caa1a2649d586f

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94726

Andrew Pinski  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2021-Septemb
   ||er/580264.html
   Keywords||patch

--- Comment #10 from Andrew Pinski  ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580264.html

[Bug middle-end/98713] Failure to generate branch version of abs if user requested it

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98713

--- Comment #10 from Andrew Pinski  ---
(In reply to Martin Liška from comment #1)
> I think it's fixed since r11-2588-gc072fd236dc08f99.

Oh this changed from the shift/xor/sub to using cmov ...

[Bug middle-end/98713] Failure to generate branch version of abs if user requested it

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98713

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64
   Keywords||missed-optimization

--- Comment #9 from Andrew Pinski  ---
This is basically a bug in PHI-OPT which assumes ABS_EXPR will always generate
better code than the conditional case.

Hmm, why is x86_64 using a cmov here for ABS_EXPR instead of:
Shift
xor
sub
?
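
For reference, a sketch of the two source-level forms being compared, with
hypothetical function names. The first is the conditional form a user might
write when asking for a branch; the second is the shift/xor/sub sequence
mentioned above. It assumes a 32-bit int and an arithmetic right shift of
negative values, which GCC provides.

```cpp
#include <cstdint>

// Conditional form: what PHI-OPT turns into ABS_EXPR.
inline std::int32_t abs_conditional(std::int32_t x)
{
    return x < 0 ? -x : x;
}

// Branch-free form: shift, xor, sub.
// mask is all ones for negative x and zero otherwise.
// Like the conditional form, this overflows for INT32_MIN.
inline std::int32_t abs_shift_xor_sub(std::int32_t x)
{
    std::int32_t mask = x >> 31;   // shift
    return (x ^ mask) - mask;      // xor, sub
}
```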

[Bug middle-end/98484] missing -Wstringop-overflow on invalid accesses to the same object by distinct functions

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98484

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-26
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed. -Wsystem-headers enables all of the warnings ...

What is interesting is in GCC 10, we don't even get the warning for g0 without
-Wsystem-headers.
In GCC 9 -Wsystem-headers does not enable the warning for g1 or g2 either.

[Bug middle-end/98406] missing -Wmaybe-uninitialized passing a member by reference

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98406

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||23384
 Ever confirmed|0   |1
   Keywords||alias
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-26

--- Comment #1 from Andrew Pinski  ---
I think the no warning for g2 is correct as f could in theory get to 

g1 should have a warning, but that requires a flow-sensitive escaped set, which
is PR 23384.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23384
[Bug 23384] escaped set should be flow sensitive

[Bug jit/101491] [11/12 regression] /usr/local/include/libgccjit++.h conflicts between different GCC installations

2021-09-26 Thread gerald at pfeifer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101491

--- Comment #7 from Gerald Pfeifer  ---
(In reply to David Malcolm from comment #2)
> I'm using $(includedir).  What should I be using?  Thanks

(In reply to Richard Biener from comment #5)
> I think a more appropriate place would be where we also install 
> OpenMP omp.h to (libsubinclude_HEADERS)

David, any chance you can have a look following this recommendation?

It'd be good for 11.3 to address this - thank you!

[Bug bootstrap/81315] powerpc64 vs building lang/gcc7-devel (on FreeBSD head): xgcc gets segmentation fault

2021-09-26 Thread gerald at pfeifer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81315

Gerald Pfeifer  changed:

   What|Removed |Added

 Resolution|--- |WORKSFORME
 Status|WAITING |RESOLVED

--- Comment #8 from Gerald Pfeifer  ---
Given GCC 7 is now two generations behind the latest supported release
stream (GCC 9) and the lang/gcc*-devel and regular lang/gcc* ports seem
to build fine on FreeBSD/powerpc* let me close this.

[Bug middle-end/69183] ICE when using OpenMP PRIVATE keyword in OMP DO loop not explicitly encapsulated in OMP PARALLEL region

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69183

Andrew Pinski  changed:

   What|Removed |Added

 CC||gerhard.steinmetz.fortran@t
   ||-online.de

--- Comment #11 from Andrew Pinski  ---
*** Bug 78368 has been marked as a duplicate of this bug. ***

[Bug middle-end/78368] ICE in lookup_decl, at omp-low.c:1071

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78368

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
Dup of bug 69183.

*** This bug has been marked as a duplicate of bug 69183 ***

[Bug middle-end/69183] ICE when using OpenMP PRIVATE keyword in OMP DO loop not explicitly encapsulated in OMP PARALLEL region

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69183

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |5.5
   Keywords||ice-on-valid-code, openacc
  Known to work||5.5.0, 6.3.0

[Bug middle-end/102487] New: __builtin_popcount(y&3) is not optimized to (y&1)+((y&2)>>1) if don't have popcount optab (or expensive one)

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102487

Bug ID: 102487
   Summary: __builtin_popcount(y&3) is not optimized to
(y&1)+((y&2)>>1) if don't have popcount optab (or
expensive one)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: x86_64-*-*

Take:
int f(unsigned y)
{
  return __builtin_popcount(y&3);
}

On x86_64 (without popcount optab enabled) this should be optimized to just:
movl %edi, %eax
shrl %edi
andl $1, %edi
andl $1, %eax
addl %edi, %eax
ret

But we currently get:

.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
andl $3, %edi
call __popcountdi2
addq $8, %rsp
.cfi_def_cfa_offset 8
ret

For aarch64 we currently get:

and x0, x0, 3
fmov d0, x0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret

vs:

and w1, w0, 1
ubfx x0, x0, 1, 1
add w0, w0, w1
ret

The second one is much much cheaper as you don't need to move between register
sets.
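
The equivalence from the summary can be sketched at the source level as
follows. The function names are hypothetical; the open-coded version is the
(y&1)+((y&2)>>1) expression from the bug title, which needs neither a popcount
instruction nor a call to __popcountdi2.

```cpp
// Form from the testcase: population count of the two low bits.
inline int popcount_low2_builtin(unsigned y)
{
    return __builtin_popcount(y & 3);
}

// Open-coded replacement: add bit 0 and bit 1 individually.
inline int popcount_low2_open_coded(unsigned y)
{
    return static_cast<int>((y & 1) + ((y & 2) >> 1));
}
```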

[Bug tree-optimization/102486] New: __builtin_popcount(y&-y) is not optimized to 1

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102486

Bug ID: 102486
   Summary: __builtin_popcount(y&-y) is not optimized to 1
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
int f(unsigned y)
{
  return __builtin_popcount(y&-y);
}

This should be optimized to just 1.

[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

--- Comment #7 from Andrew Pinski  ---
Note we would not need to match y&-y specifically if we kept track of the
popcount of the SSA_NAME, but we don't have that yet.

[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-09-26
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW

--- Comment #6 from Andrew Pinski  ---
Confirmed.


I think x/(y&-y) should be expanded as x >> ctz (y&-y) + 1 (if ctz is an
opcode) but this should be done only at expand time (unless we get a "lower"
gimple phase).
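
A sketch of the basic identity behind this, for unsigned operands and nonzero
y only: y & -y is a power of two, so dividing by it equals a right shift by
the number of trailing zero bits of y. This covers just the plain identity,
without any extra shift amount that a particular caller (such as the
HAKMEM 175 sequence) may fold in. The function names are hypothetical;
__builtin_ctz is the GCC builtin and is undefined for a zero argument.

```cpp
// Division by the isolated lowest set bit of y. Requires y != 0.
inline unsigned div_by_lowest_bit(unsigned x, unsigned y)
{
    return x / (y & -y);
}

// Equivalent shift form. Requires y != 0.
// ctz(y & -y) equals ctz(y), so the mask is not needed here.
inline unsigned shift_by_ctz(unsigned x, unsigned y)
{
    return x >> __builtin_ctz(y);
}
```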

[Bug middle-end/97425] bogus array bounds in -Warray-bounds for a function array parameter

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97425

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug middle-end/97393] missing -Walloca-larger-than on an excessive range

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97393

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
 Ever confirmed|0   |1
   Last reconfirmed||2021-09-26
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug middle-end/97374] missing essential detail in array parameter overflow warning

2021-09-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97374

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #3 from Andrew Pinski  ---
Confirmed.

[Bug target/102473] [12 Regression] 521.wrf_r 5% slower at -Ofast and generic x86_64 tuning after r12-3426-g8f323c712ea76c

2021-09-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102473

--- Comment #5 from Hongtao.liu  ---
Regression also exists for -march=x86-64 -msse3 -mtune=generic -Ofast.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2021-09-26 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Chinoune  changed:

   What|Removed |Added

 CC||mehdi.chinoune at hotmail dot com

--- Comment #32 from Chinoune  ---
Still present in GCC 11.2.0