[Bug tree-optimization/107891] Redudant "double" permutation from SLP vectorization (PR97832)

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107891

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||rguenth at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org
 Blocks||53947
  Component|middle-end  |tree-optimization
Summary|Redudant "double"   |Redudant "double"
   |permutation from PR97832|permutation from SLP
   ||vectorization (PR97832)
   Last reconfirmed||2022-11-28
 Status|UNCONFIRMED |NEW


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/107888] [12/13 Regression] Missed min/max transformation in phiopt due to VRP

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107888

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.3

--- Comment #1 from Richard Biener  ---
which means we fail to optimize a > b ? 1 : b as well, no?

[Bug c/107890] UB on integer overflow impacts code flow

2022-11-27 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107890

Martin Uecker  changed:

   What|Removed |Added

 CC||muecker at gwdg dot de

--- Comment #3 from Martin Uecker  ---

Of course, instead of using the standard as an excuse, we could also try to
make the compiler less of a footgun. 

Even if this is standard conforming, it is still a severe usability issue with
safety implications and I do not think we should simply close such bugs.

[Bug demangler/107884] H8/300: cp-demangle.c fix warning related demangle.h

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107884

Richard Biener  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org
 Target||h8

--- Comment #3 from Richard Biener  ---
Since I think there are no ABI constraints here, simply using the apparently
unused bits to get them to fit into 16 bits looks possible?

Supposedly C defines literal suffixes for int32_t?  Otherwise using (1L << 17)
might work as well here.
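
The two suggestions above can be sketched as follows. On a target where int is 16 bits wide, 1 << 17 would overflow, while long is at least 32 bits on any conforming C implementation. C99's <stdint.h> also provides the INT32_C macro, which produces a constant of type int_least32_t. The function names are hypothetical and the value is only an illustration; the actual demangle.h constants are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical flag value, used only for illustration. */

/* long is at least 32 bits wide, so the shift fits even if int is 16 bits. */
long flag_via_long(void)
{
    return 1L << 17;
}

/* INT32_C yields a constant of type int_least32_t (C99 <stdint.h>). */
int_least32_t flag_via_int32_c(void)
{
    return INT32_C(1) << 17;
}
```

Both functions return the same value; which spelling fits better depends on whether the header may rely on <stdint.h>.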

[Bug rtl-optimization/107892] New: Unnecessary move between ymm registers in loop using AVX2 intrinsic

2022-11-27 Thread ebiggers3 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107892

Bug ID: 107892
   Summary: Unnecessary move between ymm registers in loop using
AVX2 intrinsic
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ebiggers3 at gmail dot com
  Target Milestone: ---

To reproduce with the latest trunk, compile the following .c file on x86_64 at
-O2:

#include <immintrin.h>

int __attribute__((target("avx2")))
sum_ints(const __m256i *p, size_t n)
{
    __m256i a = _mm256_setzero_si256();
    __m128i b;

    do {
        a = _mm256_add_epi32(a, *p++);
    } while (--n);

    b = _mm_add_epi32(_mm256_extracti128_si256(a, 0),
                      _mm256_extracti128_si256(a, 1));
    b = _mm_add_epi32(b, _mm_shuffle_epi32(b, 0x31));
    b = _mm_add_epi32(b, _mm_shuffle_epi32(b, 0x02));
    return _mm_cvtsi128_si32(b);
}

The assembly that gcc generates is:

 :
   0:   c5 f1 ef c9             vpxor  %xmm1,%xmm1,%xmm1
   4:   0f 1f 40 00             nopl   0x0(%rax)
   8:   c5 f5 fe 07             vpaddd (%rdi),%ymm1,%ymm0
   c:   48 83 c7 20             add    $0x20,%rdi
  10:   c5 fd 6f c8             vmovdqa %ymm0,%ymm1
  14:   48 83 ee 01             sub    $0x1,%rsi
  18:   75 ee                   jne    8
  1a:   c4 e3 7d 39 c1 01       vextracti128 $0x1,%ymm0,%xmm1
  20:   c5 f9 fe c1             vpaddd %xmm1,%xmm0,%xmm0
  24:   c5 f9 70 c8 31          vpshufd $0x31,%xmm0,%xmm1
  29:   c5 f1 fe c8             vpaddd %xmm0,%xmm1,%xmm1
  2d:   c5 f9 70 c1 02          vpshufd $0x2,%xmm1,%xmm0
  32:   c5 f9 fe c1             vpaddd %xmm1,%xmm0,%xmm0
  36:   c5 f9 7e c0             vmovd  %xmm0,%eax
  3a:   c5 f8 77                vzeroupper
  3d:   c3                      ret

The bug is that the inner loop contains an unnecessary vmovdqa:

   8:   vpaddd (%rdi),%ymm1,%ymm0
        add    $0x20,%rdi
        vmovdqa %ymm0,%ymm1
        sub    $0x1,%rsi
        jne    8

It should look like the following instead:

   8:   vpaddd (%rdi),%ymm0,%ymm0
        add    $0x20,%rdi
        sub    $0x1,%rsi
        jne    8

Strangely, the bug goes away if the __v8si type is used instead of __m256i and
the addition is done using "+=" instead of _mm256_add_epi32():

int __attribute__((target("avx2")))
sum_ints_good(const __v8si *p, size_t n)
{
    __v8si a = {};
    __m128i b;

    do {
        a += *p++;
    } while (--n);

    b = _mm_add_epi32(_mm256_extracti128_si256((__m256i)a, 0),
                      _mm256_extracti128_si256((__m256i)a, 1));
    b = _mm_add_epi32(b, _mm_shuffle_epi32(b, 0x31));
    b = _mm_add_epi32(b, _mm_shuffle_epi32(b, 0x02));
    return _mm_cvtsi128_si32(b);
}

In the bad version, I noticed that the RTL initially has two separate insns for
'a += *p': one to do the addition and write the result to a new pseudo
register, and one to convert the value from mode V8SI to V4DI and assign it to
the original pseudo register.  These two separate insns never get combined. 
(That sort of explains why the bug isn't seen with the __v8si and += method;
gcc doesn't do a type conversion with that method.)  So, I'm wondering if the
bug is in the instruction combining pass.  Or perhaps the RTL should never have
had two separate insns in the first place?
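
When comparing the two variants above, a scalar reference may be handy for checking that both still compute the same value. The sketch below is an assumption-laden model, not part of the report: it treats the input as a flat array of 32-bit lanes (8 per vector) and relies on the final shuffle-and-add sequence reducing all lanes into lane 0 (0x31 pairs lane 1 with lane 0 and lane 3 with lane 2; 0x02 then adds lane 2 into lane 0). The name sum_ints_ref is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of sum_ints: p holds n groups of 8 32-bit lanes.
   Additions wrap modulo 2^32, like vpaddd does per lane. */
int32_t sum_ints_ref(const int32_t *p, size_t n)
{
    uint32_t lanes[8] = {0};
    uint32_t total = 0;

    /* The do/while loop in the report also requires n > 0. */
    assert(n > 0);

    /* Vertical accumulation, one accumulator per lane. */
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < 8; k++)
            lanes[k] += (uint32_t)p[8 * i + k];

    /* Horizontal reduction of the 8 lane accumulators. */
    for (size_t k = 0; k < 8; k++)
        total += lanes[k];

    /* Conversion of out-of-range values is implementation-defined;
       GCC wraps modulo 2^32. */
    return (int32_t)total;
}
```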

[Bug analyzer/107882] [13 Regression] ICE in get_last_bit_offset, at analyzer/store.h:255 since 13-2582-g0ea5e3f4542832b8

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107882

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug tree-optimization/107879] [13 Regression] ffmpeg-4 test suite fails on FPU arithmetics

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107879

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/107876] [13 Regression] ICE in verify_dominators, at dominance.cc:1184 (error: dominator of 4 should be 14, not 16) since r13-3749-g7314b98b1bcd382c

2022-11-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107876

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
Mine.

[Bug target/107863] [10/11/12/13 Regression] ICE with unrecognizable insn when using -funsigned-char with some SSE/AVX builtins

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107863

--- Comment #9 from Hongtao.liu  ---
expand_expr_real_1 generates (const_int 255) without considering the target
mode.
I guess it's on purpose, so I'll leave that alone and only change the expander
in the backend. After applying convert_modes to (const_int 255), it's
transformed to (const_int -1) which should fix the issue.


---cut from expand_expr_real_1--
    case INTEGER_CST:
      {
        /* Given that TYPE_PRECISION (type) is not always equal to
           GET_MODE_PRECISION (TYPE_MODE (type)), we need to extend from
           the former to the latter according to the signedness of the
           type.  */
        scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (type);
        temp = immed_wide_int_const
          (wi::to_wide (exp, GET_MODE_PRECISION (int_mode)), int_mode);
        return temp;
      }
---cut ends
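
As background for the (const_int 255) versus (const_int -1) point: RTL CONST_INTs are kept sign-extended from the precision of the mode they are used in, so the all-ones byte is written as -1 in an 8-bit mode. The following is only a standalone model of that canonicalization; sext_to_precision is a hypothetical helper, not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `precision` bits of value to 64 bits.
   Hypothetical helper that models how a constant is canonicalized
   for a mode of the given precision. */
int64_t sext_to_precision(int64_t value, unsigned precision)
{
    assert(precision >= 1 && precision <= 64);
    if (precision == 64)
        return value;

    uint64_t mask = (UINT64_C(1) << precision) - 1;
    uint64_t sign = UINT64_C(1) << (precision - 1);
    uint64_t low = (uint64_t)value & mask;

    /* Flipping and then subtracting the sign bit propagates it upward. */
    return (int64_t)((low ^ sign) - sign);
}
```

With this model, 255 at precision 8 becomes -1, while 255 at precision 16 stays 255.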


Proposed patch:

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 0373c3614a4..c639ee3a9f7 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -12475,7 +12475,7 @@ ix86_expand_vec_set_builtin (tree exp)
   op1 = expand_expr (arg1, NULL_RTX, mode1, EXPAND_NORMAL);
   elt = get_element_number (TREE_TYPE (arg0), arg2);

-  if (GET_MODE (op1) != mode1 && GET_MODE (op1) != VOIDmode)
+  if (GET_MODE (op1) != mode1)
 op1 = convert_modes (mode1, GET_MODE (op1), op1, true);

   op0 = force_reg (tmode, op0);

[Bug middle-end/107891] Redudant "double" permutation from PR97832

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107891

--- Comment #1 from Hongtao.liu  ---
Comment 25 from PR97832:

I guess that's possible but the SLP vectorizer has a permute optimization
phase (and SLP discovery itself), it would be nice to see why the former
doesn't elide the permutes here.

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #26 from Hongtao.liu  ---

> I guess that's possible but the SLP vectorizer has a permute optimization
> phase (and SLP discovery itself), it would be nice to see why the former
> doesn't elide the permutes here.

I've opened PR107891 for it.

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-27 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #25 from rguenther at suse dot de  ---
On Mon, 28 Nov 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
> 
> --- Comment #24 from Hongtao.liu  ---
>   _233 = {f_im_36, f_re_35, f_re_35, f_re_35};
>   _217 = {f_re_35, f_im_36, f_im_36, f_im_36};
> ...
> vect_x_re_55.15_227 = VEC_PERM_EXPR  { 0, 5, 6, 7 }>;
>   vect_x_re_55.23_211 = VEC_PERM_EXPR  vect_x_im_61.14_228, { 0, 5, 6, 7 }>;
> ...
>   vect_y_re_69.17_224 = .FNMA (vect_x_re_55.15_227, _233, vect_y_re_63.9_237);
>   vect_y_re_69.25_208 = .FNMA (vect_x_re_55.23_211, _217, 
> vect_y_re_69.17_224);
> 
> is equal to
> 
>   _233 = {f_im_36,f_im_36, f_im_36, f_im_36}
>   _217 = {f_re_35, f_re_35, f_re_35, f_re_35};
> ...
>   vect_y_re_69.17_224 = .FNMA (vect_x_im_61.14_228, _233, vect_y_re_63.9_237)
>   vect_y_re_69.25_208 = .FNMA (vect_x_im_61.13_230, _217, vect_y_re_69.17_224)
> 
> A simplification in match.pd?

I guess that's possible but the SLP vectorizer has a permute optimization
phase (and SLP discovery itself), it would be nice to see why the former
doesn't elide the permutes here.

[Bug c++/107889] Incorrect parsing of qualified friend function returning decltype(auto)

2022-11-27 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107889

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-11-28
 CC||marxin at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
Clang accepts the code.

[Bug fortran/107872] ICE on recursive DT with DTIO since r7-4096-gbf9f15ee55f5b291

2022-11-27 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107872

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org,
   ||pault at gcc dot gnu.org
Summary|ICE on recursive DT with|ICE on recursive DT with
   |DTIO|DTIO since
   ||r7-4096-gbf9f15ee55f5b291

--- Comment #2 from Martin Liška  ---
Started likely with r7-4096-gbf9f15ee55f5b291.

[Bug analyzer/107882] [13 Regression] ICE in get_last_bit_offset, at analyzer/store.h:255

2022-11-27 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107882

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org,
   ||tlange at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-11-28
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Started with r13-2582-g0ea5e3f4542832b8.

[Bug tree-optimization/107876] [13 Regression] ICE in verify_dominators, at dominance.cc:1184 (error: dominator of 4 should be 14, not 16) since r13-3749-g7314b98b1bcd382c

2022-11-27 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107876

Martin Liška  changed:

   What|Removed |Added

Summary|[13 Regression] ICE in  |[13 Regression] ICE in
   |verify_dominators, at   |verify_dominators, at
   |dominance.cc:1184 (error:   |dominance.cc:1184 (error:
   |dominator of 4 should be|dominator of 4 should be
   |14, not 16) |14, not 16) since
   ||r13-3749-g7314b98b1bcd382c
 CC||marxin at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r13-3749-g7314b98b1bcd382c.

[Bug middle-end/107891] New: Redudant "double" permutation from PR97832

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107891

Bug ID: 107891
   Summary: Redudant "double" permutation from PR97832
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: crazylht at gmail dot com
  Target Milestone: ---

#include <stddef.h>

void foo1x1(double* restrict y, const double* restrict x, int clen)
{
  int xi = clen & 2;
  double f_re = x[0+xi+0];
  double f_im = x[4+xi+0];
  ptrdiff_t clen2 = (clen+xi) * 2;
  //#pragma GCC unroll 0
  for (ptrdiff_t c = 0; c < clen2; c += 8) {
    // y[c] = y[c] - x[c]*conj(f);
    //#pragma GCC  unroll 4
    for (ptrdiff_t k = 0; k < 4; ++k) {
      double x_re = x[c+0+k];
      double x_im = x[c+4+k];
      double y_re = y[c+0+k];
      double y_im = y[c+4+k];
      y_re = y_re - x_re * f_re - x_im * f_im;
      y_im = y_im + x_re * f_im - x_im * f_re;
      y[c+0+k] = y_re;
      y[c+4+k] = y_im;
    }
  }
}

-Ofast -mavx2 -mfma generates an extra blendpd compared to -O3 -mavx2 -mfma,
and the blendpd is redundant since there are "double" permutations for the
multiplication operands in the FMA.

They're computing the same thing since we also do the same "permutation" for
the invariants f_re and f_im; can we eliminate that in the vectorizer?

  _232 = {f_im_36, f_im_36, f_im_36, f_im_36};
  _231 = {f_im_36, f_re_35, f_re_35, f_re_35}; --- here
  _216 = {f_re_35, f_re_35, f_re_35, f_re_35};
  _215 = {f_re_35, f_im_36, f_im_36, f_im_36}; -- and here.
  ivtmp.36_221 = (unsigned long) y_41(D);
  ivtmp.38_61 = (unsigned long) x_33(D);

   [local count: 214748368]:
  # ivtmp.32_66 = PHI 
  # ivtmp.36_64 = PHI 
  # ivtmp.38_220 = PHI 
  # DEBUG c => NULL
  # DEBUG k => 0
  # DEBUG BEGIN_STMT
  # DEBUG BEGIN_STMT
  # DEBUG D#78 => D#79 * 8
  # DEBUG D#77 => x_33(D) + D#78
  _62 = (void *) ivtmp.38_220;
  vect_x_im_61.13_228 = MEM  [(const double *)_62];
  vect_x_im_61.14_226 = MEM  [(const double *)_62 + 32B];
  vect_x_re_55.15_225 = VEC_PERM_EXPR ; - here. 
  vect_x_re_55.23_209 = VEC_PERM_EXPR ;  - here
  # DEBUG D#76 => *D#77
  # DEBUG x_re => D#76
  # DEBUG BEGIN_STMT
  # DEBUG D#74 => (long unsigned int) D#75
  # DEBUG D#73 => D#74 * 8
  # DEBUG D#72 => x_33(D) + D#73
  # DEBUG D#71 => *D#72
  # DEBUG x_im => D#71
  # DEBUG BEGIN_STMT
  # DEBUG D#70 => y_41(D) + D#78
  _59 = (void *) ivtmp.36_64;
  vect_y_re_63.9_235 = MEM  [(double *)_59];
  vect_y_re_63.10_233 = MEM  [(double *)_59 + 32B];
  vect__42.18_219 = .FMA (vect_x_im_61.13_228, _232, vect_y_re_63.10_233);
  vect_y_re_69.17_222 = .FNMA (vect_x_re_55.15_225, _231, vect_y_re_63.9_235);
  vect_y_re_69.25_206 = .FNMA (vect_x_re_55.23_209, _215, vect_y_re_69.17_222);
  vect_y_re_69.25_205 = .FNMA (_216, vect_x_im_61.14_226, vect__42.18_219);




and

  _233 = {f_im_36, f_re_35, f_re_35, f_re_35};
  _217 = {f_re_35, f_im_36, f_im_36, f_im_36};
...
vect_x_re_55.15_227 = VEC_PERM_EXPR ;
  vect_x_re_55.23_211 = VEC_PERM_EXPR ;
...
  vect_y_re_69.17_224 = .FNMA (vect_x_re_55.15_227, _233, vect_y_re_63.9_237);
  vect_y_re_69.25_208 = .FNMA (vect_x_re_55.23_211, _217, vect_y_re_69.17_224);

is equal to

  _233 = {f_im_36,f_im_36, f_im_36, f_im_36}
  _217 = {f_re_35, f_re_35, f_re_35, f_re_35};
...
  vect_y_re_69.17_224 = .FNMA (vect_x_im_61.14_228, _233, vect_y_re_63.9_237)
  vect_y_re_69.25_208 = .FNMA (vect_x_im_61.13_230, _217, vect_y_re_69.17_224)

A simplification in match.pd?
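
The claimed equivalence can be modelled with plain arrays. The sketch below is an illustration under assumptions, since the VEC_PERM_EXPR operands are elided in the dump: p stands for the vector multiplied by f_im and q for the one multiplied by f_re in the unpermuted form, the selector { 0, 5, 6, 7 } is taken to pick lane 0 from its first operand and lanes 1..3 from its second, and .FNMA (a, b, c) is modelled as c - a * b without fusing. All function names are hypothetical.

```c
enum { LANES = 4 };

/* Model of .FNMA (a, b, c): out = c - a * b per lane (not fused). */
static void fnma(const double *a, const double *b, const double *c,
                 double *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = c[i] - a[i] * b[i];
}

/* Form with blended loads and correspondingly blended invariants. */
void fnma_blended(const double *p, const double *q, double f_re,
                  double f_im, const double *y, double *out)
{
    double a[LANES], b[LANES], ca[LANES], cb[LANES], t[LANES];

    for (int i = 0; i < LANES; i++) {
        a[i] = (i == 0) ? p[i] : q[i];   /* assumed { 0, 5, 6, 7 } blend */
        b[i] = (i == 0) ? q[i] : p[i];   /* the complementary blend */
        ca[i] = (i == 0) ? f_im : f_re;  /* {f_im, f_re, f_re, f_re} */
        cb[i] = (i == 0) ? f_re : f_im;  /* {f_re, f_im, f_im, f_im} */
    }
    fnma(a, ca, y, t);
    fnma(b, cb, t, out);
}

/* Form without blends: splat invariants, unpermuted loads. */
void fnma_direct(const double *p, const double *q, double f_re,
                 double f_im, const double *y, double *out)
{
    double cim[LANES], cre[LANES], t[LANES];

    for (int i = 0; i < LANES; i++) {
        cim[i] = f_im;
        cre[i] = f_re;
    }
    fnma(p, cim, y, t);
    fnma(q, cre, t, out);
}
```

In lanes 1..3 the two subtractions happen in the opposite order in the two forms, so the results are only guaranteed to match bit for bit when reassociation is harmless (as with small integer-valued inputs); this may be why the question only arises at -Ofast.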

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #24 from Hongtao.liu  ---
  _233 = {f_im_36, f_re_35, f_re_35, f_re_35};
  _217 = {f_re_35, f_im_36, f_im_36, f_im_36};
...
vect_x_re_55.15_227 = VEC_PERM_EXPR ;
  vect_x_re_55.23_211 = VEC_PERM_EXPR ;
...
  vect_y_re_69.17_224 = .FNMA (vect_x_re_55.15_227, _233, vect_y_re_63.9_237);
  vect_y_re_69.25_208 = .FNMA (vect_x_re_55.23_211, _217, vect_y_re_69.17_224);

is equal to

  _233 = {f_im_36,f_im_36, f_im_36, f_im_36}
  _217 = {f_re_35, f_re_35, f_re_35, f_re_35};
...
  vect_y_re_69.17_224 = .FNMA (vect_x_im_61.14_228, _233, vect_y_re_63.9_237)
  vect_y_re_69.25_208 = .FNMA (vect_x_im_61.13_230, _217, vect_y_re_69.17_224)

A simplification in match.pd?

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #23 from Hongtao.liu  ---

> the blends do not look like no-ops so I wonder if this is really computing
> the same thing ... (it swaps lane 0 from the two loads from x but not the
> stores)

They're computing the same thing since we also do the same "permutation" for
the invariants f_re and f_im; can we eliminate that in the vectorizer?

  _232 = {f_im_36, f_im_36, f_im_36, f_im_36};
  _231 = {f_im_36, f_re_35, f_re_35, f_re_35}; --- here
  _216 = {f_re_35, f_re_35, f_re_35, f_re_35};
  _215 = {f_re_35, f_im_36, f_im_36, f_im_36}; -- and here.
  ivtmp.36_221 = (unsigned long) y_41(D);
  ivtmp.38_61 = (unsigned long) x_33(D);

   [local count: 214748368]:
  # ivtmp.32_66 = PHI 
  # ivtmp.36_64 = PHI 
  # ivtmp.38_220 = PHI 
  # DEBUG c => NULL
  # DEBUG k => 0
  # DEBUG BEGIN_STMT
  # DEBUG BEGIN_STMT
  # DEBUG D#78 => D#79 * 8
  # DEBUG D#77 => x_33(D) + D#78
  _62 = (void *) ivtmp.38_220;
  vect_x_im_61.13_228 = MEM  [(const double *)_62];
  vect_x_im_61.14_226 = MEM  [(const double *)_62 + 32B];
  vect_x_re_55.15_225 = VEC_PERM_EXPR ;
  vect_x_re_55.23_209 = VEC_PERM_EXPR ;
  # DEBUG D#76 => *D#77
  # DEBUG x_re => D#76
  # DEBUG BEGIN_STMT
  # DEBUG D#74 => (long unsigned int) D#75
  # DEBUG D#73 => D#74 * 8
  # DEBUG D#72 => x_33(D) + D#73
  # DEBUG D#71 => *D#72
  # DEBUG x_im => D#71
  # DEBUG BEGIN_STMT
  # DEBUG D#70 => y_41(D) + D#78
  _59 = (void *) ivtmp.36_64;
  vect_y_re_63.9_235 = MEM  [(double *)_59];
  vect_y_re_63.10_233 = MEM  [(double *)_59 + 32B];
  vect__42.18_219 = .FMA (vect_x_im_61.13_228, _232, vect_y_re_63.10_233);
  vect_y_re_69.17_222 = .FNMA (vect_x_re_55.15_225, _231, vect_y_re_63.9_235);
  vect_y_re_69.25_206 = .FNMA (vect_x_re_55.23_209, _215, vect_y_re_69.17_222);
  vect_y_re_69.25_205 = .FNMA (_216, vect_x_im_61.14_226, vect__42.18_219);

[Bug target/104271] [12/13 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-11-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

--- Comment #12 from cuilili  ---
This regression is caused by a store-forwarding issue. By inlining, we
eliminate the two redundant pairs of loads and stores that trigger it.

This regression has been fixed by 

https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a

[Bug debug/105145] dropped DWARF location information at -O1/-O2/-O3 upon ftree-dse

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105145

--- Comment #2 from Andrew Pinski  ---
*** Bug 105248 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/105248] gimple level DSE does not add DEBUG statement when deleting store to ADDRESSABLE local decl

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105248

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Andrew Pinski  ---
The problem is exactly the same as PR 105145 so closing as a dup.

*** This bug has been marked as a duplicate of bug 105145 ***

[Bug middle-end/107494] -ffinite-loops does not show it is enabled with --help by default for C++11+

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107494

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
Summary|-ffinite-loops is not   |-ffinite-loops does not
   |enable by default   |show it is enabled with
   ||--help by default for
   ||C++11+
   Last reconfirmed||2022-11-28
 Status|UNCONFIRMED |NEW

--- Comment #2 from Andrew Pinski  ---
Confirmed.

  /* Exit early if we can (e.g. -help).  */
  if (!exit_after_options)
    {
      /* Just in case lang_hooks.post_options ends up calling a debug_hook.
         This can happen with incorrect pre-processed input. */
      debug_hooks = &do_nothing_debug_hooks;
      /* Allow the front end to perform consistency checks and do further
         initialization based on the command line options.  This hook also
         sets the original filename if appropriate (e.g. foo.i -> foo.c)
         so we can correctly initialize debug output.  */
      bool no_backend = lang_hooks.post_options (&main_input_filename);


So the language hook that does the SET_OPTION_IF_UNSET is not called at all.

[Bug tree-optimization/91882] boolean XOR tautology missed optimisation

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91882

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
Hmm, the code of reassociation is somewhat hard to follow. So I am not going to
work on this.

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #8 from Andrew Pinski  ---
Hmm, the code of reassociation is somewhat hard to follow. So I am not going to
work on this.

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #7 from Andrew Pinski  ---
Mine.

[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2022-11-28
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

--- Comment #1 from Andrew Pinski  ---
There is some discussion about this in bug 107881 comment #6 on how to
implement this inside reassociation.

I am going to try to figure out how to handle this there.
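
The identity in the bug title is easy to confirm exhaustively, since bool has only two values: a > b holds exactly when a is true and b is false, so OR-ing with b gives a | b. A minimal sketch; the function name is hypothetical.

```c
#include <stdbool.h>

/* Returns true if (a > b) | b equals a | b for all bool inputs. */
bool gt_or_identity_holds(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            bool x = a, y = b;
            bool lhs = (x > y) | y;
            bool rhs = x | y;
            if (lhs != rhs)
                return false;
        }
    }
    return true;
}
```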

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

--- Comment #6 from Andrew Pinski  ---
I was thinking about having reassociation change bool == bool, bool < bool,
and bool <= bool into ~(bool ^ bool), !bool & bool, and !bool | bool as a
"linearizing" step, so that reassociation can handle the rest (with the xor
patch still? or we change ^ to how we expand xor like it is done in the patch),
and then, when finalizing, simplify back to ==, <, and <= (and ^).
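
The three rewrites proposed above can be checked by brute force over the four input combinations. This sketch only verifies the boolean identities; it says nothing about how reassociation would implement them. In C the operands are promoted to int, so logical ! is used where GIMPLE would use ~ on a one-bit boolean. The function name is hypothetical.

```c
#include <stdbool.h>

/* Returns true if, for all bool a and b:
     a == b  is  !(a ^ b)
     a <  b  is  !a & b
     a <= b  is  !a | b  */
bool linearization_identities_hold(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            bool x = a, y = b;
            if ((x == y) != !(x ^ y))
                return false;
            if ((x < y) != (!x & y))
                return false;
            if ((x <= y) != (!x | y))
                return false;
        }
    }
    return true;
}
```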

[Bug target/107748] [13 Regression] Isn't _mm_cvtsbh_ss incorrect?

2022-11-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107748

--- Comment #11 from Hongtao.liu  ---
Fixed in GCC13.

[Bug target/107748] [13 Regression] Isn't _mm_cvtsbh_ss incorrect?

2022-11-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107748

--- Comment #10 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:a1ecc5600464f6a62faab246d522b6328badda90

commit r13-4314-ga1ecc5600464f6a62faab246d522b6328badda90
Author: liuhongt 
Date:   Wed Nov 23 21:58:09 2022 +0800

Fix incorrect _mm_cvtsbh_ss.

After supporting real __bf16, the implementation of _mm_cvtsbh_ss went
wrong.

The patch add a builtin to generate pslld for the intrinsic, also
extendbfsf2 is supported with pslld when !HONOR_NANS (BFmode).

truncsfbf2 is supported with vcvtneps2bf16 when
!HONOR_NANS (BFmode) && flag_unsafe_math_optimizations.

gcc/ChangeLog:

PR target/107748
* config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Refined.
* config/i386/i386-builtin-types.def (FLOAT_FTYPE_BFLOAT16):
New function type.
* config/i386/i386-builtin.def (BDESC): New builtin.
* config/i386/i386-expand.cc (ix86_expand_args_builtin):
Handle the builtin.
* config/i386/i386.md (extendbfsf2): New expander.
(extendbfsf2_1): New define_insn.
(truncsfbf2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Scan pslld.
* gcc.target/i386/extendbfsf.c: New test.
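
For context on why a left shift (pslld) can implement the extension: a bfloat16 value has the same layout as the upper 16 bits of an IEEE-754 binary32 value, so widening amounts to placing those 16 bits in the high half of a 32-bit word. The scalar sketch below assumes float is binary32; bf16_bits_to_float is a hypothetical helper, not the intrinsic itself, and it ignores the NaN handling that the commit conditions on !HONOR_NANS.

```c
#include <stdint.h>
#include <string.h>

/* Widen raw bfloat16 bits to float by shifting them into the upper
   half of a 32-bit word (assumes float is IEEE-754 binary32). */
float bf16_bits_to_float(uint16_t bits)
{
    uint32_t word = (uint32_t)bits << 16;
    float result;

    /* memcpy avoids aliasing problems when reinterpreting the bits. */
    memcpy(&result, &word, sizeof result);
    return result;
}
```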

[Bug c/107890] UB on integer overflow impacts code flow

2022-11-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107890

--- Comment #2 from Jonathan Wakely  ---
You should read https://blog.regehr.org/archives/213

[Bug c/107890] UB on integer overflow impacts code flow

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107890

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
>I was under the impression that this kind of undefined behavior essentially 
>meant that the value of that integer could become unreliable.

Your impression is incorrect. Once undefined behavior happens, anything can
happen. 

This is why things like -fsanitize=undefined are there now.

[Bug c/107890] New: UB on integer overflow impacts code flow

2022-11-27 Thread gcc at pkh dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107890

Bug ID: 107890
   Summary: UB on integer overflow impacts code flow
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at pkh dot me
  Target Milestone: ---

The following code is sensitive to signed integer overflow. I was under
the impression that this kind of undefined behavior essentially meant that the
value of that integer could become unreliable. But apparently this is not
limited to the value of said integer; it can also dramatically impact the code
flow.

Here is the pathological code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

uint8_t tab[0x1ff + 1];

uint8_t f(int32_t x)
{
    if (x < 0)
        return 0;
    int32_t i = x * 0x1ff / 0x;
    if (i >= 0 && i < sizeof(tab)) {
        printf("tab[%d] looks safe because %d is between [0;%d[\n", i, i,
               (int)sizeof(tab));
        return tab[i];
    }

    return 0;
}

int main(int ac, char **av)
{
    return f(atoi(av[1]));
}

Triggering an overflow actually enters the printf/dereference scope, violating
the protective condition and thus causing a crash:

% cc -Wall -O2 overflow.c -o overflow && ./overflow 5000
tab[62183] looks safe because 62183 is between [0;512[
zsh: segmentation fault (core dumped)  ./overflow 5000

I feel extremely uncomfortable about an integer overflow actually impacting
something other than the integer itself. Is it expected or is this a bug?
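
One way to keep such a bounds check meaningful is to avoid the overflow in the first place, for instance by doing the multiplication in a wider type. The sketch below is an assumption-based variant, not something proposed in the report: scaled_index is a hypothetical name, and the divisor is passed as a parameter because the constant is truncated in the listing above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute x * 0x1ff / divisor in 64-bit arithmetic, which cannot
   overflow for 32-bit x, and store the result in *index only if it
   lies in [0, limit).  Returns whether the index is usable. */
bool scaled_index(int32_t x, int32_t divisor, size_t limit, size_t *index)
{
    if (x < 0 || divisor <= 0)
        return false;

    int64_t wide = (int64_t)x * 0x1ff / divisor;

    if (wide < 0 || (uint64_t)wide >= limit)
        return false;

    *index = (size_t)wide;
    return true;
}
```

Because no undefined behavior occurs on this path, the compiler cannot use it to discard the range check. Building with -fsanitize=undefined, as mentioned in the replies, is a complementary way to catch the original overflow at run time.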

[Bug analyzer/107807] gcc.dg/analyzer/errno-1.c FAILs

2022-11-27 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107807

--- Comment #7 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #6 from Rainer Orth  ---
> It did in last night's Solaris bootstraps (sparc and x86).  macOS bootstraps
> are
> super-slow, so I'll wait for tomorrow night's weekly bootstraps there and
> report
> back when they are finished.

The Mac OS X 10.7 bootstraps have finished now and, as expected, the failures
are gone.

[Bug fortran/107874] merge not using all its arguments

2022-11-27 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107874

--- Comment #5 from Steve Kargl  ---
On Sun, Nov 27, 2022 at 08:00:35PM +, anlauf at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107874
> 
> --- Comment #3 from anlauf at gcc dot gnu.org ---
> (In reply to kargl from comment #2)
> > Harald, you are likely right the patch can be moved down.  I'll programmed
> > up the example from the Fortran 2018 standard, which works as expected.  So,
> > there is definitely something about a scalar mask choosing the actual
> > argument before both are evaluated.
> > 
> >program foo
> 
> Steve,
> 
> this example from the standard seems to be working down to 7.5 for me.
> Am I missing something?  Do we need this in the testsuite?

You are not missing anything.  I wanted an example that works
with or without the patch John included, so that we don't
accidentally introduce a regression.

> I'd say it's rather the following two lines replacing the loop in the
> reproducer in comment#0:
> 
>   print *, merge(tstuff(),fstuff(),.true.)
>   print *, merge(tstuff(),fstuff(),.false.)
> 
> This is mis-simplified in simplify.cc:4909

Good find!  This may indeed be a source of the issue.

[Bug libstdc++/107815] 20_util/to_chars/float128_c++23.cc FAILs

2022-11-27 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107815

--- Comment #16 from dave.anglin at bell dot net ---
This is what the test prints:
6.47518e-4966 6e-4966
xxx.cc:79: void test(std::chars_format): Assertion 'ec4 == std::errc() && ptr4
== ptr1' failed.
ABORT instruction (core dumped)

[Bug fortran/107874] merge not using all its arguments

2022-11-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107874

--- Comment #4 from anlauf at gcc dot gnu.org ---
The following patch fixes comment#3:

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index 9c2fea8c5f2..2f69c4369ab 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -4913,6 +4914,11 @@ gfc_simplify_merge (gfc_expr *tsource, gfc_expr *fsource, gfc_expr *mask)

   if (mask->expr_type == EXPR_CONSTANT)
 {
+  /* The standard requires evaluation of all function arguments.
+Simplify only when TSOURCE, FSOURCE are constant expressions.  */
+  if (!gfc_is_constant_expr (tsource) || !gfc_is_constant_expr (fsource))
+   return NULL;
+
   result = gfc_copy_expr (mask->value.logical ? tsource : fsource);
   /* Parenthesis is needed to get lower bounds of 1.  */
   result = gfc_get_parentheses (result);

This leads to a "regression" for gfortran.dg/merge_init_expr_2.f90,
which is due to the pattern matching the old, faulty simplification result.
That's trivial to fix, though.
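
The effect of the old simplification can be illustrated with a C analogue. This is only a model: tstuff and fstuff mirror the names from the reproducer, the call counters stand in for their side effects, and the remaining names are hypothetical. Calling a function with both arguments evaluates both, whereas folding the call down to the selected argument silently drops the other one, which is what the patch comment describes as conflicting with the standard.

```c
#include <stdbool.h>

/* Call counters stand in for the side effects of the two functions. */
int t_calls, f_calls;

int tstuff(void) { t_calls++; return 1; }
int fstuff(void) { f_calls++; return 2; }

/* Model of MERGE as an ordinary function: both arguments are
   evaluated before the selection happens. */
static int merge_int(int tsource, int fsource, bool mask)
{
    return mask ? tsource : fsource;
}

/* Behaviour after the patch: both functions are called. */
int merge_call(bool mask)
{
    return merge_int(tstuff(), fstuff(), mask);
}

/* Behaviour of the faulty simplification: only one is called. */
int merge_shortcut(bool mask)
{
    return mask ? tstuff() : fstuff();
}
```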

[Bug libstdc++/107815] 20_util/to_chars/float128_c++23.cc FAILs

2022-11-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107815

--- Comment #15 from Jakub Jelinek  ---
(In reply to dave.anglin from comment #14)
> /home/dave/gnu/gcc/gcc/libstdc++-v3/testsuite/20_util/to_chars/
> float128_c++23.cc
> :77: void test(std::chars_format): Assertion 'ec4 == std::errc() && ptr4 ==
> ptr1
> ' failed.
> FAIL: 20_util/to_chars/float128_c++23.cc execution test

Can you provide more info?
E.g. try to run the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107815#c5
program and attach here what it prints, uncomment the
//std::cout << u << ' ' << std::string_view (str1, ptr1) << '\n';
line at least to see which test it is (if also the max() or some other one)?
Thanks.

[Bug libstdc++/107815] 20_util/to_chars/float128_c++23.cc FAILs

2022-11-27 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107815

--- Comment #14 from dave.anglin at bell dot net ---
/home/dave/gnu/gcc/gcc/libstdc++-v3/testsuite/20_util/to_chars/float128_c++23.cc
:77: void test(std::chars_format): Assertion 'ec4 == std::errc() && ptr4 ==
ptr1
' failed.
FAIL: 20_util/to_chars/float128_c++23.cc execution test

[Bug fortran/107819] ICE in gfc_check_argument_var_dependency, at fortran/dependency.cc:978

2022-11-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107819

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |anlauf at gcc dot gnu.org

--- Comment #13 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2022-November/058556.html

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Last reconfirmed||2022-11-27

--- Comment #7 from Jonathan Wakely  ---
(In reply to Jamaika from comment #0)
> https://github.com/meganz/mingw-std-threads/issues/67

Please read https://gcc.gnu.org/bugs/ and provide the requested info, not just
a URL.

[Bug fortran/107874] merge not using all its arguments

2022-11-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107874

--- Comment #3 from anlauf at gcc dot gnu.org ---
(In reply to kargl from comment #2)
> Harald, you are likely right the patch can be moved down.  I'll programmed
> up the example from the Fortran 2018 standard, which works as expected.  So,
> there is definitely something about a scalar mask choosing the actual
> argument before both are evaluated.
> 
>program foo

Steve,

this example from the standard seems to be working down to 7.5 for me.
Am I missing something?  Do we need this in the testsuite?

I'd say it's rather the following two lines replacing the loop in the
reproducer in comment#0:

  print *, merge(tstuff(),fstuff(),.true.)
  print *, merge(tstuff(),fstuff(),.false.)

This is mis-simplified in simplify.cc:4909

gfc_expr *
gfc_simplify_merge (gfc_expr *tsource, gfc_expr *fsource, gfc_expr *mask)
{
  gfc_expr * result;
  gfc_constructor *tsource_ctor, *fsource_ctor, *mask_ctor;

  if (mask->expr_type == EXPR_CONSTANT)
    {
      result = gfc_copy_expr (mask->value.logical ? tsource : fsource);
      /* Parenthesis is needed to get lower bounds of 1.  */
      result = gfc_get_parentheses (result);
      gfc_simplify_expr (result, 1);
      return result;
    }

So unless tsource and fsource are both constant, we have to give up here.
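
To illustrate the point outside the compiler, here is a minimal C++ sketch (not gfortran code). `tstuff` and `fstuff` mirror the names in the reproducer; `merge_call`, `merge_folded_true` and the call counters are hypothetical. It contrasts a call that evaluates both arguments with the folded form that evaluates only the selected one, which is roughly what the early simplification does to MERGE when the mask is constant.

```cpp
#include <cassert>

// Call counters standing in for the side effects of the reproducer's functions.
int tcalls = 0;
int fcalls = 0;

int tstuff() { ++tcalls; return 1; }
int fstuff() { ++fcalls; return 2; }

// Ordinary function call: both actual arguments are evaluated before entry.
int merge_call(int tsource, int fsource, bool mask)
{
  return mask ? tsource : fsource;
}

// What folding on a constant .true. mask amounts to: fstuff() is never called.
int merge_folded_true() { return tstuff(); }

// Returns true if both forms yield the same value but differ in side effects.
bool folding_drops_a_call()
{
  tcalls = fcalls = 0;
  int full = merge_call(tstuff(), fstuff(), true);
  bool both_called = (tcalls == 1 && fcalls == 1);

  tcalls = fcalls = 0;
  int folded = merge_folded_true();
  bool only_t_called = (tcalls == 1 && fcalls == 0);

  return full == folded && both_called && only_t_called;
}
```

Calling `assert(folding_drops_a_call());` passes: the values agree, but the folded form loses the call to `fstuff`.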

[Bug c++/107889] New: Incorrect parsing of qualified friend function returning decltype(auto)

2022-11-27 Thread gcc at nospam dot scs.stanford.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107889

Bug ID: 107889
   Summary: Incorrect parsing of qualified friend function
returning decltype(auto)
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at nospam dot scs.stanford.edu
  Target Milestone: ---

G++ 12.2.0 rejects a valid friend declaration for a fully-qualified function
returning `decltype(auto)`.  To reproduce the problem, you can try to compile
the following code with `g++ -std=c++20 -c bug.cc`:

decltype(auto)
f()
{
}

struct S {
  friend decltype(auto) ::f();
};


This results in the following error:

$ c++ -std=c++20 -c bug.cc
bug.cc:7:27: error: 'decltype(auto)' is not a class type
7 |   friend decltype(auto) ::f();
  |   ^
bug.cc:7:27: error: 'decltype(auto)' is not a class type
bug.cc:7:27: error: 'decltype(auto)' is not a class type
bug.cc:7:29: error: 'decltype(auto)' is not a class type
7 |   friend decltype(auto) ::f();
  | ^
bug.cc:7:10: error: ISO C++ forbids declaration of 'f' with no type
[-fpermissive]
7 |   friend decltype(auto) ::f();
  |  ^~
bug.cc:7:29: error: invalid use of 'decltype(auto)'
7 |   friend decltype(auto) ::f();


A similar problem was reported in bug #59766 for friend functions returning
auto.  It seems to have been mostly fixed, but the combination of
decltype(auto) and the function name being qualified (::f) is still a problem.

[Bug tree-optimization/107888] New: [12/13 Regression] Missed min/max transformation in phiopt due to VRP

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107888

Bug ID: 107888
   Summary: [12/13 Regression] Missed min/max transformation in
phiopt due to VRP
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
#define bool _Bool
int maxbool(bool ab, bool bb)
{
  int a = ab;
  int b = bb;
  int c;
  if (a > b)
    c = a;
  else
    c = b;
  return c;
}
```

We miss that c is max of a and b because VRP decides to change the phi.
We get out of VRP:
```
  if (a_3 > b_5)
goto ; [INV]
  else
goto ; [INV]

   :

   :
  # c_1 = PHI <1(2), b_5(3)>
```

What VRP is doing is correct; it is just harder to optimize to a max (and then
to a `|`).

In the above case we could optimize `bool0 ? 1 : bool1` to `bool0 | bool1`. But
then we end up with PR 107887 too.
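
As a sanity check of the equivalences mentioned above, here is a small C++ sketch that walks all four boolean input combinations. The function names are hypothetical and only label the three forms: the source-level max, the form with the constant 1 that VRP leaves in the PHI, and the `|` form.

```cpp
#include <cassert>

// Source-level form from the testcase: max of two boolean-valued ints.
int max_form(bool ab, bool bb)
{
  int a = ab, b = bb;
  return a > b ? a : b;
}

// Form after `a` is replaced by 1 on the taken edge (a > b implies a == 1).
int vrp_form(bool ab, bool bb)
{
  int a = ab, b = bb;
  return a > b ? 1 : b;
}

// The `bool0 ? 1 : bool1` form and the proposed `bool0 | bool1` result.
int select_form(bool ab, bool bb) { return ab ? 1 : bb; }
int or_form(bool ab, bool bb) { return ab | bb; }

// Returns true if all four forms agree on every input combination.
bool forms_agree()
{
  for (int a = 0; a <= 1; ++a)
    for (int b = 0; b <= 1; ++b)
      {
        int expected = or_form(a, b);
        if (max_form(a, b) != expected
            || vrp_form(a, b) != expected
            || select_form(a, b) != expected)
          return false;
      }
  return true;
}
```

`assert(forms_agree());` holds, so for boolean inputs the max, the select with a constant 1, and the bitwise or are interchangeable.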

You can also hit the same issue when the only overlap between the ranges of
the two arguments is known to be [5,6]:
```
int max(int ab, int bb)
{
  if (ab < 5)  __builtin_trap();
  if (bb > 6)  __builtin_trap();
  int a = ab;
  int b = bb;
  int c;
  if (a >= b)
    c = a;
  else
    c = b;
  return c;
}
```
This one we can no longer optimize based on zero/one. (Note that this version
of max has been an issue since at least GCC 4.1; I suspect since VRP was added.)

[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/107887] New: (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887

Bug ID: 107887
   Summary: (bool0 > bool1) | bool1 is not optimized to bool0 |
bool1
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
_Bool max(_Bool aa, _Bool bb)
{
  _Bool t = aa > bb;
  return t | bb;
}
```
This should be optimized to just `return aa | bb;`
I accidentally found this while working on PR 101805.
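
The identity follows because, for booleans, `aa > bb` is the same as `aa & !bb`, and `(aa & !bb) | bb` reduces to `aa | bb`. A minimal C++ sketch that checks it exhaustively (function names are hypothetical):

```cpp
#include <cassert>

// The form from the testcase: (aa > bb) | bb.
bool original_form(bool aa, bool bb)
{
  bool t = aa > bb;
  return t | bb;
}

// The form it should be optimized to.
bool simplified_form(bool aa, bool bb) { return aa | bb; }

// Returns true if both forms agree on all four input combinations.
bool equivalent_for_all_inputs()
{
  for (int a = 0; a <= 1; ++a)
    for (int b = 0; b <= 1; ++b)
      if (original_form(a, b) != simplified_form(a, b))
        return false;
  return true;
}
```

`assert(equivalent_for_all_inputs());` passes for all four cases.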

The original testcase in which I found it:
```
int ii(_Bool aa, _Bool bb)
{
  int c;
  int a = aa;
  int b = bb;
  if (a > b)
    c = a;
  else
    c = b;
  if (c)
    return 100;
  return c;
}
```

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #6 from Jamaika  ---
I don't understand something. Why does _GLIBCXX_HAS_GTHREADS work for
std::jthread but not for std::latch?
```
#if defined _GLIBCXX_HAS_GTHREADS || defined _GLIBCXX_HAVE_LINUX_FUTEX
# define __cpp_lib_atomic_wait 201907L
# if __cpp_aligned_new
# define __cpp_lib_barrier 201907L
# endif
#endif
```
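
One way to see which of these macros a given toolchain defines is to probe them directly. The sketch below is only an illustration: the helper function names are hypothetical, and it assumes the standard `<version>` header is where the library publishes its `__cpp_lib_*` macros. Build it with `-std=c++20` to match the report.

```cpp
#include <cassert>

// <version> is the C++20 header that defines the __cpp_lib_* macros;
// include it only if the toolchain provides it.
#ifdef __has_include
# if __has_include(<version>)
#  include <version>
# endif
#endif

// Each helper reports whether the corresponding feature-test macro is defined.
bool has_atomic_wait()
{
#ifdef __cpp_lib_atomic_wait
  return true;
#else
  return false;
#endif
}

bool has_barrier()
{
#ifdef __cpp_lib_barrier
  return true;
#else
  return false;
#endif
}

bool has_latch()
{
#ifdef __cpp_lib_latch
  return true;
#else
  return false;
#endif
}

// Number of the three probed macros that are defined (0 to 3).
int defined_macro_count()
{
  return has_atomic_wait() + has_barrier() + has_latch();
}
```

Printing the three results on the failing mingw build and on a Linux build would show whether the macros themselves are missing or whether the problem lies elsewhere.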

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #5 from Jamaika  ---
I tested GCC 13.0.0. No change.

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #4 from Jamaika  ---
I tested GCC 13.0.0. No change.

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #3 from Jamaika  ---
(In reply to Andrew Pinski from comment #2)
> Also it might be the case mingw work is needed to support
> __cpp_lib_atomic_wait and all.

I tested GCC 13.0.0. No change.
http://msystem.waw.pl/x265/mingw-gcc1300-20221124.7z

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #2 from Andrew Pinski  ---
Also it might be the case mingw work is needed to support __cpp_lib_atomic_wait
and all.

[Bug libstdc++/107886] Problem witch std::latch, std::binary_semaphores in C++20

2022-11-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

--- Comment #1 from Andrew Pinski  ---
Have you tried GCC 12? C++20 support was barely there in GCC 11.
For example, r12-10-gb52aef3a8cbcc8 improved latch support.

[Bug c++/107886] New: Problem witch std::latch, std::binary_semaphores in C++2a

2022-11-27 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107886

Bug ID: 107886
   Summary: Problem witch std::latch, std::binary_semaphores in
C++2a
   Product: gcc
   Version: 11.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lukaszcz18 at wp dot pl
  Target Milestone: ---

https://github.com/meganz/mingw-std-threads/issues/67

[Bug c++/99576] [coroutines] destructor of a temporary called too early within co_await expression

2022-11-27 Thread adrian.perl at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576

--- Comment #11 from Adrian Perl  ---
Yeah, my mistake. My IDE failed to look up the function and a short search on
the internet revealed only builtin_trap
(https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html)

You just saved me hours with the -j hint! I assumed it was not applicable as it
is not used in the guide (https://gcc.gnu.org/contribute.html).

Thanks

[Bug c++/99576] [coroutines] destructor of a temporary called too early within co_await expression

2022-11-27 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576

--- Comment #10 from Iain Sandoe  ---
(In reply to Adrian Perl from comment #9)
> Thanks for the advice.
> 
> I hope you meant __builtin_trap() as I can't find a __builtin_abort()
> function.

hmm .. I meant __builtin_abort () ... it is widely used in the testsuite for
the reasons mentioned (try grepping for it in gcc/testsuite/gcc.dg to see some
examples).
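
For reference, the usual shape of such a check is sketched below. `add` is a hypothetical function under test, not something from the patch; `__builtin_abort` is a GCC built-in, so no header is needed.

```cpp
// Hypothetical function under test.
int add(int a, int b) { return a + b; }

// Testsuite-style check: abort on a wrong result, return 0 on success.
int run_check()
{
  if (add(2, 3) != 5)
    __builtin_abort ();
  return 0;
}
```

In an execution test the body of `run_check` would typically sit in `main`, so that a wrong result terminates the program abnormally.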

> I have now written test applications for all relevant bug reports (99576,
> 100611, 101976, 101367). 

great!

> I also verified that it fixes 107288, but did not add a test as it requires 
> boost asio.

The way to deal with cases like that is to take the .ii file (so that the
dependencies on external headers are removed) and then reduce it to something
usable as a test.  Such reductions vary in difficulty (using tools like c-vise
or creduce can help, sometimes it's possible to do it manually too).  [I'm not
asking you to do this right now, but mentioning that this is the approach used
in such cases].

> Unfortunately I was wrong that the patch will fix 102217 and 101244. They
> use similar examples but also the ternary operator, which still leads to an
> invalid statement error when used in co_awaits.

Yes, this is a different problem for which I have some work in progress, but
not ready for publication just yet.

> I will send the patch together with the testfiles as soon as the testsuite
> has finished. Is it normal that it takes more than 6 hours to complete?

depends on your hardware .. my fastest box takes about 2 hours, my slowest
nearly a week :) .. so long as you are using "-jN" on the make line where N ≈
the number of threads your hardware will accommodate, that's about the best you
can do.

If you plan on working more with GCC there is also the option to get an account
on the "compile farm" which gives you access to more platform versions and some
quite powerful hardware.

[Bug c++/99576] [coroutines] destructor of a temporary called too early within co_await expression

2022-11-27 Thread adrian.perl at web dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576

--- Comment #9 from Adrian Perl  ---
Thanks for the advice.

I hope you meant __builtin_trap() as I can't find a __builtin_abort() function.

I have now written test applications for all relevant bug reports (99576,
100611, 101976, 101367). I also verified that it fixes 107288, but did not add
a test as it requires boost asio.

Unfortunately I was wrong that the patch will fix 102217 and 101244. They use
similar examples but also the ternary operator, which still leads to an invalid
statement error when used in co_awaits.

I will send the patch together with the testfiles as soon as the testsuite has
finished. Is it normal that it takes more than 6 hours to complete?

[Bug demangler/107884] H8/300: cp-demangle.c fix warning related demangle.h

2022-11-27 Thread uaa at mx5 dot nisiq.net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107884

--- Comment #2 from SASANO Takayoshi  ---
Hello, can you tell me in more detail what I should do?
I think the "better way" would be one of the following.

1) Change "#if __INT_WIDTH__ > 16 ~ #else ~ #endif" to
   "#if defined(__INT_WIDTH__) && (__INT_WIDTH__ <= 16) ~ #else ~ #endif"
   as a safer choice.

2) Remove "#define DMGL_OPT_BIT(x)" so that all "#define DMGL_..." use (1 << x).

3) Abandon remapping bit positions for architectures with 16-bit int, and
   modify the code so that it can pass a 32-bit option value.

4) Something else (I have no idea...)

Please tell me.
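
The difference between the two guards in option 1 shows up when the macro is not predefined at all: an undefined identifier in `#if` evaluates to 0. The sketch below demonstrates this with `HYPOTHETICAL_INT_WIDTH`, a made-up macro that is deliberately left undefined as a stand-in for `__INT_WIDTH__`; the other names are also hypothetical and are not taken from demangle.h.

```cpp
#include <cassert>

// HYPOTHETICAL_INT_WIDTH is intentionally NOT defined anywhere. It stands in
// for __INT_WIDTH__ on a compiler that does not predefine that macro.

// Original style: the undefined macro evaluates to 0, "0 > 16" is false,
// so the narrow (16-bit) branch is selected by accident.
#if HYPOTHETICAL_INT_WIDTH > 16
# define ORIGINAL_PICKS_WIDE 1
#else
# define ORIGINAL_PICKS_WIDE 0
#endif

// Option 1 style: the narrow branch is taken only when the macro is defined
// and small; an undefined macro falls through to the wide default.
#if defined(HYPOTHETICAL_INT_WIDTH) && (HYPOTHETICAL_INT_WIDTH <= 16)
# define GUARDED_PICKS_WIDE 0
#else
# define GUARDED_PICKS_WIDE 1
#endif

bool original_picks_wide() { return ORIGINAL_PICKS_WIDE != 0; }
bool guarded_picks_wide() { return GUARDED_PICKS_WIDE != 0; }
```

With the macro undefined, `original_picks_wide()` is false and `guarded_picks_wide()` is true, which is why the guarded form is the safer default for hosts that do not define `__INT_WIDTH__`.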