date:20240313

[Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a))

2024-03-13 Thread xxs_chy at outlook dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331

Bug ID: 114331
   Summary: Missed optimization: indicate knownbits from
dominating condition switch(trunc(a))
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xxs_chy at outlook dot com
  Target Milestone: ---

Godbolt link: https://godbolt.org/z/dso53ndTo
For code like:

void dummy(void);

int src(int num) {
  switch ((short)num) {
  case 111:
    return num & 0xfffe;
  case 267:
  case 204:
  case 263:
    return 0;
  default:
    dummy();
    return 0;
  }
}

"num & 0xfffe" can be folded to "110". But both LLVM and GCC fail to fold it.

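A minimal sketch of why the fold is valid, assuming the usual wrapping int-to-short conversion (implementation-defined in C, but what GCC does). The names tgt and check_fold and the stub body for dummy are illustrative assumptions and not part of the report: if (short)num == 111, the low 16 bits of num are 0x006f, so masking with 0xfffe always yields 110.

```c
#include <assert.h>

/* Stand-in for the external call in the report (assumed body). */
static int dummy_calls;
static void dummy(void) { dummy_calls++; }

/* The function from the report. */
int src(int num) {
  switch ((short)num) {
  case 111:
    return num & 0xfffe;
  case 267:
  case 204:
  case 263:
    return 0;
  default:
    dummy();
    return 0;
  }
}

/* What the report expects the optimizer to produce (hypothetical name). */
int tgt(int num) {
  switch ((short)num) {
  case 111:
    return 110;   /* low 16 bits are known to be 111, so 111 & 0xfffe */
  case 267:
  case 204:
  case 263:
    return 0;
  default:
    dummy();
    return 0;
  }
}

/* Compare both versions over every low-16-bit pattern and a few
   different high halves. */
void check_fold(void) {
  for (int hi = -4; hi <= 4; hi++)
    for (int lo = 0; lo < 0x10000; lo++) {
      int num = hi * 0x10000 + lo;
      assert(src(num) == tgt(num));
    }
}
```

Calling check_fold() should complete without tripping an assertion; the high bits of num never reach the result because the mask 0xfffe discards them.
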
[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()

2024-03-13 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327

--- Comment #3 from Zdenek Sojka  ---
It's not only % 1; the results are also wrong for:
  B x = foo (3, -0x9e9b9fe60);

or for

B
foo (char c, B b)
{
  return b / c;
}


  B x = foo (-0x6, 0); /* 0 / -6 = 0 */

in all these cases, the result is the same: -1 << 64.
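
To make the reported value concrete, here is a small sketch of what -1 << 64 looks like when a 256-bit two's-complement integer is viewed as four 64-bit limbs. The least-significant-first limb order is an assumption that matches x86_64; the helper name minus_one_shl is hypothetical and this is not libgcc code.

```c
#include <assert.h>
#include <stdint.h>

#define LIMBS 4   /* 256 bits as four 64-bit limbs, least significant first */

/* Fill out[] with the two's-complement limbs of (-1 << shift),
   for 0 <= shift < 256.  */
void minus_one_shl(uint64_t out[LIMBS], unsigned shift)
{
  assert(shift < 64u * LIMBS);
  for (unsigned i = 0; i < LIMBS; i++) {
    unsigned lo = 64u * i;              /* index of this limb's lowest bit */
    if (shift >= lo + 64)
      out[i] = 0;                       /* limb lies entirely below the shift */
    else if (shift <= lo)
      out[i] = UINT64_MAX;              /* limb lies entirely at or above it */
    else
      out[i] = UINT64_MAX << (shift - lo);
  }
}
```

For shift == 64 the lowest limb is zero and the three upper limbs are all ones, which is what a zero result looks like after a spurious sign extension of the upper limbs.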

[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-13 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #18 from Jeffrey A. Law  ---
I don't have an opinion on the Fortran patch -- I think it's up to the Fortran
front-end maintainers to make that decision.

Given there's still a regression here, I'll put the marker back.

[Bug driver/114330] needs_preprocessing field of struct compiler is unused

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-03-14
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #4 from Andrew Pinski  ---
(In reply to Sam James from comment #2)
> git log -G needs_preprocessing -p indicates r0-102965-gc3224d6f70eefb

Oh yes, that is when -combine support was removed. The field was added when
-combine support was added in r0-57561-g0855eab7a30bb9. The combinable field
was added at the same time, but combinable was used afterwards for Go (and D
and a few others).

So I will handle this for GCC 15. I thought it was added much earlier.

[Bug driver/114330] needs_preprocessing field of struct compiler is unused

2024-03-13 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330

--- Comment #3 from Sam James  ---
(I think it was dead before, but it should've been removed by then)

[Bug driver/114330] needs_preprocessing field of struct compiler is unused

2024-03-13 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #2 from Sam James  ---
git log -G needs_preprocessing -p indicates r0-102965-gc3224d6f70eefb

[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327

--- Comment #2 from Andrew Pinski  ---
For (ignore the strict aliasing issue):
```
typedef signed _BitInt(256) B;

[[gnu::noinline]]
B
foo (signed char c, B b)
{
  return b % c;
}

int
main (void)
{
  B x = foo (1, -3); // -3 % 1 -> 0
  // if (x)
  //   __builtin_abort();
  signed long *t = (signed long *)&x;
  for (int i = 0; i < sizeof(B)/sizeof(long); i++)
    {
      __builtin_printf("%lx\n", t[i]);
    }
  return 0;
}
```
We get:
```
0



```

Which makes it seem like we are doing the sign extension even though the
result was 0.

Even:
> B x = foo (3, -3); // -3 % 3 -> 0

Gives a similarly wrong result.

[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327

Andrew Pinski  changed:

   What|Removed |Added

Summary|wrong code with _BitInt() modulo at -O0 |`-CST % 1` is wrong for _BitInt()
 Status|UNCONFIRMED |NEW
 CC||pinskia at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-03-14

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread juzhe.zhong at rivai dot ai via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #28 from JuzheZhong  ---
The original cost model I wrote worked for all cases, but after some
middle-end changes the cost model fails.

I don't have time to figure out what's going on here.

Robin may be interested in it.

[Bug driver/114330] needs_preprocessing field of struct compiler is unused

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330

--- Comment #1 from Andrew Pinski  ---
[apinski@xeond2 gcc]$ git grep needs_preprocessing
gcc.cc:  int needs_preprocessing;   /* If nonzero, source files need to
lto/lang-specs.h:   /*cpp_spec=*/NULL, /*combinable=*/1,
/*needs_preprocessing=*/0},

Those are the only references that grep could find for needs_preprocessing.

[Bug driver/114330] New: needs_preprocessing field of struct compiler is unused

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330

Bug ID: 114330
   Summary: needs_preprocessing field of struct compiler is unused
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: internal-improvement
  Severity: enhancement
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

I suspect the needs_preprocessing field became unused when the C preprocessor
became integrated into cc1.

While it does not hurt anything to have the field still around, and only the
".c" entry sets it to true, it seems like a decent idea to remove the field.

Also note it might be useful to convert combinable in struct compiler in
gcc.cc to bool too.

[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-03-13
 Status|UNCONFIRMED |NEW

--- Comment #3 from Andrew Pinski  ---
(In reply to Pali Rohár from comment #2)
> Andrew, I do not know what is gcc driver nor what to do for it. But if you
> can show me some pointers, I can try it.
> 
> Or if you need more details about files, usage, etc... please let me know.

See gcc.cc (default_compilers).
It contains a mapping from suffix to language, and for each language how to
"compile/assemble" the files.

See also the */lang-specs.h files, which are included via specs.h (specs.h is
generated during the build, see the makefile there, and depends on which
languages are enabled).

Most likely you would add a new target macro which adds to that part of
gcc.cc and define that macro in the mingw headers.
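
As a rough illustration of the suffix-to-language idea described above, here is a simplified sketch. It is not the actual default_compilers table from gcc.cc: the struct suffix_rule type, the language_for function and the "windows-resource" language name are hypothetical, and the real table carries spec strings rather than plain language names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the driver's suffix table. */
struct suffix_rule {
  const char *suffix;    /* file name ending, e.g. ".c" */
  const char *language;  /* language the driver would select */
};

static const struct suffix_rule rules[] = {
  { ".c",  "c" },
  { ".cc", "c++" },
  { ".s",  "assembler" },
  { ".rc", "windows-resource" },  /* hypothetical entry a target macro might add */
};

/* Return the language for FILENAME based on its suffix, or NULL if no
   rule matches.  */
const char *language_for(const char *filename)
{
  assert(filename != NULL);
  size_t flen = strlen(filename);
  for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
    size_t slen = strlen(rules[i].suffix);
    if (flen >= slen && strcmp(filename + flen - slen, rules[i].suffix) == 0)
      return rules[i].language;
  }
  return NULL;
}
```

With a table like this, supporting a new input kind amounts to adding one more entry, which is the role the suggested target macro would play for mingw.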

[Bug target/109317] -Os generates bigger code than -O2 on 32-bit ARM

2024-03-13 Thread pali at kernel dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109317

--- Comment #3 from Pali Rohár  ---
Do you need some more input or test data about this issue?

[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc

2024-03-13 Thread pali at kernel dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866

--- Comment #2 from Pali Rohár  ---
Andrew, I do not know what is gcc driver nor what to do for it. But if you can
show me some pointers, I can try it.

Or if you need more details about files, usage, etc... please let me know.

[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326

--- Comment #3 from Andrew Pinski  ---
(In reply to ptomsich from comment #2)
> To copy the last piece of info from our internal tracker...
> 
> LLVM learned this new trick only in the run-up to LLVM 18.
> Up until then, GCC and LLVM performed identically on this snippet.

Yes it looks like it is pattern matching what I suggested (well with and
without the and).

Note we do need another pattern, one without the bit_and:
(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_xor @0 @1)
   @2))
 (bit_ior @n4 
  (cmp { build_zero_cst (TREE_TYPE (@0)); } @2))
)

And we need one more for bit_ior:
(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_ior
(bit_xor @0 @1)
@2)
   @3))
 (bit_ior @n4 
  (cmp @2 @3))
)

Note it looks like clang does not handle non-constants that well (they do
handle d == 0 though).

E.g.:
```
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  int t = ((d1 ^ d2) & c ) == (d);
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}

```

Should be optimized to:
```
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  int t = 0 == d;
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}
```
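
A small sketch that checks the two forms against each other by brute force over a small value range. The names cmp1_orig, cmp1_opt and check_cmp1 are illustrative only. The reasoning: when d1 != d2 both forms return 1, and when d1 == d2 the xor is 0, so the masked comparison reduces to 0 == d.

```c
#include <assert.h>

/* Original form from the comment. */
int cmp1_orig(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  int t = ((d1 ^ d2) & c) == d;
  int t1 = d1 != d2;
  return t | t1;
}

/* Suggested optimized form. */
int cmp1_opt(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  (void)c;                 /* the mask no longer matters */
  int t = 0 == d;
  int t1 = d1 != d2;
  return t | t1;
}

/* Exhaustively compare both forms for all 3-bit inputs. */
void check_cmp1(void) {
  for (unsigned d1 = 0; d1 < 8; d1++)
    for (unsigned d2 = 0; d2 < 8; d2++)
      for (unsigned c = 0; c < 8; c++)
        for (unsigned d = 0; d < 8; d++)
          assert(cmp1_orig(d1, d2, c, d) == cmp1_opt(d1, d2, c, d));
}
```

This is only a sanity check on a tiny domain, not a proof, but the case split above covers all inputs.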

[Bug target/108849] __declspec(code_seg("segname")) does not work

2024-03-13 Thread pali at kernel dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108849

--- Comment #3 from Pali Rohár  ---
Arsen, so based on my understanding (please correct me if I'm wrong), gcc's
"section" can be used on both code (functions) and data (global variables). And
ms's "code_seg" can be used only on code (functions).

So if gcc adds __declspec(code_seg("segname")) as an alias to
__declspec(section("segname")) for TARGET_DECLSPEC then it should be OK for
valid source code. However, it would not throw a compile error if
__declspec(code_seg("segname")) is specified on data. But I think that is
acceptable. The primary motivation is support for compiling valid source code.

Are you able to add this alias?

[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86

2024-03-13 Thread pali at kernel dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

--- Comment #8 from Pali Rohár  ---
Thanks for the quick response and the fix for this issue.

[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.

2024-03-13 Thread ptomsich at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326

--- Comment #2 from ptomsich at gcc dot gnu.org ---
To copy the last piece of info from our internal tracker...

LLVM learned this new trick only in the run-up to LLVM 18.
Up until then, GCC and LLVM performed identically on this snippet.

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #27 from Patrick O'Neill  ---
(In reply to Andrew Pinski from comment #26)
> (In reply to Edwin Lu from comment #25)
> > It's still persisting on trunk (at least for pr113281-1.c
> > https://godbolt.org/z/M9EK44hKe)
> 
> I looked into what the vectorizer produces:
>   vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
>   _22 = (int) a.4_25;
>   vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872
> } >> vect__22.13_31;
>   _12 = 32872 >> _22;
>   vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
> 
> that is valid thing to do. That is do the shift in `vector(8) int` and then
> do a truncation. The issue originally was about doing the shift in
> `vector(8) short` which is not happening here.

The regressed testcase looks like it's testing whether riscv vectorizes the
code at all (the first issue Juzhe noted in comment #3 and then fixed). So this
is a performance regression for RISC-V, not a correctness one.

[Bug tree-optimization/114329] New: ICE: verify_gimple failed: 'bit_field_ref' of non-mode-precision operand with bitfield _BitInt()

2024-03-13 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114329

Bug ID: 114329
   Summary: ICE: verify_gimple failed: 'bit_field_ref' of
non-mode-precision operand with bitfield _BitInt()
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57690
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57690&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc testcase.c
testcase.c: In function 'foo':
testcase.c:6:1: error: 'bit_field_ref' of non-mode-precision operand
6 | foo(void)
  | ^~~
# .MEM_20 = VDEF <.MEM_19>
BIT_FIELD_REF  = _9;
during GIMPLE pass: bitintlower0
testcase.c:6:1: internal compiler error: verify_gimple failed
0x155f56d verify_gimple_in_cfg(function*, bool, bool)
/repo/gcc-trunk/gcc/tree-cfg.cc:5663
0x13ce234 execute_function_todo
/repo/gcc-trunk/gcc/passes.cc:2088
0x13ce78e execute_todo
/repo/gcc-trunk/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9454-20240313184120-g11caf47b599-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-9454-20240313184120-g11caf47b599-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240313 (experimental) (GCC)

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #26 from Andrew Pinski  ---
(In reply to Edwin Lu from comment #25)
> It's still persisting on trunk (at least for pr113281-1.c
> https://godbolt.org/z/M9EK44hKe)

I looked into what the vectorizer produces:
  vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
  _22 = (int) a.4_25;
  vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 }
>> vect__22.13_31;
  _12 = 32872 >> _22;
  vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;

that is valid thing to do. That is do the shift in `vector(8) int` and then do
a truncation. The issue originally was about doing the shift in `vector(8)
short` which is not happening here.
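
A scalar sketch of the distinction being drawn, with illustrative function names. It relies on two implementation-defined behaviours that GCC documents (wrapping conversion to short, arithmetic right shift of negative values): 32872 does not fit in a signed 16-bit short, so shifting in int and then truncating is not the same as truncating first and shifting the narrow value.

```c
#include <assert.h>

/* Shift in int, then truncate: corresponds to "_12 = 32872 >> _22"
   followed by the conversion to short.  */
short shift_wide(int a)
{
  assert(a >= 0 && a < 31);
  return (short)(32872 >> a);
}

/* Truncate first, then shift the narrow value: the problematic order.
   (short)32872 wraps to a negative value on GCC.  */
short shift_narrow(int a)
{
  assert(a >= 0 && a < 31);
  short k = (short)32872;
  return (short)(k >> a);
}
```

For a == 1 the wide version gives 16436, while the narrow version starts from a negative 16-bit value and gives a different result, which is why the type in which the vector shift is done matters.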

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread ewlu at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Edwin Lu  changed:

   What|Removed |Added

 CC||ewlu at rivosinc dot com

--- Comment #25 from Edwin Lu  ---
(In reply to Richard Sandiford from comment #24)
> Fixed on trunk so far, but it's latent on branches.  I'll see what
> the trunk fallout is like before asking about backports.

It looks like we have a regression for riscv 

I was going through the scan dump failures on trunk and ended up revisiting
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 where
gcc.dg/vect/costmodel/riscv/rvv/pr113281-[125].c are failing the scan-dump
checks. I didn't realize at the time that the scan dumps were checking code
correctness and ended up ignoring it. 

It's still persisting on trunk (at least for pr113281-1.c
https://godbolt.org/z/M9EK44hKe)

A bisection on https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463
commit range suggests
https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1 introduced it.

# first bad commit: [1a8261e047f7a2c2b0afb95716f7615cba718cd1] vect: Tighten
vect_determine_precisions_from_range [PR113281]

Configuration
../configure --prefix=$(pwd) --with-multilib-generator="rv64gcv-lp64d--"
make stamps/build-gcc-linux-stage1 -j 32

Testing
./build-gcc-linux-stage1/gcc/cc1  
../gcc/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c 
-march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow  
-fdiagnostics-plain-output  -march=rv64gcv_zvl256b -mabi=lp64d -O3
-ftree-vectorize -ffat-lto-objects -fno-ident   -o pr113281-1.s

[Bug libstdc++/114325] [14 Regression] std::format gives incorrect results for negative numbers

2024-03-13 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325

--- Comment #2 from Jonathan Wakely  ---
Indeed. Here's the fix:

--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -4124,14 +4124,14 @@ namespace __format
__uval = make_unsigned_t<_Tp>(~__arg) + 1u;
  else
__uval = __arg;
- const auto __n = __detail::__to_chars_len(__uval) + __neg;
- if (auto __res = __sink_out._M_reserve(__n))
+ const auto __n = __detail::__to_chars_len(__uval);
+ if (auto __res = __sink_out._M_reserve(__n + __neg))
{
  auto __ptr = __res.get();
  *__ptr = '-';
  __detail::__to_chars_10_impl(__ptr + (int)__neg, __n,
   __uval);
- __res._M_bump(__n);
+ __res._M_bump(__n + __neg);
  __done = true;
}
}
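
The idea behind the fix can be sketched in plain C (this is an analogue with hypothetical names, not the libstdc++ code): keep the digit count separate from the sign, reserve digits plus sign, write the digits after the optional sign, and advance by digits plus sign.

```c
#include <assert.h>
#include <stddef.h>

/* Number of decimal digits in u (at least 1). */
static size_t digits10(unsigned u)
{
  size_t n = 1;
  while (u >= 10) {
    u /= 10;
    n++;
  }
  return n;
}

/* Write v in decimal into buf (capacity cap, including the NUL).
   Returns the number of characters written, excluding the NUL.  */
size_t format_int(char *buf, size_t cap, int v)
{
  int neg = v < 0;
  unsigned u = neg ? 0u - (unsigned)v : (unsigned)v;  /* magnitude */
  size_t n = digits10(u);          /* digits only, sign NOT included */
  assert(cap >= n + neg + 1);      /* reserve sign + digits + NUL */
  buf[0] = '-';                    /* overwritten by a digit when !neg */
  for (size_t i = n; i-- > 0; u /= 10)
    buf[neg + i] = (char)('0' + u % 10);
  buf[n + neg] = '\0';
  return n + neg;                  /* advance by digits plus sign */
}
```

If n had the sign folded in, as in the lines removed by the patch, the digit writer would be told to produce one digit too many for negative values.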

[Bug target/114328] New: Using -march=armv9-a+nosve does not allow for vectorization

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114328

Bug ID: 114328
   Summary: Using -march=armv9-a+nosve does not allow for
vectorization
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
CC: mjr19 at cam dot ac.uk
Blocks: 53947
  Target Milestone: ---
Target: aarch64

Created attachment 57689
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57689&action=edit
Testcase

I noticed this while looking into PR 114324: the cost model for
-march=armv9-a+nosve causes this code not to be vectorized using ld2/st2 with
the SIMD (non-SVE) registers.

I don't understand why, though, because -march=armv8.4-a still vectorizes it.

Note this is all at -Ofast.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug libstdc++/114325] [14 Regression] std::format gives incorrect results for negative numbers

2024-03-13 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325

Jonathan Wakely  changed:

   What|Removed |Added

  Known to work||13.2.1
   Target Milestone|--- |14.0
   Last reconfirmed||2024-03-13
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
Summary|std::format gives incorrect results for negative numbers |[14 Regression] std::format gives incorrect results for negative numbers
  Known to fail||14.0
   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org

[Bug tree-optimization/114324] [13/14 Regression] AVX2 vectorisation performance regression with gfortran 13/14

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Target Milestone|--- |12.4
   Last reconfirmed||2024-03-13
 Status|UNCONFIRMED |NEW
Summary|AVX2 vectorisation performance regression with gfortran 13/14 |[13/14 Regression] AVX2 vectorisation performance regression with gfortran 13/14
 Blocks||53947
  Component|target  |tree-optimization

--- Comment #1 from Andrew Pinski  ---
Definitely there are some vectorization changes happening.
Confirmed.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug c/53548] allow flexible array members in unions like zero-length arrays

2024-03-13 Thread qinzhao at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53548

qinzhao at gcc dot gnu.org changed:

   What|Removed |Added

 CC||qinzhao at gcc dot gnu.org
 Status|RESOLVED|REOPENED
 Resolution|WONTFIX |---

--- Comment #9 from qinzhao at gcc dot gnu.org ---
I think that we need to add this support as a GCC extension.

[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-13 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

--- Comment #17 from Jerry DeLisle  ---
(In reply to Jeffrey A. Law from comment #16)
> Per c#12, c#13, c#14 & c#15, dropping the regression marker, but leaving
> open.

Interestingly, the remaining part of this bug is also a regression; it just
does not break LAPACK. Reverting this change fixes it, which means the new test
for pr105473 will fail. I have an idea where to put this check in
read_complex() but I have not finished or tested it.

Jeffrey, if you would like me to push this, let me know. We can mark
pr105473.f90 in the test suite to XFAIL or comment out the one check there that
fails.

diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index fb3f7dbc34d..c178acd61a5 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -471,8 +471,6 @@ eat_separator (st_parameter_dt *dtp)
 case ',':
   if (dtp->u.p.current_unit->decimal_status == DECIMAL_COMMA)
{
- generate_error (&dtp->common, LIBERROR_READ_VALUE,
-  "Comma not allowed as separator with DECIMAL='comma'");
  unget_char (dtp, c);
  break;
}

[Bug libgcc/114327] New: wrong code with _BitInt() modulo at -O0

2024-03-13 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327

Bug ID: 114327
   Summary: wrong code with _BitInt() modulo at -O0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57688
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57688&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc testcase.c
$ ./a.out 
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9441-20240312154250-gef79c64cb57-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-9441-20240312154250-gef79c64cb57-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240312 (experimental) (GCC)

[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-03-13

--- Comment #1 from Andrew Pinski  ---
  _1 = d1_5(D) ^ d2_6(D);
  _2 = _1 & 43981;
  _10 = d1_5(D) != d2_6(D);
  _11 = _2 == 0;
  _12 = _10 | _11;

(d1 != d2) | (((d1 ^ d2) & CST) == 0)

Confirmed.

Obviously, if the first part (d1 != d2) is false then d1 ^ d2 will be 0.

This will work, though maybe there is another place where this can be handled
...

(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_and
(bit_xor @0 @1)
@2)
   @3))
 (bit_ior @n4 
  (cmp { build_zero_cst (TREE_TYPE (@0)); } @3))
)
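
A quick sketch of what the pattern buys in the constant case from the GIMPLE dump (function names are illustrative): the comparison against 0 turns into 0 == 0 after the rewrite, so the whole expression is always true.

```c
#include <assert.h>

/* Source-level form of the GIMPLE in this comment (CST = 43981). */
int a_or_b(unsigned d1, unsigned d2)
{
  return (d1 != d2) | (((d1 ^ d2) & 43981u) == 0);
}

/* If d1 != d2 the first operand is 1; otherwise d1 ^ d2 is 0 and the
   masked value compares equal to 0.  Check a small range exhaustively.  */
void check_always_true(void)
{
  for (unsigned d1 = 0; d1 < 256; d1++)
    for (unsigned d2 = 0; d2 < 256; d2++)
      assert(a_or_b(d1, d2) == 1);
}
```

The check covers only small values, but the case split in the comment holds for any inputs.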

[Bug fortran/114023] complex part%ref of complex named constant array cannot be used in an initialization expression.

2024-03-13 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114023

--- Comment #3 from Steve Kargl  ---
On Wed, Mar 13, 2024 at 06:02:58PM +, jvdelisle at gcc dot gnu.org wrote:
> 
> --- Comment #2 from Jerry DeLisle  ---
> Steve, Anuj is interested in digging in on this one. This will be a learning
> experience.
> 

That's fine with me.  If Anuj or you have questions or
want me to look at something, just ping me.

[Bug c++/102345] [modules] Cannot define a module interface unit for anything in

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102345

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Keywords|rejects-valid   |diagnostic

--- Comment #4 from Patrick Palka  ---
IIUC we're correct to reject since built-ins are implicitly attached to the
global module and here we're trying to redeclare one in another module?

Perhaps the diagnostic could be improved here though.  Clang gives

  error: declaration of 'operator new' in module newdel follows declaration in
the global module

[Bug c++/103524] [meta-bug] modules issue

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 101000, which changed state.

Bug 101000 Summary: ICE when trying to import the 
absl/container/flat_hash_map.h as a header module
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101000

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/101000] ICE when trying to import the absl/container/flat_hash_map.h as a header module

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101000

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #2 from Patrick Palka  ---
This seems to work with GCC trunk now.

[Bug target/114310] [11/12/13/14 Regression] [aarch64] __sync_val_compare_and_swap fails on __int128_t with newval = 0

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114310

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||jakub at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Created attachment 57687
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57687&action=edit
gcc14-pr114310.patch

Untested fix.
The lack of aarch64_reg_or_zero/rZ on the desired operand of
aarch64_compare_and_swapti_lse looks correct, because the instructions expect
a pair of registers, so one can't use xzr, xzr there.

[Bug fortran/114023] complex part%ref of complex named constant array cannot be used in an initialization expression.

2024-03-13 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114023

Jerry DeLisle  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-03-13

--- Comment #2 from Jerry DeLisle  ---
Steve, Anuj is interested in digging in on this one. This will be a learning
experience.

[Bug libgcc/111731] [13/14 regression] gcc_assert is hit at libgcc/unwind-dw2-fde.c#L291

2024-03-13 Thread dimitar.yordanov at sap dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731

--- Comment #17 from Dimitar Yordanov  ---
I've executed more tests and see another one failing. This time "begin" is
inside another range, not the one that gets calculated with this "begin". So
there is again an overlap in the btree. Could we maybe use two trees, one
for "begin" and one for the ranges?

[Bug fortran/114001] is_contiguous considers unlimited polymorphic dummy always as contiguous

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114001

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:11caf47b599568c6c6f5a12cf8e21f50778176d3

commit r14-9454-g11caf47b599568c6c6f5a12cf8e21f50778176d3
Author: Harald Anlauf 
Date:   Tue Mar 12 22:58:39 2024 +0100

Fortran: fix IS_CONTIGUOUS for polymorphic dummy arguments [PR114001]

gcc/fortran/ChangeLog:

PR fortran/114001
* expr.cc (gfc_is_simply_contiguous): Adjust logic so that CLASS
symbols are also handled.

gcc/testsuite/ChangeLog:

PR fortran/114001
* gfortran.dg/is_contiguous_4.f90: New test.

[Bug c++/99000] [modules] declaration std::__copy_move_a2 conflicts with import

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99000

Patrick Palka  changed:

   What|Removed |Added

 CC||iains at gcc dot gnu.org

--- Comment #3 from Patrick Palka  ---
*** Bug 110447 has been marked as a duplicate of this bug. ***

[Bug c++/110447] [modules] unexpected attachment of GMF decls to a named module.

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Patrick Palka  ---
dup of PR99000 AFAICT

*** This bug has been marked as a duplicate of bug 99000 ***

[Bug c++/103524] [meta-bug] modules issue

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 110447, which changed state.

Bug 110447 Summary: [modules] unexpected attachment of GMF decls to a named 
module.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/103524] [meta-bug] modules issue

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 106363, which changed state.

Bug 106363 Summary: [13 Regression] [modules] ICE using-declaration of imported 
name in the same namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106363

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/106363] [13 Regression] [modules] ICE using-declaration of imported name in the same namespace

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106363

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
 CC||ppalka at gcc dot gnu.org
   Target Milestone|13.3|14.0

--- Comment #8 from Patrick Palka  ---
IIUC this checking-only ICE is not actually a regression so let's mark this as
fixed for 14 only.

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193

2024-03-13 Thread amacleod at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #23 from Andrew Macleod  ---
Created attachment 57686
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57686&action=edit
another patch



(In reply to Richard Biener from comment #22)
> (In reply to Andrew Macleod from comment #21)

> > 
> > And have that all work with general trees expressions.. That would solve
> > much of this for you?
> 
> Yes, I wouldn't mind if range_on_{entry,exit} handle general tree
> expressions,
> there's enough APIs to be confused with already ;)
> 
> > 

I promoted range_on_exit and range_on_entry to be part of the API in this
patch. This brings value_query in line with ranger's basic 5 routine API.  It
also tweaks ranger's versions to handle tree expressions.  It bootstraps and
shows no regressions, with the caveat that I haven't actually tested the usage
of range_on_entry and range_on_exit with arbitrary trees.  As you can see, I
didn't change much... so it should work OK.

> > 
> > 
> > > 
> > > Interestingly enough we somehow still need the
> > > 
> > 
> > > 
> > > hunk of Andrews patch to do it :/
> > > 
> > 
> > That probably means there is another call somewhere in the chain with no
> > context. However, I will say that functionality is more important than it
> > seems. Should have been there from the start :-P.
> 
> Possibly yes.  It might be we fill rangers cache with VARYING and when
> we re-do the query as a dependent one but with context we don't recompute
> it?  I also only patched up a single place in SCEV with the context so
> I possibly missed some others that end up with a range query, for example
> through niter analysis that might be triggered.


My guess is the latter. Without a context and with that change, ranger
evaluates the definition with the context at the location of the def, then
simply uses that value.  If it depends on anything that changes later, the
temporal cache should indicate it's out of date and trigger a new fold using
current values.

[Bug c++/114292] [11/12/13/14 Regression] ICE with a generic (templated) lambda capturing a constant for VLA allocation

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114292

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||mpolacek at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org
   Priority|P3  |P2

--- Comment #4 from Jakub Jelinek  ---
void
foo (int c)
{
  constexpr int r = 4;
  [&] (auto) { int n = r * c; int t[n]; } (0);
  [&] (auto) { int t[c]; } (0);
  [&] (auto) { int t[r]; } (0);
  [&] (auto) { int t[c * 4]; } (0);
}

works fine though.

[Bug c++/114292] [11/12/13/14 Regression] ICE with a generic (templated) lambda capturing a constant for VLA allocation

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114292

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Started with r8-7213-g1577f10a637352b4fe7fb4a4c0fd672a96c84f58

[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #9 from Jakub Jelinek  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #8)
> FWIW, the iconv conversion tables in /usr/lib/iconv can be regenerated
> from the OpenSolaris sources, modified not to do that '?' conversion.
> Worked for a quick check for the UTF-8 -> ASCII example, but the '?' is
> more prevalent and would need to be eradicated upstream.

If it is always '?' used instead of unknown character, we could also have some
hack on the libcpp side for it.
Like (but limited to Solaris hosts) in convert_using_iconv when converting from
SOURCE_CHARSET to some other character set don't try to convert the whole UTF-8
string at once, but split it into chunks at u'?' characters, so
foo???bar?baz?qux
would be iconv converted as
foo
???
bar
?
baz
?
qux
chunks.  And when converting the non-? chunks, it would after the conversion
check for the '?' character (in the destination character set - that is
something that perhaps could be queried during initialization after iconv_open)
and treat it as an error if it appeared there.  Or always convert also back to
UTF-8 and check if it has more '?' characters than the source.
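
The chunking step described above can be sketched roughly as follows. This is only an illustration of the splitting idea, not libcpp code: the function name split_at_question_marks is hypothetical, and the actual iconv calls and the Solaris-only guard are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split `s` into maximal runs that either consist
// only of '?' characters or contain no '?' at all, preserving order.
// Each non-'?' run could then be converted on its own, and any '?' that
// shows up in its converted output would indicate a failed conversion.
std::vector<std::string> split_at_question_marks(std::string_view s)
{
  std::vector<std::string> chunks;
  std::size_t i = 0;
  while (i < s.size())
    {
      const bool is_q = (s[i] == '?');
      std::size_t j = i;
      // Extend the run while the characters stay on the same side.
      while (j < s.size() && (s[j] == '?') == is_q)
        ++j;
      chunks.emplace_back(s.substr(i, j - i));
      i = j;
    }
  return chunks;
}
```

For the string from the comment, foo???bar?baz?qux, this yields the seven chunks listed there, in the same order.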

[Bug ada/106037] internal error with Aggregate aspect on array type

2024-03-13 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106037

Eric Botcazou  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |13.3
Summary|ICE with Aggregate aspect   |internal error with
   ||Aggregate aspect on array
   ||type
 Resolution|--- |FIXED

--- Comment #5 from Eric Botcazou  ---
commit ec48b99c24a422bf97af91e82203d23b69094e7c
Author: Marc Poulhiès 
Date:   Wed Mar 8 20:39:45 2023 +0100

ada: Fix error message for Aggregate aspect

The error message was wrongly using % instead of & in the format string,
causing the displayed message to refer to incorrect names in some cases.

gcc/ada/

* sem_ch13.adb (Check_Aspect_At_Freeze_Point): fix format string,
use existing local Ident.

commit 3da0e4ae25f15949f87e74aa96a03b47e51a9ff3
Author: Marc Poulhiès 
Date:   Mon Mar 6 12:15:13 2023 +0100

ada: Fix (again) incorrect handling of Aggregate aspect

Previous fix stopped the processing of the Aggregate aspect early,
skipping the call to Record_Rep_Item, making later call to
Resolve_Container_Aggregate fail.

Also, the previous fix would not handle correctly the case where the
type is private and the check for non-array type can only be done at the
freeze point with the full type.

Adapt the resolving of the aspect when the input is not correct and the
parameters can't be resolved.

gcc/ada/

* sem_ch13.adb (Analyze_One_Aspect): Call Record_Rep_Item.
   (Check_Aspect_At_Freeze_Point): Check the aspect is specified on
non-array type only...
(Analyze_One_Aspect): ... instead of doing it too early here.
* sem_aggr.adb (Resolve_Container_Aggregate): Do nothing in case
the parameters failed to resolve.

commit fd694822ca6eda8b08fea10fcabdb0ad508a963e
Author: Marc Poulhiès 
Date:   Tue Feb 28 17:10:29 2023 +0100

ada: Fix incorrect handling of Aggregate aspect

This change fixes 2 incorrect handlings of the aspect.
The arguments are now correctly resolved and the aspect is rejected on
non array types.

gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Mark Aggregate aspect as
needing delayed resolution and reject the aspect on non-array
type.

[Bug target/99829] MVE: ICE in lra_assign at -O3

2024-03-13 Thread vmakarov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99829

--- Comment #7 from Vladimir Makarov  ---
(In reply to Maxim Kuvyrkov from comment #5)
> 
> Where did you see the timeouts, btw?

Sorry, I glanced at c logs and interpreted it wrongly.  Please, discard my
previous comment.

I should have read the PR more carefully.  I tried the C compiler instead of
the C++ one, so I did not reproduce the bug.  But the bug really is
present with the C++ compiler.

I'll work on this PR and try to fix it this week or next.

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-03-13 Thread hubicka at ucw dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #57 from Jan Hubicka  ---
> So, we can punt on differences there (that is desirable for backporting and
> maybe GCC 14 too), or we could at that point populate an int vector, which maps
Yep, that is what I do.
I had a bug in that so I am re-running (I forgot to check that caller and
callee argument counts match, and this causes an ICE during LLVM LTO link).
It seems these extra checks make no difference in practice.
During bootstrap there are no pairs of functions where the new checks punt
on a value range difference or jump function difference and that would be
merged otherwise.

The most common cases where we could merge but don't are those triggered
by TBAA.
> the callee
> vector indexes to indexes in the callee vector in the other candidate function.
> If unsuccessful, we just free the vector, if successful, we first walk all the
> callees and union stuff in there using that vector.
This is the plan for metadata merging. A small complication here is that
ICF works by comparing bodies to the leader of an equivalence class, but this
leader is not necessarily the surviving function body.  So if we
compared A to L (leader) and B to L and then decided to replace A by B, we
need to be able to combine the permutations so we know how to map call
sites in A to ones in B.  The same is true about SSA names and basic
blocks.  I have a patch for that for next stage1.
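
As a rough illustration of combining the permutations (not the actual ICF patch; the function and vector names are hypothetical), assume each comparison against the leader L produced an index map from call sites of the compared function to call sites of L. The A-to-B map is then the A-to-L map followed by the inverse of the B-to-L map:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// a_to_l[i] is the call-site index in leader L matched with call site i
// of A; b_to_l[j] likewise for B.  Both are assumed to be permutations
// of 0..n-1.  The result maps call site i of A to its counterpart in B.
std::vector<std::size_t>
compose_via_leader(const std::vector<std::size_t> &a_to_l,
                   const std::vector<std::size_t> &b_to_l)
{
  assert(a_to_l.size() == b_to_l.size());
  const std::size_t n = a_to_l.size();

  // Invert B -> L so that we can go from a leader index back to B.
  std::vector<std::size_t> l_to_b(n);
  for (std::size_t j = 0; j < n; ++j)
    l_to_b[b_to_l[j]] = j;

  // Chain A -> L -> B.
  std::vector<std::size_t> a_to_b(n);
  for (std::size_t i = 0; i < n; ++i)
    a_to_b[i] = l_to_b[a_to_l[i]];
  return a_to_b;
}
```

With a_to_l = {2, 0, 1} and b_to_l = {1, 2, 0}, call site 0 of A matches site 2 of L, which is site 1 of B, and so on, giving {1, 2, 0}. The same composition would apply to SSA name and basic block maps.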

[Bug ada/106037] ICE with Aggregate aspect

2024-03-13 Thread simon at pushface dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106037

simon at pushface dot org changed:

   What|Removed |Added

 CC||simon at pushface dot org

--- Comment #4 from simon at pushface dot org ---
This is illegal code: 'aspect "Aggregate" can only be applied to non-array
type'.

See https://groups.google.com/g/comp.lang.ada/c/FHWcqk1SWRM/m/sYTWUHQxAgAJ,
and the (slightly unemphatically worded) ARM 4.3.5(2), "For a type other
than an array type, the following type-related operational aspect may be
specified"

GNAT 14.0.1 20240223 (experimental)
Copyright 1992-2024, Free Software Foundation, Inc.


Compiling: container_aggregates.adb
Source file time stamp: 2024-03-13 15:04:00
Compiled at: 2024-03-13 15:04:53

 1. procedure Container_Aggregates is
 2.
 3.type Array_Type is
 4.  array (1 .. 10) of Integer
 5.with Aggregate => (Empty => Empty_Array);
12 3
>>> error: aspect "Aggregate" can only be applied to non-array type
>>> error: incomplete specification for aggregate
>>> error: object "Empty_Array" cannot be used before end of its
declaration
>>> error: improper aggregate operation for "Array_Type"

 6.
 7.Empty_Array : constant Array_Type := [1..10 => 123];
 8.
 9. begin
10.null;
11. end Container_Aggregates;
12.

 12 lines: 4 errors

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1

2024-03-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261

--- Comment #10 from Alexander Monakov  ---
Indeed, but OTOH according to bug 84402 comment 58 it caused a noticeable hit
on gimple-match.cc compilation:

733a1b777f16cd397b43a242d9c31761f66d3da8 13th January 2023
sched-deps: do not schedule pseudos across calls [PR108117] (Alexander Monakov)
Stage 2: +14%
Stage 3: +9%


In any case, if the proposed band-aid is unnecessary, that's fine with me.

[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs

2024-03-13 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #8 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #7 from Jakub Jelinek  ---
> (In reply to r...@cebitec.uni-bielefeld.de from comment #6)
>> > --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> >> --- Comment #4 from Jakub Jelinek  ---
>> >> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
>> >> that program is ill-formed if some character lacks encoding in the execution
>> >> character set, I'm afraid the Solaris iconv behavior results in violation of
>> 
>> Although I can barely wrap my head around the standardese there, I had a
>> look at n4928 (the last? C++23 draft), which has a different wording
>> here (p.25, 5.13.3):
>
> The testcase is for a C++26 feature, which made those ill-formed.

Should have been obvious from the pathname ;-(  N4971 has that wording...

>> The current Solaris iconv behaviour certainly isn't particularly
>> intuitive and I'll ask the Solaris engineers about it.  However, there's
>> the question what to do about the testcase?  Just xfail it on Solaris or
>> omit just the two affected subtests there?
>
> xfailing is one possibility, but then on Solaris we'll never support C++26
> properly.

I guess it's the best solution in the short term (GCC 14), though.

> Or require using GNU libiconv rather than Solaris iconv if it can't deal with
> that?

At least document the suggestion in install.texi; I wouldn't make it a
hard requirement yet.  I'll also wait what the Solaris engineers can
provide on background for the current behaviour.

FWIW, the iconv conversion tables in /usr/lib/iconv can be regenerated
from the OpenSolaris sources, modified not to do that '?' conversion.
Worked for a quick check for the UTF-8 -> ASCII example, but the '?' is
more prevalent and would need to be eradicated upstream.

[Bug ada/111909] Filename case sensitivity defaulted wrongly on macOS

2024-03-13 Thread simon at pushface dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909

simon at pushface dot org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from simon at pushface dot org ---
Fixed on mainline.

[Bug libstdc++/114325] std::format gives incorrect results for negative numbers

2024-03-13 Thread mwd at md5i dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325

Michael Duggan  changed:

   What|Removed |Added

 CC||mwd at md5i dot com

--- Comment #1 from Michael Duggan  ---
I will note that, in experiments, this seems to solely happen with "{}".  If
anything else is in the format string, it works correctly.  This is probably a
bug in the fairly recent codepath that optimizes the "{}" case.

[Bug tree-optimization/94094] [meta-bug] store-merging and/or bswap load/store-merging missed optimizations

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94094
Bug 94094 depends on bug 114319, which changed state.

Bug 114319 Summary: htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||jakub at gcc dot gnu.org
 Status|NEW |RESOLVED

--- Comment #7 from Jakub Jelinek  ---
Fixed for GCC 14.

[Bug target/113618] [14 Regression] AArch64: memmove idiom regression

2024-03-13 Thread wilco at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618

Wilco  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Wilco  ---
Fixed.

[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:74bca21db31e3f4ab6543b56c3f26b4dfe586fef

commit r14-9453-g74bca21db31e3f4ab6543b56c3f26b4dfe586fef
Author: Jakub Jelinek 
Date:   Wed Mar 13 15:34:59 2024 +0100

store-merging: Match bswap64 on 32-bit targets with bswapsi2 [PR114319]

gimple-ssa-store-merging.cc tests bswap_optab in 3 different places,
in 2 of them it has special exception for double-word bswap using pair
of word-mode bswap optabs, but in the last one it doesn't.

The following patch changes even the last spot.
We don't handle 128-bit bswaps in the passes at all, because currently we
just use uint64_t to represent the byte reshuffling (we'd need to use
offset_int or something like that instead) and we don't have
__builtin_bswap128 nor type-generic __builtin_bswap, so there is nothing
for 64-bit targets there.

2024-03-13  Jakub Jelinek  

PR middle-end/114319
* gimple-ssa-store-merging.cc
(imm_store_chain_info::try_coalesce_bswap): For 32-bit targets
allow matching __builtin_bswap64 if there is bswapsi2 optab.

* gcc.target/i386/pr114319.c: New test.
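
The double-word case mentioned in the commit message relies on the fact that a 64-bit byte swap can be built from two 32-bit byte swaps with the halves exchanged. A minimal sketch of that identity, using the GCC builtin __builtin_bswap32 (the wrapper name is hypothetical, and this is not the code the pass emits):

```cpp
#include <cassert>
#include <cstdint>

// Byte-swap a 64-bit value using only 32-bit swaps: swap the bytes
// within each half, then exchange the two halves.
std::uint64_t bswap64_from_two_bswap32(std::uint64_t x)
{
  const std::uint32_t lo = static_cast<std::uint32_t>(x);
  const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
  return (static_cast<std::uint64_t>(__builtin_bswap32(lo)) << 32)
         | __builtin_bswap32(hi);
}
```

For any input this should agree with __builtin_bswap64, which is the form a 32-bit target with only a word-mode bswap pattern has to fall back on.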

[Bug tree-optimization/114326] New: Missed optimization for A || B when !B implies A.

2024-03-13 Thread manolis.tsamis at vrull dot eu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326

Bug ID: 114326
   Summary: Missed optimization for A || B when !B implies A.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: manolis.tsamis at vrull dot eu
  Target Milestone: ---

The function below doesn't fold to return 0;

int cmp1(uint64_t d1, uint64_t d2) {
  if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
return 0;
  return foo();
}

while the following function does: 

int cmp2(uint64_t d1, uint64_t d2) {
  if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
return 0;
  return foo();
}

The functions are equivalent since the lhs and rhs of || don't have side
effects.

In general, the pattern here is that a side-effect-free expression a || b where !b
implies a should be optimized to true. As in the testcase above, a doesn't
necessarily imply !b. Something similar could be stated for && expressions.
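
To see why the condition is always true: !b means d1 == d2, so d1 ^ d2 is 0 and the masked comparison a holds. A small brute-force check of this, as a sketch (the helper names are hypothetical and only a small range of values is covered):

```cpp
#include <cassert>
#include <cstdint>

// The condition from cmp1: a || b with
//   a = ((d1 ^ d2) & 0xabcd) == 0  and  b = (d1 != d2).
bool cond_a_or_b(std::uint64_t d1, std::uint64_t d2)
{
  return ((d1 ^ d2) & 0xabcd) == 0 || d1 != d2;
}

// Returns true if the condition holds for every pair (d1, d2) with
// both values below `limit`.
bool always_true_below(std::uint64_t limit)
{
  for (std::uint64_t d1 = 0; d1 < limit; ++d1)
    for (std::uint64_t d2 = 0; d2 < limit; ++d2)
      if (!cond_a_or_b(d1, d2))
        return false;
  return true;
}
```

The check passes for any limit, since the only way b can be false is d1 == d2, which forces a to be true.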

Complementary godbolt link: https://godbolt.org/z/qK5bYf36T

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #56 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #55)
> It is however not hard to match the jump function while walking gimple
> bodies and comparing statements, which is backportable and localized. I am
> still waiting for my statistics to converge and will send it soon.

So, we can punt on differences there (that is desirable for backporting and
maybe GCC 14 too), or we could at that point populate an int vector, which maps
the callee
vector indexes to indexes in the callee vector in the other candidate function.
If unsuccessful, we just free the vector, if successful, we first walk all the
callees and union stuff in there using that vector.

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1

2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261

--- Comment #9 from Richard Biener  ---
As far as I understand the testcase is from fuzzing so not "real", so I think
this proposed "fix" isn't necessary (and it's not a real fix, adding a
setjmp call at the end of the function will restore it).

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-03-13 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

--- Comment #55 from Jan Hubicka  ---
> Anyway, can we in the spot my patch changed just walk all
> source->node->callees cgraph_edges, for each of them find the corresponding
> cgraph_edge in the alias and for each walk all the jump_functions recorded
> and union their m_vr?
> Or is that something that can't be done in LTO for some reason?

That was my first idea too, but the problem is that ICF has (very limited)
support for matching functions which differ in the order of their basic blocks:
it computes a hash of every basic block and orders them by their hash prior to
comparing. This seems half-finished since e.g. the order of edges in PHIs has to
match exactly.

Callee lists are officially randomly ordered, but in practice they follow the
order of basic blocks (as they are built this way).  However, since BB orders
can differ, just walking both callee sequences and comparing pairwise does not
work. This also makes merging the information harder, since we no longer have
the BB map at the time we decide to merge.

It is however not hard to match the jump function while walking gimple bodies
and comparing statements, which is backportable and localized. I am still
waiting for my statistics to converge and will send it soon.

[Bug libstdc++/114325] New: std::format gives incorrect results for negative numbers

2024-03-13 Thread luigighiron at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325

Bug ID: 114325
   Summary: std::format gives incorrect results for negative
numbers
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: luigighiron at gmail dot com
  Target Milestone: ---

The following code generates an incorrect result with libstdc++:

std::format("{}",-100)

From testing on godbolt this seems to generate the string "-1\", then when
printed it looks like -10. This seems exclusive to GCC 14, and happens for any
numbers less than -99.

[Bug middle-end/111523] Unexpected performance regression with -ftrivial-auto-var-init=zero for e.g. systemctl unmask

2024-03-13 Thread qinzhao at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111523

--- Comment #10 from qinzhao at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #9)

> Anyways systemd has now changed the buffer to 256 which is much much smaller
> and for most calls enough in size before needing to reallocate the buffer
> that it has now become fast.
> 
> Anyways -ftrivial-auto-var-init=zero just exposed a performance (stack size)
> issue with already existing issue inside the systemd code. A good thing
> really. 
> 
> So closing as moved.

Thanks a lot for the analysis and the solution to this performance issue.
Really appreciated.

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1

2024-03-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261

--- Comment #8 from Alexander Monakov  ---
If we want to get rid of the compilation time regression sooner rather than
later, I can suggest limiting my change only to functions that call setjmp:

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index c23218890f..ae23f55274 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -3695,7 +3695,7 @@ deps_analyze_insn (class deps_desc *deps, rtx_insn *insn)

   CANT_MOVE (insn) = 1;

-  if (!reload_completed)
+  if (!reload_completed && cfun->calls_setjmp)
     {
       /* Scheduling across calls may increase register pressure by extending
          live ranges of pseudos over the call.  Worse, in presence of setjmp


That way we retain the "correctness fix" part of r13-5154-g733a1b777f1 and keep
the previous status quo on normal functions (quadraticness on asms like
demonstrated in comment #5 would also remain).

[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #7 from Jakub Jelinek  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #6)
> > --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> >> --- Comment #4 from Jakub Jelinek  ---
> >> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
> >> that program is ill-formed if some character lacks encoding in the execution
> >> character set, I'm afraid the Solaris iconv behavior results in violation of
> 
> Although I can barely wrap my head around the standardese there, I had a
> look at n4928 (the last? C++23 draft), which has a different wording
> here (p.25, 5.13.3):

The testcase is for a C++26 feature, which made those ill-formed.

> The current Solaris iconv behaviour certainly isn't particularly
> intuitive and I'll ask the Solaris engineers about it.  However, there's
> the question what to do about the testcase?  Just xfail it on Solaris or
> omit just the two affected subtests there?

xfailing is one possibility, but then on Solaris we'll never support C++26
properly.
Or require using GNU libiconv rather than Solaris iconv if it can't deal with
that?

[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"

2024-03-13 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304

Jeffrey A. Law  changed:

   What|Removed |Added

Summary|[13/14 Regression]  |libgfortran I/O – bogus
   |libgfortran I/O – bogus |"Semicolon not allowed as
   |"Semicolon not allowed as   |separator with
   |separator with  |DECIMAL='point'"
   |DECIMAL='point'"|
 CC||law at gcc dot gnu.org

--- Comment #16 from Jeffrey A. Law  ---
Per c#12, c#13, c#14 & c#15, dropping the regression marker, but leaving open.

[Bug tree-optimization/114322] [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033

2024-03-13 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P2
 CC||law at gcc dot gnu.org

[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98

2024-03-13 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P1
 CC||law at gcc dot gnu.org

[Bug c++/103524] [meta-bug] modules issue

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 98462, which changed state.

Bug 98462 Summary: [modules] ICE when making iomanip module and all modules after it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98462

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug c++/98462] [modules] ICE when making iomanip module and all modules after it

2024-03-13 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98462

Patrick Palka  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
 CC||ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
Seems fixed even in GCC 11.

[Bug c++/111075] [14 Regression] ICE on g++.dg/torture/tail-padding1.C on darwin

2024-03-13 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111075

Marek Polacek  changed:

   What|Removed |Added

   Priority|P1  |P2
 CC||mpolacek at gcc dot gnu.org

--- Comment #2 from Marek Polacek  ---
darwin -> probably not P1.

[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs

2024-03-13 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652

--- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> --- Comment #4 from Jakub Jelinek  ---
>> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
>> that program is ill-formed if some character lacks encoding in the execution
>> character set, I'm afraid the Solaris iconv behavior results in violation of

Although I can barely wrap my head around the standardese there, I had a
look at n4928 (the last? C++23 draft), which has a different wording
here (p.25, 5.13.3):

(3.1) — A character-literal with a c-char-sequence consisting of a
 single basic-c-char, simple-escape-sequence, or
 universal-character-name is the code unit value of the
 specified character as encoded in the literal’s associated
 character encoding.

 [Note 2 : If the specified character lacks representation in
 the literal’s associated character encoding or if it cannot be
 encoded as a single code unit, then the literal is a
 non-encodable character literal. —end note

> I've not yet tried to understand what either iconv(3) has to say on the
> matter.

Digging further, Solaris iconv(3C) has

   If  iconv()  encounters  a character in the input buffer that is legal,
   but for which an identical character does not exist in the target  code
   set,  iconv()  performs  an  implementation-defined  conversion on this
   character.

which exactly matches XPG7, so the behaviour seems to be in line with
the standards.

I've also found that Solaris 11 has iconvctl(3C) (obviously patterned
after GNU libiconv) with

   ICONV_SET_TRANSLITERATE

   With  this  request  and  a  pointer to a const int with a non-zero
   value, caller can instruct the current conversion to  transliterate
   non-identical characters from the input buffer during the code con-
   version  as  much  as it can. The value of zero, on the other hand,
   turns it off.

However,

int transliterate = 0;
iconvctl (cd, ICONV_SET_TRANSLITERATE, &transliterate);

doesn't make a difference.

The current Solaris iconv behaviour certainly isn't particularly
intuitive and I'll ask the Solaris engineers about it.  However, there's
the question what to do about the testcase?  Just xfail it on Solaris or
omit just the two affected subtests there?

[Bug fortran/114324] New: AVX2 vectorisation performance regression with gfortran 13/14

2024-03-13 Thread mjr19 at cam dot ac.uk via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324

Bug ID: 114324
   Summary: AVX2 vectorisation performance regression with
gfortran 13/14
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjr19 at cam dot ac.uk
  Target Milestone: ---

Created attachment 57685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57685&action=edit
Test case of loop showing performance regression

The attached loop, when compiled with "-Ofast -mavx2" runs over 20% slower on
gfortran 13 or (pre-release) 14 than it does on 12.x. Precise versions tested
12.3.0, 13.1.0 and GCC 14 downloaded on 11th March.

Precise slowdown depends on CPU. Tested on Haswell and Kaby Lake desktops.

Adding "-fopenmp" changes the code produced, but 12.3 still beats later
compilers. The analysis below is without -fopenmp.

It appears (to me) that 12.x is using the full width of the ymm registers, and
has a loop of 17 vector instructions, and some scalar loop control, which
performs two iterations of the original Fortran loop.

13.x manages more aggressive unrolling, performing four iterations per pass,
but uses about 54 vector instructions, rather than the 34 one might naively
expect. More instructions does not necessarily mean slower, but here it does.

I attach the test case to which I refer. I would be happy to add the trivial
timing program to show how I have been timing it. The full code is an FFT, but
the test case has been reduced to functional nonsense.

(I note that in other areas there are pleasing performance gains in gfortran
13.x. It is a pity that this partially cancels them.)

[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98

2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98

2024-03-13 Thread acoplan at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323

--- Comment #1 from Alex Coplan  ---
Hmm, so in 043t.mergephi1 we have:

uint32x4_t foo ()
{
  const uint32_t D.13439[4];
  uint32x4_t V0;

   :
  D.13439 = *.LC0;
  V0_3 = vld1q_u32 (&D.13439);
  D.13439 ={v} {CLOBBER(eos)};
  return V0_3;

}

but then 044t.dse1 says:

  Deleted dead store: D.13439 = *.LC0;

leaving us with a load of uninitialized memory.

[Bug target/114323] New: [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98

2024-03-13 Thread acoplan at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323

Bug ID: 114323
   Summary: [14 Regression] MVE vector load intrinsic miscompiled
since r14-5622-g4d7647edfd7d98
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

The following testcase:

#include <arm_mve.h>

uint32x4_t foo (void) {
  uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4}));
  return V0;
}

is miscompiled with -O2 -march=armv8.1-m.main+mve -mfloat-abi=hard on
arm-none-eabi.  Since r14-5622-g4d7647edfd7d985fbefe13de03c8bc2e3a74fc61 we
generate:

foo:
        sub     sp, sp, #16
        vldrw.32        q0, [sp]
        add     sp, sp, #16
        bx      lr

i.e. we do a vector load from uninitialized stack memory.  GCC 13 used to give:

foo:
        sub     sp, sp, #16
        mov     ip, sp
        ldr     r3, .L4
        ldm     r3, {r0, r1, r2, r3}
        stm     ip, {r0, r1, r2, r3}
        vldrw.32        q0, [ip]
        add     sp, sp, #16
        bx      lr
        .align  2
.L4:
        .word   .LANCHOR0
        .size   foo, .-foo
        .section        .rodata
        .align  2
        .set    .LANCHOR0,. + 0
        .word   1
        .word   2
        .word   3
        .word   4

which, while not optimal, is at least correct.  Here is a full executable
testcase for the testsuite:

#include <arm_mve.h>

__attribute__((noipa))
uint32x4_t foo (void) {
  uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4}));
  return V0;
}

int main(void)
{
  uint32_t buf[4];
  vst1q_u32 (buf, foo());

  for (int i = 0; i < 4; i++)
    if (buf[i] != i+1)
      __builtin_abort ();
}

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-13 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

--- Comment #8 from Maxim Kuvyrkov  ---
Patch posted:
https://patchwork.sourceware.org/project/gcc/patch/20240313105839.2785627-1-maxim.kuvyr...@linaro.org/

[Bug tree-optimization/114322] [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033

2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Last reconfirmed||2024-03-13
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  The issue is we have

 { x_12(D), +, 1 } * stride_11(D)

which doesn't behave the same with respect to overflow as

 { x_12(D) * stride_11(D), +, stride_11(D) }

and because of that we analyze it as


 (int) {(unsigned) x_12(D) * (unsigned) stride_11(D), +, (unsigned)
stride_11(D) }

as it might wrap.  But then the sign-extension to long unsigned int is
no longer affine.

  _1 = x_12(D) + i_20;
  _2 = _1 * stride_11(D);
  _3 = (long unsigned int) _2;
  _4 = _3 * 2;
  _5 = A_13(D) + _4;
  _6 = *_5;

The problematical case is x == N < 0 where the last - N might now
overflow with the new SCEV.

The correctness means that we'll now more often run into these issues
for IVs smaller than pointer width.  With -m32 we can analyze the DR to

Creating dr for *_5
offset from base address: 0
constant offset from base address: 0
step: (ssizetype) ((unsigned int) stride_11(D) * 2)
base alignment: 2
base misalignment: 0
offset alignment: 256
step alignment: 2
base_object: *A_13(D) + (sizetype) ((unsigned int) stride_11(D) *
(unsigned int) x_12(D)) * 2
Access function 0: {0B, +, (unsigned int) stride_11(D) * 2}_1

If you had written

   sum += A[i*stride + x*stride];

it might have worked but unfortunately EVRP transforms this back to
(i+x)*stride because it knows stride isn't zero.

In the end this means it's our failure that we fail to handle

  2 * (unsigned long)({ x_12(D), +, 1 } * stride_11(D))

as valid evolution for further analysis - of course the multiplication
by two in an unsigned type might overflow as well.
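
The wrapping argument above can be illustrated with a small sketch. It is a simplification, not the SCEV code itself: it uses unsigned 32-bit arithmetic with zero-extension (the comment above talks about sign-extension of an int), and the function names are made up for illustration. It only shows that widening a 32-bit product that may wrap does not, in general, give the same values as a 64-bit affine sequence {base, +, step}.

```c
#include <assert.h>
#include <stdint.h>

/* Offset computed the way the loop body does it: the index is formed
   in 32 bits (and may wrap modulo 2^32), then widened and scaled.  */
uint64_t offset_narrow (uint32_t x, uint32_t i, uint32_t stride)
{
  uint32_t idx = (x + i) * stride;
  return (uint64_t) idx * 2;
}

/* Offset computed as a 64-bit affine evolution {base, +, step}.  */
uint64_t offset_affine (uint32_t x, uint32_t i, uint32_t stride)
{
  uint64_t base = (uint64_t) (uint32_t) (x * stride) * 2;
  uint64_t step = (uint64_t) stride * 2;
  return base + (uint64_t) i * step;
}

/* Illustrative check: the two agree while nothing wraps and
   disagree once the 32-bit product wraps.  */
void check_offsets (void)
{
  assert (offset_narrow (5, 7, 3) == offset_affine (5, 7, 3));
  assert (offset_narrow (3, 1, 0x40000000u)
          != offset_affine (3, 1, 0x40000000u));
}
```

With x = 3, i = 1 and stride = 0x40000000 the 32-bit index wraps to 0, while the affine form keeps counting in 64 bits, so the widened value cannot be described by a single base and step.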

[Bug tree-optimization/114322] New: [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033

2024-03-13 Thread hliu at amperecomputing dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322

Bug ID: 114322
   Summary: [14 Regression] SCEV analysis failed for bases like
A[(i+x)*stride] since r14-9193-ga0b1798042d033
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hliu at amperecomputing dot com
  Target Milestone: ---

Compile the following case with: gcc simp.c -Ofast -mcpu=neoverse-n1 -S \
 -fdump-tree-ifcvt -fdump-tree-vect-details-scev

int
foo (short *A, int x, int stride)
{
  int sum = 0;

  if (stride > 1)
    {
      #pragma GCC unroll 1
      for (int i = 0; i < 1024; ++i)
        sum += A[(i + x) * stride];
    }

  return sum;
}

The gimple in the loop is:

  :
  # sum_19 = PHI 
  # i_20 = PHI 
  # ivtmp_37 = PHI 
  _1 = x_12(D) + i_20;
  _2 = _1 * stride_11(D);
  _3 = (long unsigned int) _2;
  _4 = _3 * 2;
  _5 = A_13(D) + _4;
  _6 = *_5;
  _7 = (int) _6;
  sum_15 = _7 + sum_19;


Before the commit (i.e., from pr114074 bug fix), it can be vectorized:

Creating dr for *_5
analyze_innermost: (analyze_scalar_evolution 
  (loop_nb = 1)
  (scalar = _5)
(get_scalar_evolution 
  (scalar = _5)
  (scalar_evolution = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) *
2, +, (long unsigned int) stride_11(D) * 2}_1))
)
success.
(analyze_scalar_evolution 
  (loop_nb = 1)
  (scalar = _5)
(get_scalar_evolution 
  (scalar = _5)
  (scalar_evolution = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) *
2, +, (long unsigned int) stride_11(D) * 2}_1))
)
(instantiate_scev 
  (instantiate_below = 5 -> 3)
  (evolution_loop = 1)
  (chrec = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +,
(long unsigned int) stride_11(D) * 2}_1)
  (res = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +, (long
unsigned int) stride_11(D) * 2}_1))
base_address: A_13(D) + (sizetype) (stride_11(D) * x_12(D)) * 2
offset from base address: 0
constant offset from base address: 0
step: (ssizetype) ((long unsigned int) stride_11(D) * 2)
base alignment: 2
base misalignment: 0
offset alignment: 128
step alignment: 2
base_object: *A_13(D) + (sizetype) (stride_11(D) * x_12(D)) * 2
Access function 0: {0B, +, (long unsigned int) stride_11(D) * 2}_1


After the commit, loop vectorization fails due to a SCEV failure with *_5:

Creating dr for *_5
analyze_innermost: (analyze_scalar_evolution 
  (loop_nb = 1)
  (scalar = _5)
(get_scalar_evolution 
  (scalar = _5)
  (scalar_evolution = _5))
)
(analyze_scalar_evolution 
  (loop_nb = 1)
  (scalar = _5)
(get_scalar_evolution 
  (scalar = _5)
  (scalar_evolution = _5))
)
simp.c:11:10: missed:  failed: evolution of base is not affine.
..
  (res = scev_not_known))


To my understanding, '(i + x) * stride' is a signed integer calculation, in
which overflow is undefined behavior, so the loop should still be vectorized.

[Bug libstdc++/110167] excessive compile time for std::to_array with huge arrays

2024-03-13 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #15 from Jonathan Wakely  ---
Fixed for 13.3 and 12.4

[Bug target/112548] [14 regression] 5% exec time regression in 429.mcf on AMD zen4 CPU (since r14-5076-g01c18f58d37865)

2024-03-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112548

--- Comment #10 from Robin Dapp  ---
(In reply to Sam James from comment #9)
> (In reply to Filip Kastl from comment #8)
> > I'd like to help but I'm afraid I cannot send you the SPEC binaries with PGO
> > applied since SPEC is licensed nor can I give you access to a Zen4 computer.
> > I suppose someone else will have to analyze this bug.
> 
> Could you perhaps send only the gcda files so Robin can build again with
> -fprofile-use?

Yes, that would be helpful.

Or Filip builds the executables himself and posts (some of) the difference
here.  Maybe that also gets us a bit closer to the problem.

[Bug libstdc++/110167] excessive compile time for std::to_array with huge arrays

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167

--- Comment #14 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:ec5da76ad33dcba7858525fdb6b39288631fcd8a

commit r12-10206-gec5da76ad33dcba7858525fdb6b39288631fcd8a
Author: Jonathan Wakely 
Date:   Thu Jun 8 12:24:43 2023 +0100

libstdc++: Optimize std::to_array for trivial types [PR110167]

As reported in PR libstdc++/110167, std::to_array compiles extremely
slowly for very large arrays. It needs to instantiate a very large
specialization of std::index_sequence and then create a very large
aggregate initializer from the pack expansion. For trivial types we can
simply default-initialize the std::array and then use memcpy to copy the
values. For non-trivial types we need to use the existing
implementation, despite the compilation cost.

As also noted in the PR, using a generic lambda instead of the
__to_array helper compiles faster since gcc-13. It also produces
slightly smaller code at -O1, due to additional inlining. The code at
-Os, -O2 and -O3 seems to be the same. This new implementation requires
__cpp_generic_lambdas >= 201707L (i.e. P0428R2) but that is supported
since Clang 10 and since Intel icc 2021.5.0 (and since GCC 10.1).

libstdc++-v3/ChangeLog:

PR libstdc++/110167
* include/std/array (to_array): Initialize arrays of trivial
types using memcpy. For non-trivial types, use lambda
expressions instead of a separate helper function.
(__to_array): Remove.
* testsuite/23_containers/array/creation/110167.cc: New test.

(cherry picked from commit 960de5dd886572711ef86fa1e15e30d3810eccb9)

[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86

2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

--- Comment #5 from Richard Biener  ---
Coalescing successful!
Merged into 1 stores
32 bit bswap implementation found at: _37

looks like we are only merging one store.  Note we cannot recognize
bswap to memory; this is a known issue.  So for the bswap64 we need to
merge to a 64bit store which we never do on a 32bit platform.  We
could with SSE, but apparently we don't try with the bswap trick
at least.  The bswap trick also doesn't seem to consider the split
64bit bswap.  Oddly enough we also fail to merge the other store
(maybe missing a val >> 32 pre-shift "trick").

Possibly could be shown to be a similar issue with a 128bit bswap
on x86_64 which we could emulate with two 64bit bswaps.
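
To make the "split 64bit bswap" concrete, here is a minimal sketch, not the testcase from the PR: the function names are hypothetical, and only the GCC builtins __builtin_bswap32 and __builtin_bswap64 are real. It shows a 64-bit byte swap built from two 32-bit swaps of the halves, next to the byte-by-byte store that an htobe64-like function typically contains.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit byte swap emulated with two 32-bit swaps: swap each half,
   then exchange the halves.  */
uint64_t bswap64_split (uint64_t v)
{
  uint32_t lo = (uint32_t) v;
  uint32_t hi = (uint32_t) (v >> 32);
  return ((uint64_t) __builtin_bswap32 (lo) << 32) | __builtin_bswap32 (hi);
}

/* htobe64-like store: write V to OUT most significant byte first.  */
void store_be64 (unsigned char out[8], uint64_t v)
{
  for (int i = 0; i < 8; i++)
    out[i] = (unsigned char) (v >> (56 - 8 * i));
}

/* Illustrative cross-check against the 64-bit builtin.  */
void check_bswap64_split (void)
{
  uint64_t v = 0x0102030405060708ull;
  unsigned char bytes[8];

  store_be64 (bytes, v);
  assert (bswap64_split (v) == __builtin_bswap64 (v));
  assert (bytes[0] == 0x01 && bytes[7] == 0x08);
}
```

The same shape (two narrower swaps plus an exchange of halves) is what a 128bit swap built from two 64bit bswaps would look like.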

[Bug middle-end/114313] ICE: in limb_access_type, at gimple-lower-bitint.cc:591 with _BitInt() in a bitfield

2024-03-13 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114313

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jakub Jelinek  ---
Fixed.

[Bug middle-end/114313] ICE: in limb_access_type, at gimple-lower-bitint.cc:591 with _BitInt() in a bitfield

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114313

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0613b12dd7f6274a1aac07f295ed51d86c2c85f1

commit r14-9447-g0613b12dd7f6274a1aac07f295ed51d86c2c85f1
Author: Jakub Jelinek 
Date:   Wed Mar 13 10:19:04 2024 +0100

bitint: Fix up lowering of bitfield loads/stores [PR114313]

The following testcase ICEs, because for large/huge _BitInt bitfield
loads/stores we use the DECL_BIT_FIELD_REPRESENTATIVE as the underlying
"var" and indexes into it can be larger than the precision of the
bitfield might normally allow.

The following patch fixes that by passing NULL_TREE type in that case
to limb_access, so that we always return m_limb_type type and don't
do the extra assertions, after all, the callers expect that too.
I had to add the first hunk to avoid ICE, it was using type in one place
even when it was NULL.  But TYPE_SIZE (TREE_TYPE (var)) seems like the
right size to use anyway because the code uses VIEW_CONVERT_EXPR on it.

2024-03-13  Jakub Jelinek  

PR middle-end/114313
* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use
TYPE_SIZE of TREE_TYPE (var) rather than TYPE_SIZE of type.
(bitint_large_huge::handle_load): Pass NULL_TREE rather than
rhs_type to limb_access for the bitfield load cases.
(bitint_large_huge::lower_mergeable_stmt): Pass NULL_TREE rather
than lhs_type to limb_access if nlhs is non-NULL.

* gcc.dg/torture/bitint-62.c: New test.

[Bug fortran/114283] [OpenMP] Dummy procedures/proc pointers and 'defaultmap', 'default', 'firstprivate' etc.

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114283

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:c5037fcee2de438774466e78e46e6ab4df72a7fe

commit r14-9446-gc5037fcee2de438774466e78e46e6ab4df72a7fe
Author: Tobias Burnus 
Date:   Wed Mar 13 09:35:28 2024 +0100

OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

Dummy procedures look similar to variables but aren't - neither in Fortran
nor in OpenMP. As the middle end sees PARM_DECLs, mark them as predetermined
firstprivate for mapping (as already done in gfc_omp_predetermined_sharing).

This does not address the issues related to procedure pointers, which are
still discussed on spec level [see PR].

PR fortran/114283

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_omp_predetermined_mapping): Map dummy
procedures as firstprivate.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/declare-target-indirect-4.f90: New
test.

[Bug sanitizer/112709] [13/14 Regression] address sanitize and returns_twice causes an ICE

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112709

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:6586359e8e4c611dd96129b5d4f24023949ac3fc

commit r14-9445-g6586359e8e4c611dd96129b5d4f24023949ac3fc
Author: Jakub Jelinek 
Date:   Wed Mar 13 09:19:05 2024 +0100

asan: Fix ICE during instrumentation of returns_twice calls [PR112709]

The following patch on top of the previously posted ubsan/gimple-iterator
one handles asan the same.  While the case of returning by hidden reference
is handled differently because of the first recently posted asan patch,
this deals with instrumentation of the aggregates returned in registers
case as well as instrumentation of loads from aggregate memory in the
function arguments of returns_twice calls.

2024-03-13  Jakub Jelinek  

PR sanitizer/112709
* asan.cc (maybe_create_ssa_name, maybe_cast_to_ptrmode,
build_check_stmt, maybe_instrument_call, asan_expand_mark_ifn): Use
gsi_safe_insert_before instead of gsi_insert_before.

* gcc.dg/asan/pr112709-2.c: New test.

[Bug sanitizer/112709] [13/14 Regression] address sanitize and returns_twice causes an ICE

2024-03-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112709

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:364c684c474841e3c9c04e025a5c1bca49705c86

commit r14-9444-g364c684c474841e3c9c04e025a5c1bca49705c86
Author: Jakub Jelinek 
Date:   Wed Mar 13 09:16:45 2024 +0100

gimple-iterator, ubsan: Fix ICE during instrumentation of returns_twice
calls [PR112709]

ubsan, asan (both PR112709) and _BitInt lowering (PR113466) want to
insert some instrumentation or adjustment statements before some statement.
This unfortunately creates invalid IL if inserting before a returns_twice
call, because we require that such calls are the first statement in a basic
block and that we have an edge from the .ABNORMAL_DISPATCHER block to
the block containing the returns_twice call (in addition to other edge(s)).

The following patch adds helper functions for such insertions and uses it
for now in ubsan (I'll post a follow up which uses it in asan and will
work later on the _BitInt lowering PR).

In particular, if the bb with returns_twice call at the start has just
2 edges, one EDGE_ABNORMAL from .ABNORMAL_DISPATCHER and another
(non-EDGE_ABNORMAL/EDGE_EH) from some other bb, it just inserts the
statement or sequence on that other edge.
If the bb has more predecessor edges or the one not from
.ABNORMAL_DISPATCHER is e.g. an EH edge (this latter case likely shouldn't
happen, one would need labels or something like that), the patch splits the
block with returns_twice call such that there is just one edge next to
.ABNORMAL_DISPATCHER edge and adjusts PHIs as needed to make it happen.
The functions also replace uses of PHIs from the returns_twice bb with
the corresponding PHI arguments, because otherwise it would be invalid IL.

E.g. in ubsan/pr112709-2.c (qux) we have before the ubsan pass
   :
  # .MEM_5(ab) = PHI <.MEM_4(9), .MEM_25(ab)(11)>
  # _7(ab) = PHI <_20(9), _8(ab)(11)>
  # .MEM_21(ab) = VDEF <.MEM_5(ab)>
  _22 = bar (*_7(ab));
where bar is returns_twice call and bb 11 has .ABNORMAL_DISPATCHER call,
this patch instruments it like:
   :
  # .MEM_4 = PHI <.MEM_17(ab)(4), .MEM_10(D)(5), .MEM_14(ab)(8)>
  # DEBUG BEGIN_STMT
  # VUSE <.MEM_4>
  _20 = p;
  # .MEM_27 = VDEF <.MEM_4>
  .UBSAN_NULL (_20, 0B, 0);
  # VUSE <.MEM_27>
  _2 = __builtin_dynamic_object_size (_20, 0);
  # .MEM_28 = VDEF <.MEM_27>
  .UBSAN_OBJECT_SIZE (_20, 1024, _2, 0);

   :
  # .MEM_5(ab) = PHI <.MEM_28(9), .MEM_25(ab)(11)>
  # _7(ab) = PHI <_20(9), _8(ab)(11)>
  # .MEM_21(ab) = VDEF <.MEM_5(ab)>
  _22 = bar (*_7(ab));
The edge from .ABNORMAL_DISPATCHER is there just to represent the
returning for 2nd and later times, the instrumentation can't be
done at that point as there is no code executed during that point.
The ubsan/pr112709-1.c testcase includes non-virtual PHIs to cover
the handling of those as well.
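
For readers less familiar with returns_twice calls, a minimal sketch using setjmp, the standard example of such a call, may help. The function name is made up for illustration and the code is unrelated to the testcases in this commit. It only shows that the same call site returns once directly and once more via longjmp, which is what the edge from .ABNORMAL_DISPATCHER models.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Count how many times the setjmp call below returns.  */
int count_setjmp_returns (void)
{
  /* volatile: modified between setjmp and longjmp and read afterwards.  */
  volatile int returns = 0;

  if (setjmp (env) == 0)
    {
      returns++;          /* First, direct return.  */
      longjmp (env, 1);   /* Re-enter the setjmp call site.  */
    }
  else
    returns++;            /* Second return, reached via longjmp.  */

  assert (returns == 2);
  return returns;
}
```

No code runs "before" the call on the second return, which is why instrumentation has to be placed on the normal incoming edge only.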

2024-03-13  Jakub Jelinek  

PR sanitizer/112709
* gimple-iterator.h (gsi_safe_insert_before,
gsi_safe_insert_seq_before): Declare.
* gimple-iterator.cc: Include gimplify.h.
(edge_before_returns_twice_call, adjust_before_returns_twice_call,
gsi_safe_insert_before, gsi_safe_insert_seq_before): New functions.
* ubsan.cc (instrument_mem_ref, instrument_pointer_overflow,
instrument_nonnull_arg, instrument_nonnull_return): Use
gsi_safe_insert_before instead of gsi_insert_before.
(maybe_instrument_pointer_overflow): Use force_gimple_operand,
gimple_seq_add_seq_without_update and gsi_safe_insert_seq_before
instead of force_gimple_operand_gsi.
(instrument_object_size): Likewise.  Use gsi_safe_insert_before
instead of gsi_insert_before.

* gcc.dg/ubsan/pr112709-1.c: New test.
* gcc.dg/ubsan/pr112709-2.c: New test.

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193

2024-03-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #22 from Richard Biener  ---
(In reply to Andrew Macleod from comment #21)
> (In reply to Richard Biener from comment #19)
> 
> > 
> > While ranger has a range_on_exit API this doesn't work on GENERIC expressions
> > as far as I can see but only SSA names but I guess that could be "fixed"
> > given range_on_exit also looks at the last stmt and eventually defers to
> > range_of_expr (or range_on_entry), but possibly get_tree_range needs
> > variants for on_entry/on_exit (it doesn't seem to use its 'stmt' context
> > very consistently, notably not for SSA_NAMEs ...).
> 
> That would appear to be an oversight. That API has not been used very much
> for arbitrary generic trees.  I think the original reason support for tree
> expressions was added was a "try this" for some other PR. It was simple to
> do so we left it in, but it never got any real traction.  At least as far as
> I can recall :-)
> 
> Currently, I think most, if not all, uses of get_tree_range() are either
> !gimple_ssa_range_p() (commonly constants or unsupported types) or ssa_names
> on abnormal edges. 
> 
> For abnormal edges, we ought to be getting the global range directly these
> days instead of calling that routine.   Then in get_tree_range (), we ought
> to be calling range_of_expr for SSA_NAMES with the provided context.  I'll
> poke at that too. The support for general tree expressions changed the
> original intent of the function, and it should be adjusted. 
> 
> As for the on-exit/on-entry bits... we haven't had a need for entry/exit
> outside of ranger in the past.  I had toyed with exporting those routines
> and making them a part of the official API for value-query, but hadn't run
> across the need as yet.
> 
> Let me think about that for a minute. It can certainly be done. I guess we
> really only need an on-entry and on-exit version of range_of_expr to do
> everything.  So if we end up with something like:  
>   range_of_expr (r, expr, stmt)
>   range_of_expr_on_entry  (r, expr, bb)
>   range_of_expr_on_exit (r, expr, bb)
> 
> And have that all work with general trees expressions.. That would solve
> much of this for you?

Yes, I wouldn't mind if range_on_{entry,exit} handle general tree expressions,
there's enough APIs to be confused with already ;)

> 
> 
> 
> > 
> > Interestingly enough we somehow still need the
> > 
> 
> > 
> > hunk of Andrews patch to do it :/
> > 
> 
> That probably means there is another call somewhere in the chain with no
> context. However, I will say that functionality is more important than it
> seems. Should have been there from the start :-P.

Possibly yes.  It might be we fill ranger's cache with VARYING and when
we re-do the query as a dependent one but with context we don't recompute
it?  I also only patched up a single place in SCEV with the context so
I possibly missed some others that end up with a range query, for example
through niter analysis that might be triggered.

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-13 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mkuvyrkov at gcc dot gnu.org

--- Comment #7 from Maxim Kuvyrkov  ---
Working on this, including reviewing gcc.dg/vect/, g++.dg/vect/ and
gfortran.dg/vect/ testsuites.

[Bug bootstrap/106472] No rule to make target '../libbacktrace/libbacktrace.la', needed by 'libgo.la'.

2024-03-13 Thread dilyan.palauzov at aegee dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106472

--- Comment #36 from Дилян Палаузов  ---
> maybe this ought to be a `depend=` entry in Makefile.def instead?

My understanding is that depend= only has effect for bootstrapped targets, and
there is no boot_language=yes in gcc/go/config-lang.in.
