date:20230526

[Bug libobjc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Andrew Pinski  ---
PowerPC64el looks fine
https://gcc.gnu.org/pipermail/gcc-testresults/2023-May/785580.html




powerpc64-unknown-linux-gnu looks fine too.
https://gcc.gnu.org/pipermail/gcc-testresults/2023-May/785574.html

So closing as fixed.

Yes it was failing before:
https://gcc.gnu.org/pipermail/gcc-testresults/2023-May/784854.html

[Bug libstdc++/109965] rename 'Modules' to 'Categories' in tree-view of doxygen-generated libstdc++ documentation

2023-05-26 Thread saifi.khan at nishan dot io via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109965

--- Comment #5 from Saifi Khan  ---
raised the issue with doxygen project folks.

https://github.com/doxygen/doxygen/issues/10093

According to their response, there is no direct solution or workaround.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #28 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #26)
> (In reply to Andrew Pinski from comment #25)
> > Created attachment 55175 [details]
> > Patch which fixes `signed < 0`
> > 
> > This patch improves comment #20 .
> 
> Note this patch does not work for the case of normalizep == -1 but I have a
> fix for that.

Also note this patch improves (a lot):
```
unsigned char f(long long t)
{
return t < 0;
}
```
Down to:
```
f:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
mov r24,r25
rol r24
clr r24
rol r24
/* epilogue start */
ret
```
Rather than what it was in GCC 13:
```
f:
push r16
ldi r16,lo8(63)
rcall __lshrdi3
mov r24,r18
pop r16
ret
```
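
To see why a byte-wise sequence is enough here, note that `t < 0` depends only on the sign bit, which sits in the most significant byte. A minimal sketch in C, assuming a two's-complement `long long` without padding bits and 8-bit bytes; the function names are made up for illustration and are not part of the patch:

```c
#include <limits.h>

/* The comparison as written in the test case above. */
unsigned char is_negative_cmp(long long t)
{
    return t < 0;
}

/* Same result computed from the most significant byte only:
   take the top 8 bits, then the top bit of those. */
unsigned char is_negative_top_byte(long long t)
{
    unsigned long long u = (unsigned long long)t;
    unsigned char top = (unsigned char)(u >> (sizeof u * CHAR_BIT - 8));
    return (unsigned char)(top >> 7);
}
```

Both functions should agree for every input, which is why only the register holding the top byte needs to be inspected on an 8-bit target.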

[Bug tree-optimization/109901] Optimization opportunity: ((((a) > (b)) - ((a) < (b))) < 0) -> ((a) < (b))

2023-05-26 Thread richard.yao at alumni dot stonybrook.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109901

--- Comment #8 from Richard Yao  ---
Created attachment 55177
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55177&action=edit
Source code for micro-benchmark.

Here is an example of how not having this optimization slows us down:

https://gcc.godbolt.org/z/nxdrb17G7

custom_binary_search_slow() and custom_binary_search_fast() are the same
function. The only difference is that I manually applied the ((((a) > (b)) -
((a) < (b))) <= 0) -> ((a) <= (b)) transformation to see what GCC would
generate if it were able to do this transformation.

This optimization alone makes binary search ~78% faster on my Ryzen 7 5800X:

Benchmark: array size: 1024, runs: 1000, repetitions: 1, seed: 1685158101,
density: 10

Even distribution with 1024 32 bit integers, random access

| Name                      | Items | Hits | Misses | Time     |
| ------------------------- | ----- | ---- | ------ | -------- |
| custom_binary_search_slow | 1024  | 983  | 9017   | 0.000313 |
| custom_binary_search_fast | 1024  | 983  | 9017   | 0.000176 |

I modified the microbenchmark from scandum/binary_search to better suit a
workload that I am micro-optimizing:

https://github.com/scandum/binary_search

Specifically, I wanted to test on small arrays, but avoid cache effects
contaminating the results. One could easily add the search functions from my
modified version into the original to get numbers for bigger array sizes.

I have attached the source code for the modified micro-benchmark. The above run
was done after compiling with -O2 -fno-inline. Compiling with just -O2 does not
make much difference, since I deleted the code where -fno-inline makes a
difference from that file since it was not relevant to this issue.
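
The equivalence behind the requested transformation can be checked directly. The sketch below is only an illustration (it is not taken from the attached benchmark, and `CMP3` and the function names are made up): it pairs the three-way comparison idiom with the plain comparisons it should fold to.

```c
/* Three-way comparison idiom: yields -1, 0 or 1. */
#define CMP3(a, b) (((a) > (b)) - ((a) < (b)))

/* Forms as they appear after macro expansion in user code. */
int less_via_cmp3(int a, int b)    { return CMP3(a, b) < 0; }
int less_eq_via_cmp3(int a, int b) { return CMP3(a, b) <= 0; }

/* Forms the optimizer could reduce them to. */
int less_direct(int a, int b)      { return a < b; }
int less_eq_direct(int a, int b)   { return a <= b; }
```

Each `*_via_cmp3` function returns the same value as its `*_direct` counterpart for all inputs, so the fold is purely a code-quality question.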

[Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

--- Comment #4 from Andrew Pinski  ---
It looks like a register allocation issue, or something changed in
expanding to RTL. Maybe it was just OK by accident before GCC 13.

[Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

--- Comment #3 from Andrew Pinski  ---
Created attachment 55176
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55176&action=edit
testcase

Next time please also attach the source (or the preprocessed source, if it
uses headers).

[Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64-linux-gnu

--- Comment #2 from Andrew Pinski  ---
Note mov are sometimes free and only take up decode space.

[Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
  Component|tree-optimization   |target

--- Comment #1 from Andrew Pinski  ---
The tree level is the same between 11.x, 12.x, 13.x and the trunk.

[Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search

2023-05-26 Thread richard.yao at alumni dot stonybrook.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Bug ID: 110001
   Summary: [13 regression] Suboptimal code generation for
branchless binary search
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: richard.yao at alumni dot stonybrook.edu
  Target Milestone: ---

GCC 12.3 generated beautiful code for this, with all but the last of the
unrolled loop iterations using only 3 instructions:

https://gcc.godbolt.org/z/eGbEj9YKd

Currently, GCC generates 4 instructions:

https://gcc.godbolt.org/z/Ebczq8jjx

This probably does not make a huge difference given the data hazard, but there
is something awe-inspiring from seeing GCC generate only 3 instructions per
unrolled loop iteration for binary search. It would be nice if future versions
went back to generating three instructions.

This function was inspired by this D code:

https://godbolt.org/z/5En7xajzc

The bsearch1000() function is entirely branchless and has no more than 2
instructions for every cmov, excluding ret. I wrote a more general version in C
that can handle variable array sizes, and to my pleasant surprise, GCC 12.3
generated a similar 3 instruction sequence for all but the last of the unrolled
loop iterations. I was saddened when I saw the output from GCC 13.1 and trunk.

Anyway, all recent versions of GCC that I cared to check generate a branch for
the last unrolled iteration on line 58. That branch is unpredictable, so GCC
would generate better code here if it used predication to avoid the branch. I
had been able to give GCC a hint to avoid a similar branch at the end using
__builtin_expect_with_probability(), but that trick did not work for line 58.

Also, if anyone is interested in where that D code originated, here is the
source:

https://muscar.eu/shar-binary-search-meta.html
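
For readers who do not want to follow the links, here is a minimal sketch of what a branchless binary search can look like in C. It is not the reporter's function from the godbolt links; `branchless_lower_bound` is a made-up name, and whether the compiler actually emits a conditional move for the select is not guaranteed.

```c
#include <stddef.h>

/* Returns the index of the first element that is >= key in the sorted
   array a[0..n), or n if there is no such element. */
size_t branchless_lower_bound(const int *a, size_t n, int key)
{
    if (n == 0)
        return 0;
    const int *base = a;
    size_t len = n;
    while (len > 1) {
        size_t half = len / 2;
        /* Written as a select rather than an if, in the hope that the
           compiler uses a conditional move instead of a branch. */
        base = (base[half] < key) ? base + half : base;
        len -= half;
    }
    return (size_t)(base - a) + (*base < key);
}
```

The loop trip count depends only on `n`, not on the data, which is what makes full unrolling into short compare/select sequences possible.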

[Bug sanitizer/109980] Bogus Wstringop-overflow and Wstringop-overread warnings when attribute `access` is applied to struct arg

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109980

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-27

--- Comment #1 from Andrew Pinski  ---
Confirmed.

Note the -O2 difference just comes from inlining.
So you could get away with just this for getting the warning:
```
typedef struct{
int value, decoy;
} S;

[[gnu::access(read_write, 1)]]
int S_rw(S *self);

[[gnu::access(read_only, 1)]]
int S_ro(const S *self);

int S_test(S *tmps){
return tmps[1].value && S_rw(tmps + 1) && S_ro(tmps + 1);
}
```

[Bug c++/109981] ICE encountered while generating header units in the given sequence in a script

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109981

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 99241.

*** This bug has been marked as a duplicate of bug 99241 ***

[Bug c++/99241] [modules] ICE in install_entity, at cp/module.cc:7584

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99241

Andrew Pinski  changed:

   What|Removed |Added

 CC||saifi.khan at nishan dot io

--- Comment #8 from Andrew Pinski  ---
*** Bug 109981 has been marked as a duplicate of this bug. ***

[Bug c++/103524] [meta-bug] modules issue

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 109981, which changed state.

Bug 109981 Summary: ICE encountered while generating header units in the given 
sequence in a script
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109981

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug preprocessor/109988] -iwithprefix doesn't add folder to end of search list

2023-05-26 Thread ivan.lazaric.gcc at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109988

--- Comment #3 from Ivan Lazaric  ---
Note that clang has the same flags and behaves according to the documentation,
so there might be some value in matching it.

If it's considered too breaking of a change, I would recommend introducing a
-iwithprefixafter flag that would add the directory to the end of the include
search list.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #27 from Andrew Pinski  ---
I should note the middle-end could also improve here:
  /* If we are comparing a double-word integer with zero or -1, we can
 convert the comparison into one involving a single word.  */
  if (is_int_mode (mode, &int_mode)
  && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2
  && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))

In the case of SImode, GET_MODE_BITSIZE is 32 while BITS_PER_WORD is just 8. We
could use a loop (to generate the and/ior) if GET_MODE_BITSIZE (int_mode) is a
(non-1) multiple of BITS_PER_WORD instead of the 1 expand_binop.

But that is left for another person to do. That would also improve the comment
#20 case too.

AVR might be the only target which fully supports a mode size that is 4 times
BITS_PER_WORD.
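
As a rough illustration of the loop idea, written at the C level rather than as RTL expansion code: a multi-word value equals zero exactly when the inclusive OR of its words is zero, and equals -1 exactly when the AND of its words is all ones. The sketch assumes an 8-bit word and a 32-bit value, as on AVR with SImode; the function names are hypothetical.

```c
#include <stdint.h>

/* x == 0 reduced to a single-word compare: OR all four bytes together. */
int is_zero_by_words(uint32_t x)
{
    uint8_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc |= (uint8_t)(x >> (8 * i));
    return acc == 0;
}

/* x == -1 (all bits set) reduced the same way, using AND. */
int is_all_ones_by_words(uint32_t x)
{
    uint8_t acc = 0xff;
    for (int i = 0; i < 4; i++)
        acc &= (uint8_t)(x >> (8 * i));
    return acc == 0xff;
}
```

With exactly two words the loop degenerates to a single operation, which corresponds to the existing double-word case quoted above.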

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #26 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #25)
> Created attachment 55175 [details]
> Patch which fixes `signed < 0`
> 
> This patch improves comment #20 .

Note this patch does not work for the case of normalizep == -1 but I have a fix
for that.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #25 from Andrew Pinski  ---
Created attachment 55175
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55175&action=edit
Patch which fixes `signed < 0`

This patch improves comment #20 .

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #24 from Andrew Pinski  ---
(In reply to Georg-Johann Lay from comment #23)
> Thank you so much for looking into this.
> 
> For the test case from comment #21 though, the problem is somewhere in tree
> optimizations.
> 
> > unsigned char lfsr32_mpp_ge0 (unsigned long number)
> > {
> >   unsigned char b = 0;
> >   if (number >= 0) b--;
> >   if (number & (1UL << 29)) b++;
> >   if (number & (1UL << 13)) b++;
> > 
> >   return b;
> > }
> 
> The -fdump-tree-optimized dump reads:
> 
> ;; Function lfsr32_mpp_ge0 (lfsr32_mpp_ge0, funcdef_no=0, decl_uid=1880,
> cgraph_uid=1, symbol_order=0)
> 
> unsigned char lfsr32_mpp_ge0 (long unsigned int number)
> {
>   unsigned char b;
>   long unsigned int _1;
>   long unsigned int _2;
>   _Bool _3;
>   unsigned char _8;
>   _Bool _9;
>   unsigned char _10;
>   unsigned char _11;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = number_5(D) & 536870912;
>   _2 = number_5(D) & 8192;
>   if (_2 != 0)
>     goto <bb 4>; [50.00%]
>   else
>     goto <bb 3>; [50.00%]
> 
>   <bb 3> [local count: 536870912]:
>   _9 = _1 == 0;
>   _10 = (unsigned char) _9;
>   _11 = -_10;
>   goto <bb 5>; [100.00%]
> 
>   <bb 4> [local count: 536870913]:
>   _3 = _1 != 0;
>   _8 = (unsigned char) _3;
> 
>   <bb 5> [local count: 1073741824]:
>   # b_4 = PHI <_11(3), _8(4)>
>   return b_4;
> }

Oh yes, this is the case that my
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619068.html patch actually
solves.
It should be able to detect that the nonzero bits of _1 have just one bit set
and expand using single_bit_test for _3.  

For an example we get:
;; Generating RTL for gimple basic block 3

;; _11 = -_10;

(insn 10 9 11 (set (reg:QI 51)
(zero_extract:QI (subreg:QI (reg:SI 43 [ _1 ]) 3)
(const_int 1 [0x1])
(const_int 5 [0x5]))) "t21.c":5:6 -1
 (nil))

(insn 11 10 12 (set (reg:QI 53)
(const_int 1 [0x1])) "t21.c":5:6 -1
 (nil))

(insn 12 11 13 (set (reg:QI 52)
(xor:QI (reg:QI 51)
(reg:QI 53))) "t21.c":5:6 -1
 (nil))

(insn 13 12 0 (set (reg/v:QI 48 [  ])
(neg:QI (reg:QI 52))) "t21.c":5:6 -1
 (nil))


Which is exactly what you want right?

Overall we get:
```
lfsr32_mpp_ge0:
push r16
push r17
/* prologue: function */
/* frame size = 0 */
/* stack size = 2 */
.L__stack_usage = 2
mov r16,r22
mov r17,r23
mov r18,r24
mov r19,r25
mov r27,r19
mov r26,r18
mov r25,r17
mov r24,r16
clr r24
clr r25
clr r26
andi r27,32
bst r27,5
clr r24
bld r24,0
sbrs r17,5
subi r24,lo8(-(-1))
.L1:
/* epilogue start */
pop r17
pop r16
ret
```

Which is much better than it was before (still could be improved more though
but that is for a different time).

To finish this patch up, I am supposed to do some cost modelling and such which
I might get to this weekend.


Even for aarch64 we do slightly better:
```
lfsr32_mpp_ge0:
and x1, x0, 536870912
tbnz x0, 13, .L2
cmp x1, 0
csetm   w0, eq
and w0, w0, 255
ret
.L2:
cmp x1, 0
cset w0, ne
ret
```


```
lfsr32_mpp_ge0:
and x1, x0, 536870912
tbnz x0, 13, .L2
ubfx x0, x1, 29, 1
eor w0, w0, 1
neg w0, w0
and w0, w0, 255
ret
.L2:
ubfx x0, x1, 29, 1
and w0, w0, 255
ret
```

Though if there would be a way to remove the first and's, that would be best
(and the last and, the last one is still there because of fuzziness with
combine trying to do ne there still).

Note even though it does increase in size, the cost of cset on some processors
is 2 cycles rather than 1.
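
The shape of the quoted -fdump-tree-optimized output can be mirrored at the source level. This is only a sketch for following the discussion: `uint32_t` stands in for AVR's 32-bit `unsigned long`, and both function names are made up. The second function uses single-bit extraction in place of the masked comparisons `_1 != 0` and `_1 == 0`.

```c
#include <stdint.h>

/* The test case, with the always-true `number >= 0` folded to b--. */
unsigned char lfsr32_source(uint32_t number)
{
    unsigned char b = 0;
    b--;
    if (number & (UINT32_C(1) << 29)) b++;
    if (number & (UINT32_C(1) << 13)) b++;
    return b;
}

/* Same function in the shape of the optimized dump, with each masked
   comparison written as a single-bit test. */
unsigned char lfsr32_bit_tests(uint32_t number)
{
    unsigned char bit29 = (number >> 29) & 1;
    unsigned char bit13 = (number >> 13) & 1;
    if (bit13)
        return bit29;                   /* _8  = (unsigned char) (_1 != 0) */
    return (unsigned char)-(bit29 ^ 1); /* _11 = -(unsigned char) (_1 == 0) */
}
```

The xor-with-1 followed by negation in the second return is the same pair of operations as insns 12 and 13 in the RTL shown above.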

[Bug tree-optimization/109985] __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

--- Comment #4 from Andrew Pinski  ---
Hmm:
modref analyzing 'void boost::unordered::detail::foa::prefetch(const
void*)/3452' (ipa=0) (pure)
Analyzing flags of ssa name: p_1(D)
  Analyzing stmt: __builtin_prefetch (p_1(D));
  current flags of p_1(D) no_direct_clobber no_indirect_clobber
no_direct_escape no_indirect_escape not_returned_directly
not_returned_indirectly no_direct_read no_indirect_read
flags of ssa name p_1(D) no_direct_clobber no_indirect_clobber no_direct_escape
no_indirect_escape not_returned_directly not_returned_indirectly no_direct_read
no_indirect_read
Always executed bbbs (assuming return or EH): 2
 - Analyzing call:__builtin_prefetch (p_1(D));
 - ECF_CONST | ECF_NOVOPS, ignoring all stores and all loads except for args.
Function found to be const: void boost::unordered::detail::foa::prefetch(const
void*)/3452
Declaration updated to be const: void
boost::unordered::detail::foa::prefetch(const void*)/3452
 - modref done with result: tracked.
  loads:
  stores:
  Try dse
  parm 0 flags: not_returned_directly not_returned_indirectly no_direct_read
no_indirect_read
void boost::unordered::detail::foa::prefetch (const void * p)
{
  <bb 2> [local count: 1073741824]:
  __builtin_prefetch (p_1(D));
  return;

}


Maybe that explains it, 


DEF_GCC_BUILTIN(BUILT_IN_PREFETCH, "prefetch",
BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
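
For context, the pattern in the dump is a thin wrapper whose only effect is the prefetch hint. A minimal C sketch of that pattern follows; it is not the Boost code, and `sum_with_prefetch` and `PREFETCH_DISTANCE` are made-up names with an arbitrary distance.

```c
#include <stddef.h>

/* Thin wrapper around the builtin, similar in shape to the function
   analyzed in the dump. It has no effect visible to the program. */
static inline void prefetch(const void *p)
{
    __builtin_prefetch(p);
}

/* Arbitrary illustrative look-ahead, in elements. */
enum { PREFETCH_DISTANCE = 8 };

/* Sum an array while hinting at data a few elements ahead. */
long sum_with_prefetch(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            prefetch(&a[i + PREFETCH_DISTANCE]);
        sum += a[i];
    }
    return sum;
}
```

Because the wrapper returns nothing and touches no memory as far as the analysis is concerned, a call to it looks removable once it is marked const, which would match the behaviour reported here.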

[Bug middle-end/109996] csmith: -O2 -fno-strict-aliasing causing run time trouble

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109996

--- Comment #1 from Andrew Pinski  ---
There could be some alignment issues here ...

[Bug libstdc++/105562] [12 Regression] std::function::_M_invoker may be used uninitialized in std::regex move with -fno-strict-aliasing

2023-05-26 Thread urisimchoni at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105562

Uri Simchoni  changed:

   What|Removed |Added

 CC||urisimchoni at gmail dot com

--- Comment #21 from Uri Simchoni  ---
(In reply to Richard Biener from comment #18)
> (In reply to Sven Hesse from comment #17)
> > I still get this with gcc 12.2.0 (Gentoo 12.2.0 p9), but only when compiling
> > with (at least with) -O1 -fsanitize=address, in addition to any warning flag
> > that enables -Wmaybe-uninitialized (like -Wall, -Wextra or -Wuninitialized).
> > 
> > -O0 and/or no ASan, and the offending code compiles cleanly without any
> > warnings. Somehow, the combination of enabling ASan and optimization
> > (anything > -O0, but not -Os) triggers it again, it seems?
> > 
> > I can observe this with the testcase attached here in this bug report.
> 
> -fsanitize=address is likely to derail optimization enough to make such
> occurrences more likely, I think we have plenty of duplicate bugreports
> for this.

So it seems this is still happening with -O1 -fsanitize=address (occurring for
me too with GCC 13.1.0), yet this specific bug is marked as "fixed" and there's
a mention of duplicate bugreports (I can see one unconfirmed pointed-to by this
issue). Is opening of another bug, focusing on -O1 -fsanitize=address, going to
help get this fixed?

[Bug c++/109997] __is_assignable(int, IncompleteType) should be rejected

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109997

--- Comment #4 from Jonathan Wakely  ---
Looks pretty similar, although I don't think we even had __is_assignable when
that was filed.

[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()

2023-05-26 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948

--- Comment #13 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #12)
> +   && e->symtree->n.sym->assoc->target->ref
> +   && e->symtree->n.sym->assoc->target->ref->u.ar.type == AR_FULL
> +   && e->symtree->n.sym->assoc->target->ref->u.ar.as)
> + {
> +   e->rank = e->symtree->n.sym->assoc->target->ref->u.ar.as->rank;
> +   goto done;
> + }
> +

Maybe we just need to follow the refs and join the code with the later part.

[Bug c++/109997] __is_assignable(int, IncompleteType) should be rejected

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109997

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> Isn't this a dup of bug 92067?

Sorry I mean is_constructible is recorded as PR 92067. I was reading some other
bug headline and getting confused.

[Bug c++/109997] __is_assignable(int, IncompleteType) should be rejected

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109997

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92067

--- Comment #2 from Andrew Pinski  ---
Isn't this a dup of bug 92067?

[Bug c/109970] -Wstringop-overflow should work with parameter forward declarations

2023-05-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109970

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Martin Uecker :

https://gcc.gnu.org/g:8d6bd830f5f9c939e8565c0341a0c6c588834484

commit r14-1304-g8d6bd830f5f9c939e8565c0341a0c6c588834484
Author: Martin Uecker 
Date:   Fri May 26 11:19:01 2023 +0200

c: -Wstringop-overflow for parameters with forward-declared sizes

Warnings from -Wstringop-overflow do not appear for parameters declared
as VLAs when the bound refers to a parameter forward declaration. This
is fixed by splitting the loop that passes through parameters into two,
first only recording the positions of all possible size expressions
and then processing the parameters.

PR c/109970

gcc/c-family:

* c-attribs.cc (build_attr_access_from_parms): Split loop to first
record all parameters.

gcc/testsuite:

* gcc.dg/pr109970.c: New test.
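
For readers unfamiliar with the construct: a parameter forward declaration is a GNU C extension that lets a VLA bound refer to a parameter that appears later in the list. A minimal sketch follows; it is not the test added by the commit, and `fill` is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/* `int n;` before the first real parameter forward-declares n, so the
   bound of buf can name it even though n comes second. GNU C only. */
void fill(int n; char buf[n], int n)
{
    memset(buf, 'x', (size_t)n);
}
```

A call that passes a buffer smaller than `n`, such as `char small[2]; fill(small, 8);`, is the kind of mismatch -Wstringop-overflow is meant to catch for VLA parameters.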

[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()

2023-05-26 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948

--- Comment #12 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #11)
> I think it does not handle the following variation of the testcase from
> the blamed patch:

This one seems to be handled by the clumsy attempt:

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 75d61a18856..a5dcf07c1ee 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -5622,6 +5625,21 @@ gfc_expression_rank (gfc_expr *e)
 {
   if (e->expr_type == EXPR_ARRAY)
goto done;
+
+  /* Take rank from associate target.  */
+  if (e->symtree
+ && e->symtree->n.sym->as == NULL
+ && e->symtree->n.sym->assoc
+ && e->symtree->n.sym->assoc->target
+ && e->symtree->n.sym->assoc->rankguessed
+ && e->symtree->n.sym->assoc->target->ref
+ && e->symtree->n.sym->assoc->target->ref->u.ar.type == AR_FULL
+ && e->symtree->n.sym->assoc->target->ref->u.ar.as)
+   {
+ e->rank = e->symtree->n.sym->assoc->target->ref->u.ar.as->rank;
+ goto done;
+   }
+
   /* Constructors can have a rank different from one via RESHAPE().  */

   e->rank = ((e->symtree == NULL || e->symtree->n.sym->as == NULL)
@@ -5640,7 +5658,7 @@ gfc_expression_rank (gfc_expr *e)
   if (ref->type != REF_ARRAY)
continue;

-  if (ref->u.ar.type == AR_FULL)
+  if (ref->u.ar.type == AR_FULL && ref->u.ar.as)
{
  rank = ref->u.ar.as->rank;
  break;


Of course this does not address the point brought up by Mikael.

[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()

2023-05-26 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948

--- Comment #11 from anlauf at gcc dot gnu.org ---
(In reply to Paul Thomas from comment #9)
> By the way, the patch regtests OK
> 
> Do you want to do the honours or shall I?
> 
> I think that this rates as an 'obvious' fix.

I think it does not handle the following variation of the testcase from
the blamed patch:


module mm
  implicit none
  interface operator(==)
 module procedure eq_1_2
  end interface operator(==)
  private :: eq_1_2
contains
  logical function eq_1_2 (x, y)
integer, intent(in) :: x(:)
real,intent(in) :: y(:,:)
eq_1_2 = .true.
  end function eq_1_2
end module mm

subroutine foo(k_2d)
  use mm
  implicit none
  integer :: k_2d(:)
  integer :: m(1) = 42
  real:: r(1,1) = 3.0
  print *, (m == r)
  associate (k=>k_2d)
print *, (k == r)   ! <-- fails
  end associate
  associate (k=>k_2d(:))
print *, (k == r)
  end associate
end subroutine foo


For the marked line, I see in the debugger that e->ref == NULL.
I've played with some modification of the related code block, but that
regressed on two of the associate testcases.

[Bug c++/110000] GCC should implement exclude_from_explicit_instantiation

2023-05-26 Thread nikolasklauser at berlin dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110000

--- Comment #4 from Nikolas Klauser  ---
(In reply to Andrew Pinski from comment #3)
> I am getting a feeling this attribute is well defined enough.
> 
> Is it really just supposed to block explicit instantiation of templates?
> Is there a decent set of testcases that can be used to match up the
> implementations here? Because I suspect without those it will be implemented
> slightly different.

The attribute was originally implemented in https://reviews.llvm.org/D51789.
There are also a few test cases, but I don't know if they are enough. If there
are any cases which aren't clear I'm happy to work them out (and maybe update
the implementation in clang if necessary).
Yes, it is only to block explicit instantiation of member functions (and to
tell the compiler that they have to be instantiated implicitly, obviously).

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #11 from H.J. Lu  ---
(In reply to H.J. Lu from comment #9)
> [hjl@gnu-cfl-3 pr109982]$ cat x.c 
> struct S0 {
>long long int f0;
> } __attribute__((aligned(128)));
> 
> int padding = 1;
> static struct S0 g_2415 __attribute__((aligned(4))) = {-1L};
> static struct S0 *g_2500 __attribute__((visibility("internal"), used)) =
> &g_2415;
> 

I think the code is invalid since g_2500 doesn't point to properly aligned
data.
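
To make the mismatch concrete: the type asks for 128-byte alignment, while the object it points to was declared with `aligned(4)`. A small sketch for checking a pointer against a type's alignment requirement follows; `is_aligned` is a made-up helper, and the struct is copied from the test case.

```c
#include <stddef.h>
#include <stdint.h>

/* Same type as in the test case: the type demands 128-byte alignment. */
struct S0 {
    long long int f0;
} __attribute__((aligned(128)));

/* Nonzero if the address p is a multiple of align (align must be nonzero). */
int is_aligned(const void *p, size_t align)
{
    return (uintptr_t)p % align == 0;
}
```

A check such as `is_aligned(g_2500, _Alignof(struct S0))` would be expected to fail whenever the under-aligned object does not happen to land on a 128-byte boundary, and a load that assumes the type's alignment can then fault.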

[Bug c++/110000] GCC should implement exclude_from_explicit_instantiation

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110000

--- Comment #3 from Andrew Pinski  ---
I am getting a feeling this attribute is well defined enough.

Is it really just supposed to block explicit instantiation of templates?
Is there a decent set of testcases that can be used to match up the
implementations here? Because I suspect without those it will be implemented
slightly different.

[Bug c++/110000] GCC should implement exclude_from_explicit_instantiation

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110000

--- Comment #2 from Andrew Pinski  ---
I am trying to understand the exact details here?
https://releases.llvm.org/9.0.0/tools/clang/docs/AttributeReference.html#exclude-from-explicit-instantiation

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

--- Comment #10 from H.J. Lu  ---
(In reply to H.J. Lu from comment #9)
> [hjl@gnu-cfl-3 pr109982]$ cat x.c 
> struct S0 {
>long long int f0;
> } __attribute__((aligned(128)));
> 
> int padding = 1;
> static struct S0 g_2415 __attribute__((aligned(4))) = {-1L};
> static struct S0 *g_2500 __attribute__((visibility("internal"), used)) =
> &g_2415;
> 

RTL expand has

(insn 7 6 8 (set (reg/f:DI 83) 
(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
(const_int -8 [0xfffffffffffffff8])) [4 .result_ptr+0 S8 A64]))
"x.c":11:10 -1
 (nil))

(insn 8 7 9 (set (reg:OI 84) 
(mem:OI (reg/f:DI 82 [ g_2500.0_1 ]) [2 *g_2500.0_1+0 S32 A1024]))
"x.c":11:10 -1
 (nil))

Alignment is wrong.

[Bug c++/110000] GCC should implement exclude_from_explicit_instantiation

2023-05-26 Thread ldionne.2 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110000

Louis Dionne  changed:

   What|Removed |Added

 CC||ldionne.2 at gmail dot com

--- Comment #1 from Louis Dionne  ---
I implemented the attribute in Clang and it was pretty easy for me even as a
novice compiler person, so I expect it wouldn't be too hard to implement in GCC
either.

Removing always_inline does not only lead to better compile times, but also to
better code generation and obviously a better debugging experience. We
investigated various other alternatives that wouldn't require using the
attribute but we concluded that we really had to if we wanted to keep a tight
grip on our ABI surface while still allowing users to explicitly instantiate
stdlib classes (which they are allowed to as long as they provide at least one
user-defined type).

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||hjl.tools at gmail dot com
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-26

--- Comment #9 from H.J. Lu  ---
[hjl@gnu-cfl-3 pr109982]$ cat x.c 
struct S0 {
   long long int f0;
} __attribute__((aligned(128)));

int padding = 1;
static struct S0 g_2415 __attribute__((aligned(4))) = {-1L};
static struct S0 *g_2500 __attribute__((visibility("internal"), used)) =
&g_2415;

const struct S0 func_21 ()
{
  return *g_2500;
}

int
main ()
{
  func_21 ();
  return 0;
}
[hjl@gnu-cfl-3 pr109982]$ make
gcc -mtune=haswell -mavx -g -w   -c -o x.o x.c
gcc -mtune=haswell -mavx -g -w -o x x.o
./x
make: *** [Makefile:16: all] Segmentation fault (core dumped)
[hjl@gnu-cfl-3 pr109982]$

[Bug c++/110000] New: GCC should implement exclude_from_explicit_instantiation

2023-05-26 Thread nikolasklauser at berlin dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110000

Bug ID: 110000
   Summary: GCC should implement
exclude_from_explicit_instantiation
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nikolasklauser at berlin dot de
  Target Milestone: ---

`exclude_from_explicit_instantiation` is an attribute implemented by clang. It
tells the compiler that a function should not be part of an explicit
instantiation. This allows libraries to have greater control over which
functions are part of their ABI and which aren't. It is used extensively in
libc++ to keep the ABI surface as small as possible. Currently, libc++ uses
always_inline if exclude_from_explicit_instantiation isn't available, resulting
in almost every function in the library being declared as always_inline.
Replacing always_inline with exclude_from_explicit_instantiation would
approximately halve the time it takes to run the libc++ test suite with GCC.
(removing always_inline brings the time down to about the same as clang takes)

[Bug c++/109991] stack-use-after-scope

2023-05-26 Thread igkper at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

--- Comment #5 from igk  ---
OK, becoming clearer, thanks. I'm just hoping for this to be diagnosed in some
way. IIUC basically GCC doesn't diagnose the UB so it proceeds with constexpr
eval just because it can, or so it thinks, and in the process makes it
impossible for sanitizer to catch anything. Assuming that gets fixed some day,
then GCC might as well diagnose the issue itself and hence no need for
sanitizer to do anything.

[Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983

--- Comment #8 from Andrew Pinski  ---
(In reply to Sergei Trofimovich from comment #7)
> Original packet-rnsap.c.i.xz takes 27 minutes to compile for me.
> 
> The hack below cuts this time down to 9 minutes (slashes 60% of runtime). 

Or maybe it should be moved over to use sbitmap rather than bitmap ...

[Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta

2023-05-26 Thread slyfox at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983

--- Comment #7 from Sergei Trofimovich  ---
Original packet-rnsap.c.i.xz takes 27 minutes to compile for me.

The hack below cuts this time down to 9 minutes (slashes 60% of runtime). 

A considerable amount of time is spent looking up the bitmaps for graph edges
to extract and solve PT facts.

I'd say there is room for micro-optimization to turn bitmap into something
slightly smarter than a linked list. It will not improve the runtime too much.

Another option could be to put a limit on edge count (say, controlled by a
`param`) which `gcc` could use to fall back on a conservative value.

--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -283,7 +283,7 @@ typedef unsigned long BITMAP_WORD;
 /* Number of words to use for each element in the linked list.  */

 #ifndef BITMAP_ELEMENT_WORDS
-#define BITMAP_ELEMENT_WORDS ((128 + BITMAP_WORD_BITS - 1) / BITMAP_WORD_BITS)
+#define BITMAP_ELEMENT_WORDS ((8192 + BITMAP_WORD_BITS - 1) / BITMAP_WORD_BITS)
 #endif

 /* Number of bits in each actual element of a bitmap.  */
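
To make the effect of the hack above more concrete, here is a small, simplified model of a sparse bitmap kept as a sorted linked list of fixed-size elements. It is only a sketch under stated assumptions: `SparseBitmap`, `Element` and the `steps` counter are hypothetical names, and this is not GCC's actual `bitmap.h` implementation. It shows why a larger element size means fewer list nodes to walk for the same set of bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical model: each list node covers ElementBits consecutive bits.
template <std::size_t ElementBits>
class SparseBitmap {
  static constexpr std::size_t kWordBits = 64;
  static constexpr std::size_t kWords =
      (ElementBits + kWordBits - 1) / kWordBits;

  struct Element {
    std::size_t index;            // which ElementBits-sized chunk this is
    std::uint64_t words[kWords];  // the bits of that chunk
  };

  std::list<Element> elements_;   // kept sorted by index

 public:
  std::size_t steps = 0;  // list nodes visited so far (illustration only)

  void set(std::size_t bit) {
    const std::size_t idx = bit / ElementBits;
    auto it = elements_.begin();
    // Linear walk: this is the cost that grows with the number of elements.
    while (it != elements_.end() && it->index < idx) { ++it; ++steps; }
    if (it == elements_.end() || it->index != idx)
      it = elements_.insert(it, Element{idx, {}});
    const std::size_t off = bit % ElementBits;
    it->words[off / kWordBits] |= std::uint64_t{1} << (off % kWordBits);
  }

  bool test(std::size_t bit) {
    const std::size_t idx = bit / ElementBits;
    for (const Element& e : elements_) {
      ++steps;
      if (e.index == idx) {
        const std::size_t off = bit % ElementBits;
        return (e.words[off / kWordBits] >> (off % kWordBits)) & 1u;
      }
      if (e.index > idx) break;  // sorted list: the chunk is absent
    }
    return false;
  }

  std::size_t element_count() const { return elements_.size(); }
};
```

Setting every 128th bit up to 8192 creates 64 nodes in `SparseBitmap<128>` but a single node in `SparseBitmap<8192>`, at the price of much larger nodes for sparse sets.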

[Bug c++/109991] stack-use-after-scope

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

--- Comment #4 from Andrew Pinski  ---
(In reply to igk from comment #3)
> (In reply to Andrew Pinski from comment #2)
> > Dup of bug 98675.
> > 
> > *** This bug has been marked as a duplicate of bug 98675 ***
> 
> Thanks for looking into this. I haven't quite understood though. 

Let me reword what is going on and why it still is a dup. So the
constexpr should be ignored because it is undefined code. But since GCC does
not detect the undefinedness yet (this is what PR 98675 is about), GCC decides
that it is still a constexpr and evaluates it at compile time and removes the
ability for the sanitizer to detect the undefinedness at runtime.

[Bug c++/109991] stack-use-after-scope

2023-05-26 Thread igkper at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

--- Comment #3 from igk  ---
(In reply to Andrew Pinski from comment #2)
> Dup of bug 98675.
> 
> *** This bug has been marked as a duplicate of bug 98675 ***

Thanks for looking into this. I haven't quite understood though. 

I'm trying to find where the C++14 standard (the version I have) says that it
should be rejected. The closest things I can find are the following. Are they
the relevant parts?

```
For a non-template, non-defaulted constexpr function or a non-template,
non-defaulted, non-inheriting constexpr constructor, if no argument values
exist such that an invocation of the function or constructor could be an
evaluated subexpression of a core constant expression (5.19), the program is
ill-formed; no diagnostic required.
```
where (5.19) includes
```
A conditional-expression e is a core constant expression unless the evaluation
of e, following the rules of the
abstract machine (1.9), would evaluate one of the following expressions:
...
- an operation that would have undefined behavior,..
```

In my example, the function takes no arguments so there are no argument values
"such that an invocation of the function or constructor could be an evaluated
sub-expression of a core constant expression". This would make my program
"ill-formed, no diagnostic required". I interpret this as saying the compiler
isn't required to reject the code. Perhaps I'm on the wrong track, but I'm
wondering, isn't such UB something sanitizer aims to catch?

Also (not an issue with the sanitizer), it seems odd to me that GCC would do
constexpr evaluation when "BadWrapUse c;" is not declared as a constexpr
variable, rather than avoiding it because it is not valid.

[Bug c/109999] [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109999

--- Comment #1 from Andrew Pinski  ---
: In function 'test_allocate_on_device':
:27:43: error: expected '#pragma omp' clause before 'uses_allocators'
   27 | #pragma omp target map(tofrom: errors, A) uses_allocators(omp_default_mem_alloc)
  |   ^~~


It is because uses_allocators is not implemented yet.

If you do this:
```
int test_allocate_on_device() {
#pragma omp target hhh
  for(int i = 0;i < 10;i++);
}
```
GCC will produce a similar error message.
If you replace hhh with simd, it will work.

I suspect the error message is correct in the sense that an omp clause there is
still valid too. It does not know whether it will be a target clause or a
normal clause.

[Bug c/109999] New: [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause

2023-05-26 Thread burnus at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109999

Bug ID: 109999
   Summary: [OpenMP] Bogus error message: talks about '"#pragma
omp" clause' instead of '"target" clause
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: diagnostic, openmp
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

I just spotted this with gcc and g++; IMHO the error message is
misleading/wrong/odd, however, it does not seem to be a regression.

Namely I get:

tests/5.0/allocate/test_allocate_on_device.c:27:43:
error: expected ‘#pragma omp’ clause before ‘uses_allocators’
   27 | #pragma omp target map(tofrom: errors, A) uses_allocators(omp_default_mem_alloc)
  |   ^~~


EXPECTED: instead of "expected '#pragma omp' clause"
it should show:      "expected 'target' clause".

Found when compiling:
g++ --free-line-length-none -fopenmp -I ompvv tests/5.0/allocate/test_allocate_on_device.c

which is part of https://github.com/SOLLVE/sollve_vv

[Bug fortran/109998] New: [OpenMP] TR12/5.0/5.1 - permit structure elements with '!$OMP ALLOCATORS' (and !$OMP ALLOCATE)

2023-05-26 Thread burnus at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109998

Bug ID: 109998
   Summary: [OpenMP] TR12/5.0/5.1 - permit structure elements with
'!$OMP ALLOCATORS' (and !$OMP ALLOCATE)
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---

Cf. r14-1301-gd64e8e1224708e7f5b87c531aeb26f1ed07f91ff and in particular
in openmp.cc the comment:

   Note that the executable ALLOCATE directive permits structure elements only
   in OpenMP 5.0 and 5.1 but not longer in 5.2.  See also the comment on the
   'omp allocators' directive below. The accidental change was reverted for
   OpenMP TR12, permitting them again. See also gfc_match_omp_allocators.

   Hence, structure elements are rejected for now, also to make resolving
   OMP_LIST_ALLOCATE simpler (check for duplicates, same symbol in
   Fortran allocate stmt).  TODO: Permit structure elements.


EXPECTED: What the TODO says.


For TR12 (OpenMP Spec Issue 3437), the description in the "allocators
directive" section was changed to state:

"The list items that appear in an *allocate* clause may include structure
elements."

(It does not talk about the *allocate* directive any more as TR11/TR12/6.0
removed deprecated features.)

[Bug c++/109997] __is_assignable(int, IncompleteType) should be rejected

2023-05-26 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109997

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org
   Last reconfirmed||2023-05-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Marek Polacek  ---
Same for std::is_constructible.  So presumably we want something like

--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2173,7 +2173,10 @@ constructible_expr (tree to, tree from)
 static tree
 is_xible_helper (enum tree_code code, tree to, tree from, bool trivial)
 {
-  to = complete_type (to);
+  to = complete_type_or_else (to, NULL_TREE);
+  from = complete_type_or_else (from, NULL_TREE);
+  if (!from || !to)
+return error_mark_node;
   deferring_access_check_sentinel acs (dk_no_deferred);
   if (VOID_TYPE_P (to) || ABSTRACT_CLASS_TYPE_P (to)
   || (from && FUNC_OR_METHOD_TYPE_P (from)

but I'd have to test std::is_constructible with a parameter pack as well.

[Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta

2023-05-26 Thread slyfox at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983

--- Comment #6 from Sergei Trofimovich  ---
Created attachment 55174
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55174&action=edit
packet-rnsap-shrunk-slightly.c.i.xz

packet-rnsap-shrunk-slightly.c.i.xz is a slightly shrunk version of the
original. 

It exhibits 10x slowdown and has a reasonable compilation completion time.
Might be useful to explore as is or bisect gcc:

$ gcc -O2 -c packet-rnsap-shrunk-slightly.c.i -o bug.o -fipa-pta -Wno-deprecated-declarations -fno-ipa-pta >/dev/null 2>&1

real 0m0,657s
user 0m0,626s
sys  0m0,026s

$ gcc -O2 -c packet-rnsap-shrunk-slightly.c.i -o bug.o -fipa-pta -Wno-deprecated-declarations -fipa-pta >/dev/null 2>&1

real 0m6,120s
user 0m6,065s
sys  0m0,045s

-ftime-report says 'ipa points-to' takes 88%.

-fdump-ipa-all-details creates a 2.0G bug.i.092i.pta2 file (the rest of the
files are under 5M).

I suspect it's a pathology in solving a huge `proto_reg_handoff_rnsap()` graph.
Some variables have up to 5000 PT entries.

[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template

2023-05-26 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #11 from Marek Polacek  ---
We never instantiated fnc because mark_used checks

  /* Check this too in case we're within instantiate_non_dependent_expr.  */
  if (DECL_TEMPLATE_INFO (decl)
  && uses_template_parms (DECL_TI_ARGS (decl)))
return true;

and here uses_template_parms says yes because value_dependent_expression_p says
'a' is value-dep.  Note we can't use in_template_function in v_d_e_p.

[Bug libstdc++/71579] type_traits miss checks for type completeness in some traits

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71579

--- Comment #25 from Jonathan Wakely  ---
Some missing completeness checks:

std::assignable
We don't enforce precondition that both types are complete types, cv void, or
arrays of unknown bound. Filed as PR c++/109997

std::common_type
Our impl is SFINAE-friendly, but the standard has a precondition that all types
in the pack are complete, cv void, or array of unknown bound.

[Bug c++/109997] New: __is_assignable(int, IncompleteType) should be rejected

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109997

Bug ID: 109997
   Summary: __is_assignable(int, IncompleteType) should be
rejected
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
Blocks: 71579
  Target Milestone: ---

struct S;
bool b = __is_assignable(int, S);

This should be rejected:

The precondition for std::is_assignable is:

"T and U shall be complete types, cv void, or arrays of unknown bound."
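
For readers unfamiliar with the precondition quoted above, the following sketch shows one common way user code can probe type completeness with SFINAE on `sizeof`. The names `is_complete`, `Opaque` and `Defined` are hypothetical; this is not how the front end or libstdc++ enforces the rule, only an illustration of what "complete type" means at the point of the query.

```cpp
#include <type_traits>

struct Opaque;        // declared but never defined: an incomplete type
struct Defined {};    // a complete type

// Primary template: assume the type is incomplete.
template <class T, class = void>
struct is_complete : std::false_type {};

// Chosen only when sizeof(T) is well-formed, i.e. T is complete here.
template <class T>
struct is_complete<T, decltype(void(sizeof(T)))> : std::true_type {};

static_assert(is_complete<Defined>::value, "Defined is complete");
static_assert(!is_complete<Opaque>::value, "Opaque is incomplete");
```

A caveat worth keeping in mind: the answer is fixed at the first instantiation, so a type that is completed later in the same translation unit still reports as incomplete, which is one reason such checks are better done by the compiler.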


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71579
[Bug 71579] type_traits miss checks for type completeness in some traits

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #8 from Alexander Monakov  ---
Also reproducible with -march=haswell, and works with 

  -mmove-max=128 -mstore-max=128 -mtune-ctrl=^sse_unaligned_store_optimal

added. I would guess the real culprit is commit  r12-2666-g29f0e955c97 ("x86:
Update piecewise move and store") like in PR 109780.

[Bug c/109996] New: csmith: -O2 -fno-strict-aliasing causing run time trouble

2023-05-26 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109996

Bug ID: 109996
   Summary: csmith: -O2 -fno-strict-aliasing causing run time
trouble
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Created attachment 55173
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55173&action=edit
C source code

The attached C code does this:

[dcb38@fedora foundBugs]$ ../results/bin/gcc -w -O1 bug924.c
runData/keep/in.45.c: In function ‘func_5’:
runData/keep/in.45.c:434:18: note: the ABI for passing parameters with 32-byte
alignment has changed in GCC 4.6
[dcb38@fedora foundBugs]$ ./a.out
checksum = 7AACDAF2

[dcb38@fedora foundBugs]$ ../results/bin/gcc -w -O2 bug924.c
runData/keep/in.45.c: In function ‘func_5’:
runData/keep/in.45.c:434:18: note: the ABI for passing parameters with 32-byte
alignment has changed in GCC 4.6
[dcb38@fedora foundBugs]$ ./a.out
Segmentation fault (core dumped)

Normally, -fno-strict-aliasing helps, but not here:

[dcb38@fedora foundBugs]$ ../results/bin/gcc -w -O2 -fno-strict-aliasing bug924.c
runData/keep/in.45.c: In function ‘func_5’:
runData/keep/in.45.c:434:18: note: the ABI for passing parameters with 32-byte
alignment has changed in GCC 4.6
[dcb38@fedora foundBugs]$ ./a.out
Segmentation fault (core dumped)

The bug seems to have existed for some time:

$ ../results.20220515/bin/gcc -w -g -O2 -fno-strict-aliasing bug924.c
runData/keep/in.45.c: In function ‘func_5’:
runData/keep/in.45.c:434:18: note: the ABI for passing parameters with 32-byte
alignment has changed in GCC 4.6
[dcb38@fedora foundBugs]$ ./a.out
Segmentation fault (core dumped)
[dcb38@fedora foundBugs]$

[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

2023-05-26 Thread olegendo at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #42 from Oleg Endo  ---
(In reply to Alexander Klepikov from comment #41)
> 
> Thank you! I have an idea. If it's impossible to defer initial optimization,
> maybe it's possible to emit some intermediate insn and catch it and optimize
> later?

This is basically what is supposed to be happening there already.

However, it's a bit of a dilemma.

1) If we don't have a dynamic shift insn and we smash the constant shift into
individual stitching shifts early, it might open some new optimization
opportunities, e.g. by sharing intermediate shift results.  Not sure if that
actually happens in practice though.

2) Whether to use the dynamic shift insn or emit a function call or use
stitching shifts sequence, it all has an impact on register allocation and
other instruction use.  This can be problematic during the course of RTL
optimization passes.

3) Even if we have a dynamic shift, sometimes it's more beneficial to emit a
shorter stitching shift sequence.  Which one is better depends on the
surrounding code.  I don't think there is anything good there to make the
proper choice.
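
To illustrate what a "stitching shift sequence" means in the points above, here is a simplified sketch that greedily decomposes a constant left shift into the fixed shift amounts SH provides as single instructions (16, 8, 2 and 1). The function names are hypothetical, and the greedy choice is an assumption made for illustration; the actual SH backend uses its own tables and may combine shifts with other instructions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: split a constant shift amount (0..31) into a
// sequence of fixed shifts by 16, 8, 2 or 1, largest first.
inline std::vector<int> stitch_shift(int amount) {
  std::vector<int> steps;
  for (int s : {16, 8, 2, 1}) {
    while (amount >= s) {
      steps.push_back(s);
      amount -= s;
    }
  }
  return steps;
}

// Apply the sequence one fixed shift at a time, as separate insns would.
inline std::uint32_t apply_steps(std::uint32_t x, const std::vector<int>& steps) {
  for (int s : steps) x <<= s;
  return x;
}
```

A shift by 5 becomes three instructions (2, 2, 1) while a shift by 31 needs six, which is why the best choice between a stitched sequence, a dynamic shift and a library call depends on the shift amount and the surrounding code.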

Some other shift related PRs: PR 54089, PR 65317, PR 67691, PR 67869, PR 52628,
PR 58017


> > BTW, have you tried it on a more recent GCC?  There have also been some
> > optimizations in the middle-end (a bit more backend independent) for this
> > kind of thing.
> 
> I tried 13.1 about week or two ago with the same result.
> 

Good to know.  Thanks for checking it!

[Bug preprocessor/109988] -iwithprefix doesn't add folder to end of search list

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109988

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-26
   Keywords||documentation
  Component|c++ |preprocessor
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
It has been the same as -isystem since at least r0-21114-g0b22d65c9a10ce (March
1999).
The documentation was changed (added to) at r0-35796-gf3c9b8530c78ce (June
2001) to specify the same as -idirafter even though the implementation was
something different.

I don't really know what the correct thing to do is, since the documentation
has not matched the implementation for almost 22 years ...
Maybe just update the documentation.

Confirmed either way.

[Bug tree-optimization/109985] __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||hubicka at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Status|UNCONFIRMED |NEW

--- Comment #3 from Jakub Jelinek  ---
Since r12-5236-g5aa91072e24c1e16 the -O3 assembly contains just 2 prefetches
rather than 4.

[Bug target/109984] FAIL: insn-modes.h: No such file or directory (x86_64-apple-darwin22.4.0)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109984

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
This is not the right place to ask about a bug in your front-end.

If you use coretypes.h you need to specify $(CORETYPES_H) as a dependency on
those object files.

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

David Binderman  changed:

   What|Removed |Added

 CC||jh at suse dot cz

--- Comment #7 from David Binderman  ---
As expected:

$ git bisect bad eef81eefcdc2a581
eef81eefcdc2a58111e50eb2162ea1f5becc8004 is the first bad commit
commit eef81eefcdc2a58111e50eb2162ea1f5becc8004
Author: Jan Hubicka 
Date:   Thu Dec 22 10:55:46 2022 +0100

Zen4 tuning part 2

[Bug tree-optimization/109985] __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|1   |0
 Status|WAITING |UNCONFIRMED

[Bug tree-optimization/109985] __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread christian.mazakas at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

Christian Mazakas  changed:

   What|Removed |Added

 CC||christian.mazakas at gmail dot com

--- Comment #2 from Christian Mazakas  ---
Created attachment 55172
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55172&action=edit
Preprocessed source from the relevant godbolt.org link

This is the preprocessed output on my machine, generated using the code from
the relevant benchmark and the develop branch of Unordered.

Let me know if it doesn't provide enough information or if more is required.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #52 from Jakub Jelinek  ---
(In reply to H.J. Lu from comment #14)
> (In reply to jos...@codesourcery.com from comment #13)
> > https://gitlab.com/x86-psABIs/i386-ABI/-/issues/5 to request such an ABI 
> > for 32-bit x86.  I don't know if there are other psABIs with public issue 
> > trackers where such issues can be filed (but we'll need some sensible 
> > default anyway for architectures where we can't get an ABI properly 
> > specified in an upstream-maintained ABI document).
> 
> ia32 psABI will follow x86-64 psABI.

Is it a good idea to use 64-bit limbs and 64-bit alignment for the ia32 ABI?
I mean, it is fine that _BitInt(N) for N in 33..64 has the
size/alignment/passing of long long, but I wonder whether for N > 64 the ABI
shouldn't use 32-bit limbs, 32-bit alignment and passing as a struct containing
the 32-bit limbs.
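
The trade-off in the question above can be shown with a little arithmetic. The sketch below computes how many limbs and bytes an N-bit value needs for a given limb width; `BitIntLayout` and `layout` are hypothetical names, and the calculation deliberately ignores any extra alignment padding a real psABI might require.

```cpp
#include <cstddef>

struct BitIntLayout {
  std::size_t limbs;  // number of limbs needed
  std::size_t bytes;  // storage for those limbs, without alignment padding
};

// Round n_bits up to whole limbs of limb_bits each.
constexpr BitIntLayout layout(std::size_t n_bits, std::size_t limb_bits) {
  return BitIntLayout{(n_bits + limb_bits - 1) / limb_bits,
                      (n_bits + limb_bits - 1) / limb_bits * (limb_bits / 8)};
}

// _BitInt(65): two 64-bit limbs (16 bytes) versus three 32-bit limbs (12 bytes).
static_assert(layout(65, 64).limbs == 2 && layout(65, 64).bytes == 16, "");
static_assert(layout(65, 32).limbs == 3 && layout(65, 32).bytes == 12, "");
```

For widths that are a multiple of 64, such as 256, both choices occupy the same 32 bytes, so the difference shows up only in alignment and in the widths just above a 64-bit boundary.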

[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template

2023-05-26 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #10 from Marek Polacek  ---
So I have

--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27969,6 +27969,13 @@ value_dependent_expression_p (tree expression)
   else if (TYPE_REF_P (TREE_TYPE (expression)))
/* FIXME cp_finish_decl doesn't fold reference initializers.  */
return true;
+  /* We have a constexpr variable and we're processing a template.  When
+there's lifetime extension involved (for which finish_compound_literal
+used to create a temporary), we'll not be able to evaluate the
+variable until instantiating, so pretend it's value-dependent.  */
+  else if (DECL_DECLARED_CONSTEXPR_P (expression)
+  && !TREE_CONSTANT (expression))
+   return true;
   return false;

 case DYNAMIC_CAST_EXPR:

but that breaks

struct foo {  };

template <const auto &F> void fnc() { } 

void
test()
{
  static constexpr foo a;
  fnc<a>();
}

with:

$ ./cc1plus -quiet nontype-auto16.C 
nontype-auto16.C:6:31: warning: ‘void fnc() [with const foo& F = a]’ used but never defined
6 | template <const auto &F> void fnc() { }
  |   ^~~
nontype-auto16.C:13:1: internal compiler error: Segmentation fault
   13 | }
  | ^
0x19a5624 crash_signal
/home/mpolacek/src/gcc/gcc/toplev.cc:314
0x7fe161facb1f ???
   
/usr/src/debug/glibc-2.36-9.fc37.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xcbfe74 tree_check(tree_node const*, char const*, int, char const*, tree_code)
/home/mpolacek/src/gcc/gcc/tree.h:3795
0x12c2224 symbol_table::decl_assembler_name_hash(tree_node const*)
/home/mpolacek/src/gcc/gcc/symtab.cc:84

The warning is obviously wrong and the cause for the ICE, I'd say.  test isn't
a function template but uses_template_parms / verify_unstripped_args set p_t_d,
so we still reach the new code.

[Bug middle-end/109995] Bogus warning about __builtin_memset, from -Wstringop-overflow

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109995

--- Comment #1 from Andrew Pinski  ---
do *++p = c; while (--n > 0);

is turned into memset during optimizations.
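
As a minimal sketch of the transformation mentioned above: for any `n >= 1`, the quoted loop stores `c` into `p[1]` through `p[n]`, which is what a `memset` starting at `p + 1` does. The function names are hypothetical and the attached bar.c is not reproduced here. Note that the reported bound 18446744073709551614 equals 2^64 - 2, a value that the optimizer apparently considers possible for the trip count on some path.

```cpp
#include <cstddef>
#include <cstring>

// The loop form from the report. Precondition: n >= 1.
inline void fill_loop(unsigned char* p, unsigned char c, std::size_t n) {
  do *++p = c; while (--n > 0);
}

// What the optimizer can replace it with for n >= 1.
inline void fill_memset(unsigned char* p, unsigned char c, std::size_t n) {
  std::memset(p + 1, c, n);
}
```

Because the replacement happens inside the compiler, the diagnostic names `__builtin_memset` even though the source never calls it.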

[Bug middle-end/109995] New: Bogus warning about __builtin_memset, from -Wstringop-overflow

2023-05-26 Thread bruno at clisp dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109995

Bug ID: 109995
   Summary: Bogus warning about __builtin_memset, from
-Wstringop-overflow
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bruno at clisp dot org
  Target Milestone: ---

Created attachment 55171
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55171&action=edit
test case bar.c

In the attached program, -Wall produces a warning "warning: ‘__builtin_memset’
specified bound 18446744073709551614 exceeds maximum object size
9223372036854775807 [-Wstringop-overflow=]", in a function that does not invoke
'memset' nor '__builtin_memset'.

With gcc 10.4.0:
$ gcc -O2 -Wall -S bar.c
In function ‘memset_small’,
inlined from ‘wrap’ at bar.c:242:1:
bar.c:249:17: warning: ‘__builtin_memset’ specified bound 18446744073709551614
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
  249 | do *++p = c; while (--n > 0);
  |~^~~

With gcc 11.3.0, 12.3.0, 13.1.0:
$ gcc -O2 -Wall -S bar.c
In function ‘memset_small’,
inlined from ‘memset_small’ at bar.c:242:1,
inlined from ‘wrap’ at bar.c:590:19:
bar.c:249:17: warning: ‘__builtin_memset’ specified bound 18446744073709551614
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
  249 | do *++p = c; while (--n > 0);
  |~^~~

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

--- Comment #6 from David Binderman  ---
This commit looks highly likely:

commit eef81eefcdc2a58111e50eb2162ea1f5becc8004
Author: Jan Hubicka 
Date:   Thu Dec 22 10:55:46 2022 +0100

Zen4 tuning part 2

[Bug preprocessor/109994] Issue a diagnostic when a C++ file defines a macro that hides a keyword

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109994

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Not just testcases, libgcc.h does that too (though, sure, that is C).

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

--- Comment #5 from David Binderman  ---
Current git range is g:193fccaa5c3525e9 .. g:5b30e9bc211fede0,
which is 8 commits.

[Bug preprocessor/109994] Issue a diagnostic when a C++ file defines a macro that hides a keyword

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109994

--- Comment #2 from Jonathan Wakely  ---
(In reply to Andrew Pinski from comment #1)
> There are definitely testcases in GCC's testsuite which do this all the
> time.
> #define int ...

Yeah, it shouldn't be in -Wall, and it's not a required diagnostic for
conformance. But it might be nice. Not a priority though.

[Bug preprocessor/109994] Issue a diagnostic when a C++ file defines a macro that hides a keyword

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109994

--- Comment #1 from Andrew Pinski  ---
There are definitely testcases in GCC's testsuite which do this all the time.
#define int ...

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

--- Comment #6 from Andrew Pinski  ---
(In reply to Bruno Haible from comment #4) 
> That is the only way of keeping track of pointers _into_ the string_space
> area, when it is reallocated. How else would you want to do it?

You could use intptr_t casting to do the subtraction ...
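
A different approach from the `intptr_t` cast suggested above, shown here only as a sketch, is to convert the interior pointers to offsets while the old block is still valid and rebuild them after `realloc` succeeds. The names `AliasEntry` and `grow_pool` are hypothetical (the field names mirror the quoted code), and whether this form avoids the -Wuse-after-free warning has not been verified here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct AliasEntry {
  char* alias;  // points into the pool
  char* value;  // points into the pool
};

// Grow `pool` to new_size bytes and rebase every entry onto the new block.
// Returns false and leaves pool and map untouched if realloc fails.
inline bool grow_pool(char*& pool, std::size_t new_size,
                      std::vector<AliasEntry>& map) {
  // Record offsets before realloc, while the old pointers are still valid.
  std::vector<std::ptrdiff_t> offsets;
  offsets.reserve(map.size() * 2);
  for (const AliasEntry& e : map) {
    offsets.push_back(e.alias - pool);
    offsets.push_back(e.value - pool);
  }

  char* new_pool = static_cast<char*>(std::realloc(pool, new_size));
  if (new_pool == nullptr) return false;

  // Rebuild the pointers from the saved offsets; the old block is never read.
  for (std::size_t i = 0; i < map.size(); ++i) {
    map[i].alias = new_pool + offsets[2 * i];
    map[i].value = new_pool + offsets[2 * i + 1];
  }
  pool = new_pool;
  return true;
}
```

The extra vector costs a temporary allocation per resize; storing offsets in the map permanently instead of pointers would avoid both the rebasing and the question of using stale pointer values.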

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread bruno at clisp dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

--- Comment #5 from Bruno Haible  ---
Created attachment 55170
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55170&action=edit
test case bar2.c

Find attached a modified test case. I changed the code to

  map[i].alias = new_pool + (map[i].alias - string_space);
  map[i].value = new_pool + (map[i].value - string_space);

so that it subtracts pointers into the old string_space, producing an integer,
and adds that integer to new_pool.

It produces the same warning (even twice, apparently because there is no common
subexpression between the two lines any more):

$ gcc -Wall -O2 -S bar2.c
bar2.c: In function ‘read_alias_file’:
bar2.c:123:67: warning: pointer may be used after ‘realloc’ [-Wuse-after-free]
  123 |   map[i].value = new_pool + (map[i].value - string_space);
  | ~~^~~
bar2.c:114:45: note: call to ‘realloc’ here
  114 |   char *new_pool = (char *) realloc (string_space, new_size);
  | ^~~~
bar2.c:122:67: warning: pointer may be used after ‘realloc’ [-Wuse-after-free]
  122 |   map[i].alias = new_pool + (map[i].alias - string_space);
  | ~~^~~
bar2.c:114:45: note: call to ‘realloc’ here
  114 |   char *new_pool = (char *) realloc (string_space, new_size);
  | ^~~~

[Bug tree-optimization/109985] __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Last reconfirmed||2023-05-26

--- Comment #1 from Andrew Pinski  ---
There are only two __builtin_prefetch in .optimized for GCC 12.

This is definitely going to be hard to debug ...

Can you attach the preprocessed source?

[Bug preprocessor/109994] Issue a diagnostic when a C++ file defines a macro that hides a keyword

2023-05-26 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109994

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||mpolacek at gcc dot gnu.org
   Last reconfirmed||2023-05-26

[Bug preprocessor/109994] New: Issue a diagnostic when a C++ file defines a macro that hides a keyword

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109994

Bug ID: 109994
   Summary: Issue a diagnostic when a C++ file defines a macro
that hides a keyword
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: enhancement
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

The C++ standard says this is undefined:

#define new foo

It might be nice if the preprocessor had a warning about it.


[macro.names]
A translation unit shall not #define or #undef names lexically identical to
keywords, to the identifiers listed in Table 4, or to the attribute-tokens
described in 9.12, except that the names likely and unlikely may be defined as
function-like macros (15.6).

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread bruno at clisp dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

--- Comment #4 from Bruno Haible  ---
> > 
> >   char *new_pool = (char *) realloc (string_space, new_size);
> >   if (new_pool == ((void *)0))
> > goto out;
> >   if (__builtin_expect (string_space != new_pool, 0))
> > {
> >   size_t i;
> >   for (i = 0; i < nmap; i++)
> > {
> >   map[i].alias += new_pool - string_space;
> >   map[i].value += new_pool - string_space;
> > }
> > }
> >   string_space = new_pool;

> Also I think `new_pool - string_space` is undefined really.  That is
> subtracting two unrelated arrays is undefined. You can only compare equality
> on them.

That is the only way of keeping track of pointers _into_ the string_space area,
when it is reallocated. How else would you want to do it?

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread dcb314 at hotmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

--- Comment #4 from David Binderman  ---
Original git range was 123 commits.

Current bisect range is g:89ba8366fe12fd2d .. g:23be9d78f4bcd773,
which is 31 commits.

Trying 5b30e9bc211fede0.

[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};

2023-05-26 Thread muecker at gwdg dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956

--- Comment #14 from Martin Uecker  ---


Maybe. 

On the other hand, I wonder whether a struct with FAM should not rather always
have the same size, and alignment, and representation as the corresponding
struct with a conventional array. This would conceptually be cleaner, easier to
understand, and less error prone.

[Bug libstdc++/109993] New: std::regex("\\a", std::regex::basic) does not diagnose invalid BRE

2023-05-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109993

Bug ID: 109993
   Summary: std::regex("\\a", std::regex::basic) does not diagnose
invalid BRE
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
Blocks: 102445
  Target Milestone: ---

#include <regex>
int main()
{
 std::regex("\\a", std::regex::basic);
}

This should throw a std::regex_error exception.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03

"The interpretation of an ordinary character preceded by an unescaped
 <backslash> ( '\\' ) is undefined, except for: [...]"


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #51 from Jakub Jelinek  ---
Note, I've only tested it so far on
_BitInt(256) a = 0x1234ab461289cdab8d111007b461289cdab8d1wb;
_BitInt(256) b = 0x2385eabcd072311074bcaa385eabcd07111007b46128wb;
_BitInt(384) c = (_BitInt(384)) 0x1234ab461289cdab8d111007b461289cdab8d1wb *
0x2385eabcd072311074bcaa385eabcd07111007b46128wb;
_BitInt(384) d;
extern void __mulbitint3 (unsigned long *, int, const unsigned long *, int,
const unsigned long *, int);

void
foo ()
{
  __mulbitint3 (&d, 384, &a, 256, &b, 196);
}
multiplication, nothing else, guess it will be easier to test it when we can
emit from the compiler.  And obviously no testing of the big endian limb
ordering handling until we add some arch that will support it (if we do that at
all).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55151|0   |1
is obsolete||

--- Comment #50 from Jakub Jelinek  ---
Created attachment 55169
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55169&action=edit
gcc14-bitint-wip.patch

Update, this time with addition of libgcc _BitInt multiplication libcall (but
not really wiring it on the compiler side yet, that would be part of the new
_BitInt lowering pass).

The function currently is
void __mulbitint3 (__bitint_limb *ret, int retprec, const __bitint_limb *u, int
uprec, const __bitint_limb *v, int vprec);
which allows mixing different precisions (at compile time, or at runtime using
the bitint_reduce_prec function); while in GIMPLE before _BitInt lowering pass
MULT_EXPR will obviously have same precision for result and both operands, the
lowering pass could spot zero or sign extensions from narrower _BitInts for the
operands, or VRP could figure out smaller ranges of values for the operands.
Negative uprec or vprec would mean the operand is sign extended from precision
-[uv]prec, positive it is zero extended from [uv]prec.
u/v could be the same or overlapping, but as the function writes result before
consuming all inputs, it doesn't allow aliasing between operands and return value.
Also, at least in the x86-64 psABI, _BitInt(N) for N < 64 is special and it
isn't expected this function would be really used for multiplication of such
_BitInts, but of course if, say, multiplying _BitInt(512) by _BitInt(24),
it is expected the lowering pass would push those 24 bits into a 64-bit,
64-bit-aligned limb and pass 24 for that operand.
For inputs it assumes bits above precision but still within a limb are
uninitialized (and so zero or sign extends when reading it), for the output it
always writes full limb (with hopefully proper zero/sign extensions).
The implemented algorithm is basic schoolbook multiplication; if really
needed, we could do Karatsuba for larger inputs.
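
For readers unfamiliar with the term, schoolbook multiplication on limb arrays can be sketched as below. This is not the code from the attached patch: `mul_limbs` and `mul_limbs_demo` are hypothetical names, the limbs are 32 bits wide so that a plain `uint64_t` can hold each partial product, and precision handling and sign/zero extension are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the libgcc routine: multiply u (un limbs)
   by v (vn limbs), both little-endian arrays of 32-bit limbs, and
   store the full un + vn limb product in ret.  ret must not overlap
   u or v, because it is written before all inputs are consumed. */
static void
mul_limbs (uint32_t *ret, const uint32_t *u, size_t un,
           const uint32_t *v, size_t vn)
{
  for (size_t i = 0; i < un + vn; i++)
    ret[i] = 0;

  for (size_t i = 0; i < un; i++)
    {
      uint64_t carry = 0;
      for (size_t j = 0; j < vn; j++)
        {
          /* 32x32 -> 64-bit partial product, plus the limb already
             accumulated and the running carry; the sum always fits
             in 64 bits. */
          uint64_t t = (uint64_t) u[i] * v[j] + ret[i + j] + carry;
          ret[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      ret[i + vn] = (uint32_t) carry;
    }
}

/* Checks (2^64 - 1)^2 = 2^128 - 2^65 + 1.  Returns 1 on success. */
static int
mul_limbs_demo (void)
{
  const uint32_t u[2] = { 0xffffffffu, 0xffffffffu };
  const uint32_t v[2] = { 0xffffffffu, 0xffffffffu };
  uint32_t r[4];
  mul_limbs (r, u, 2, v, 2);
  return r[0] == 1u && r[1] == 0u
         && r[2] == 0xfffffffeu && r[3] == 0xffffffffu;
}
```

The inner loop is O(un * vn), which is why Karatsuba only becomes interesting for large operands.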

What do you think about this API?
Shall I continue and create similar API for divmod?

Also, wonder what to do about _BitInt(N) in __builtin_mul_overflow{,_p}.  One
option would be to say that negative retprec is a request to return a nonzero
result for the overflow case, but wonder how much larger the routine would be
in that case.  Or if we
should have two, one for multiplication and one for multiplication with
overflow checking.  Yet another possibility is to do a dumb thing on the
compiler side, call the multiplication with a temporary result as large that it
would never overflow and check for the overflow on the caller side.

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread gjl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #23 from Georg-Johann Lay  ---
Thank you so much for looking into this.

For the test case from comment #21 though, the problem is somewhere in tree
optimizations.

> unsigned char lfsr32_mpp_ge0 (unsigned long number)
> {
>   unsigned char b = 0;
>   if (number >= 0) b--;
>   if (number & (1UL << 29)) b++;
>   if (number & (1UL << 13)) b++;
> 
>   return b;
> }

The -fdump-tree-optimized dump reads:

;; Function lfsr32_mpp_ge0 (lfsr32_mpp_ge0, funcdef_no=0, decl_uid=1880,
cgraph_uid=1, symbol_order=0)

unsigned char lfsr32_mpp_ge0 (long unsigned int number)
{
  unsigned char b;
  long unsigned int _1;
  long unsigned int _2;
  _Bool _3;
  unsigned char _8;
  _Bool _9;
  unsigned char _10;
  unsigned char _11;

  <bb 2> [local count: 1073741824]:
  _1 = number_5(D) & 536870912;
  _2 = number_5(D) & 8192;
  if (_2 != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870912]:
  _9 = _1 == 0;
  _10 = (unsigned char) _9;
  _11 = -_10;
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 536870913]:
  _3 = _1 != 0;
  _8 = (unsigned char) _3;

  <bb 5> [local count: 1073741824]:
  # b_4 = PHI <_11(3), _8(4)>
  return b_4;
}

The ANDs are expanded by expand_binop() and later passes have to deal with the
32-bit arithmetic.  combine finds one combination of andsi3 into
"*sbrx_and_branch_split" with mode=si, but apart from that the mess still
lives in asm.

[Bug tree-optimization/109992] Addition/subtraction to the last bitfield of a struct can be optimized

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109992

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Andrew Pinski  ---
After my lowering pass (little-endian) we have:
  _1 = BIT_FIELD_REF <_9, 29, 3>;
  _2 = (unsigned int) _1;
  _3 = _2 + add_7(D);
  _4 = (<unnamed-unsigned:29>) _3;
  _11 = BIT_INSERT_EXPR <_9, _4, 3 (29 bits)>;

Which I suspect we could pattern match to:
_t = add_7 << 3;
_11 = _9 + _t;

iff 3+29 == 32(int)

Big-endian (with fields a and b swapped order in the source):
  _9 = MEM[(struct foo *)p_6(D)];
  _1 = BIT_FIELD_REF <_9, 29, 0>;
  _2 = (unsigned int) _1;
  _3 = _2 + add_7(D);
  _4 = (<unnamed-unsigned:29>) _3;
  _11 = BIT_INSERT_EXPR <_9, _4, 0 (29 bits)>;


Similar pattern matching, just using 0 for the offset rather than 3 ...
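
The equivalence behind the little-endian pattern can be checked in plain C. A sketch, assuming the layout from the report (`a` in bits 0..2, `b` in bits 3..31 of one 32-bit word); the function names are hypothetical:

```c
#include <stdint.h>

/* Extract b, add, wrap to 29 bits, insert back: the shape of the
   BIT_FIELD_REF / BIT_INSERT_EXPR sequence. */
static uint32_t
add_extract_insert (uint32_t word, uint32_t add)
{
  uint32_t b = word >> 3;
  b = (b + add) & 0x1fffffffu;        /* wrap to 29 bits */
  return (word & 7u) | (b << 3);
}

/* The proposed replacement: one shift and one full-width add. */
static uint32_t
add_shifted (uint32_t word, uint32_t add)
{
  return word + (add << 3);
}
```

The two agree because `add << 3` has its low three bits clear, so `a` is untouched, and any carry out of `b` falls off the top of the 32-bit word. That is exactly the `3+29 == 32` condition; for a field that does not end at the top bit, the carry would spill into the next field.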

[Bug tree-optimization/109992] Addition/subtraction to the last bitfield of a struct can be optimized

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109992

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
  Component|rtl-optimization|tree-optimization

[Bug rtl-optimization/109992] Addition/subtraction to the last bitfield of a struct can be optimized

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109992

--- Comment #1 from Andrew Pinski  ---
As an aside: it is funny how x86 does not have a bits insert instruction yet
(while almost all RISC targets have that now).

[Bug rtl-optimization/109992] New: Addition/subtraction to the last bitfield of a struct can be optimized

2023-05-26 Thread lh_mouse at 126 dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109992

Bug ID: 109992
   Summary: Addition/subtraction to the last bitfield of a struct
can be optimized
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lh_mouse at 126 dot com
  Target Milestone: ---

For an unsigned bit field:
```
struct foo
  {
unsigned a :  3;
unsigned b : 29;
  };

void
bad_add(struct foo* p, unsigned add)
  {
p->b += add;
  }
```

GCC:
```
bad_add:
mov eax, DWORD PTR [rdi]
mov edx, eax
and eax, 7
shr edx, 3
add edx, esi
sal edx, 3
or  eax, edx
mov DWORD PTR [rdi], eax
ret
```

Clang:
```
bad_add:# @bad_add
shl esi, 3
add dword ptr [rdi], esi
ret
```

It looks like GCC extracts the bitfield first, performs the addition, then
inserts it back.

The result is almost the same for a signed bitfield, but not extracting the
bitfield first is subject to overflows, so it may be a different story.

[Bug target/109982] csmith: x86_64: znver1 issues

2023-05-26 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109982

--- Comment #3 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #1)
> Also fails with "-mtune=znver1 -mavx":
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x004048ef in func_21 (p_22=0x41b330 , p_23=0, p_24=8) at
> runData/keep/in.11.c:597
> 597 in runData/keep/in.11.c
> (gdb) disass $pc-10, $pc+10
> Dump of assembler code from 0x4048e5 to 0x4048f9:
>    0x004048e5 <func_21+8160>:   mov    (%rax),%rdx
>    0x004048e8 <func_21+8163>:   mov    -0x1378(%rbp),%rax
> => 0x004048ef <func_21+8170>:   vmovdqa (%rdx),%ymm0
>    0x004048f3 <func_21+8174>:   vmovdqa %ymm0,(%rax)
>    0x004048f7 <func_21+8178>:   vmovdqa 0x20(%rdx),%ymm0
> End of assembler dump.
> (gdb) p/x $rdx
> $3 = 0x41a824
> 
> Unaligned access.

After some more analysis, the above *IS* unaligned access. At the end of
func_21, we have:

=> 0x004048ef <+8170>:  vmovdqa (%rdx),%ymm0
   0x004048f3 <+8174>:  vmovdqa %ymm0,(%rax)
   0x004048f7 <+8178>:  vmovdqa 0x20(%rdx),%ymm0
   0x004048fc <+8183>:  vmovdqa %ymm0,0x20(%rax)
   0x00404901 <+8188>:  vmovdqa 0x40(%rdx),%ymm0
   0x00404906 <+8193>:  vmovdqa %ymm0,0x40(%rax)
   0x0040490b <+8198>:  vmovdqa 0x60(%rdx),%ymm0
   0x00404910 <+8203>:  vmovdqa %ymm0,0x60(%rax)

which looks like a memory copy to me. Unfortunately, the address is unaligned:

(gdb) p/x $rdx
$2 = 0x41a824

Changing the above vmovdqa insns to vmovdqu results in a successful run.
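
The alignment claim is easy to verify by hand: `vmovdqa` with a `%ymm` register needs a 32-byte aligned address, and 0x41a824 is not one. A sketch with a hypothetical helper name:

```c
#include <stdint.h>

/* Returns nonzero if addr is a multiple of alignment.
   alignment must be a power of two. */
static int
is_aligned (uintptr_t addr, uintptr_t alignment)
{
  return (addr & (alignment - 1)) == 0;
}
```

`is_aligned (0x41a824, 32)` is 0 because the low five bits of the address are 0x04, so the address is only 4-byte aligned.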

[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #22 from Andrew Pinski  ---
(In reply to Georg-Johann Lay from comment #20)
> What then happens is:
> 
> expr.cc::do_store_flag()
> expmed.cc::emit_store_flag_force()
> expmed.cc::emit_store_flag()
> expmed.cc::emit_store_flag_1()
> 
> the latter then does:
> 
>   if (STORE_FLAG_VALUE == 1 || normalizep)
> /* If we are supposed to produce a 0/1 value, we want to do
>a logical shift from the sign bit to the low-order bit; for
>a -1/0 value, we do an arithmetic shift.  */
> op0 = expand_shift (RSHIFT_EXPR, int_mode, op0,
> GET_MODE_BITSIZE (int_mode) - 1,
> subtarget, normalizep != -1);
> 
> "normalizep" is true because ops->type has a precision of 1, and
> STORE_FLAG_VALUE is the default of 1.
> 
> Nowhere is there any cost computation or consideration whether extzv could
> do the trick.

Thanks for tracking down where the shift is expanded to. Let me try to use
extract_bit_field there instead (which should produce the better code).
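
The two strategies compute the same value; what differs is the cost on a target where shifting a wide integer by many bits is expensive. A sketch in plain C with hypothetical function names, modelling a 64-bit operand:

```c
#include <stdint.h>

/* Models the current expansion: a logical right shift moves the sign
   bit down to bit 0. */
static unsigned char
lt0_by_shift (int64_t t)
{
  return (unsigned char) ((uint64_t) t >> 63);
}

/* Models single-bit extraction: only the most significant byte is
   inspected.  The ">> 56" selects that byte; on a byte-oriented
   target this is a register pick, not a real shift. */
static unsigned char
lt0_by_bit_test (int64_t t)
{
  unsigned char top = (unsigned char) ((uint64_t) t >> 56);
  return (top & 0x80) != 0;
}
```

Both return 1 exactly when `t < 0`.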

[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
   Last reconfirmed||2023-05-26
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
GCC does handle:
int f0(int a, int b)
{
return (a | b) ^ a;
}

And:
int f1(int a, int b)
{
return (a | ~b) ^ a;
}
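
The bitwise identities involved can be checked exhaustively on small values. In this sketch the simplified right-hand sides for `f0` and `f1` are derived by hand, not taken from GCC output, and `check_folds` is a hypothetical name:

```c
/* Returns 1 if all three identities hold for every pair of 8-bit
   values, 0 otherwise.  The operations are purely bitwise, so
   agreement on the low bits carries over to every bit position. */
static int
check_folds (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
        if (((a | b) ^ a) != (~a & b))      /* f0 */
          return 0;
        if (((a | ~b) ^ a) != ~(a | b))     /* f1 */
          return 0;
        if (((~a | b) ^ a) != ~(a & b))     /* fold requested in this bug */
          return 0;
      }
  return 1;
}
```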

[Bug c++/55004] [meta-bug] constexpr issues

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
Bug 55004 depends on bug 109991, which changed state.

Bug 109991 Summary: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/98675] Accessing member of temporary outside its lifetime allowed in constexpr function

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98675

Andrew Pinski  changed:

   What|Removed |Added

 CC||igkper at gmail dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 109991 has been marked as a duplicate of this bug. ***

[Bug c++/109991] stack-use-after-scope

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Dup of bug 98675.

*** This bug has been marked as a duplicate of bug 98675 ***

[Bug c++/109991] stack-use-after-scope

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic
 Status|UNCONFIRMED |NEW
  Component|sanitizer   |c++
   Last reconfirmed||2023-05-26
 Ever confirmed|0   |1
 Blocks||55004

--- Comment #1 from Andrew Pinski  ---
That is because with constexpr, the code should have been rejected ...

Take this C++20 example; GCC accepts it (incorrectly) but clang rejects it:
```
using T = int;
struct Wrap
{
T const& v;
constexpr Wrap(T const& in) : v{in} {}
};

struct BadWrapUse final
{
T i{};
consteval BadWrapUse()
{
Wrap w{T{}};  // temporary T's lifetime ends after this expression
i = w.v;  // This should lead to stack-use-after-scope.
}
};

int main()
{
BadWrapUse c;
}
```

Note there might be a dup of it somewhere.
Basically in your original example, GCC is doing constexpr evaluation, but that
is not valid for constant expression evaluation ...


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
[Bug 55004] [meta-bug] constexpr issues

[Bug sanitizer/109991] New: stack-use-after-scope

2023-05-26 Thread igkper at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109991

Bug ID: 109991
   Summary: stack-use-after-scope
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: igkper at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Hi,

I believe the below code should result in sanitizer complaining about
stack-use-after-scope, but it does not. I've noted that clang catches this but
not gcc. I've annotated where I've noted it seems to depend on whether or not
constexpr is used. See  https://godbolt.org/z/Y3YKcfGda.

using T = int;

struct Wrap
{
T const& v;

// Shouldn't extend lifetime of temporary
constexpr Wrap(T const& in) : v{in} {}
};

struct BadWrapUse final
{
T i{};

constexpr BadWrapUse()  // issue not caught with constexpr
// BadWrapUse()  // issue caught without constexpr
{
Wrap w{T{}};  // temporary T's lifetime ends after this expression
i = w.v;  // This should lead to stack-use-after-scope.
}
};

int main()
{
BadWrapUse c;
}

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> ```
> 
>   char *new_pool = (char *) realloc (string_space, new_size);
>   if (new_pool == ((void *)0))
>     goto out;
>   if (__builtin_expect (string_space != new_pool, 0))
>     {
>       size_t i;
>       for (i = 0; i < nmap; i++)
>         {
>           map[i].alias += new_pool - string_space;
>           map[i].value += new_pool - string_space;
>         }
>     }
>   string_space = new_pool;
> ```
> 
> Hmmm

Also I think `new_pool - string_space` is undefined really.  That is
subtracting two unrelated arrays is undefined. You can only compare equality on
them.

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=104215

--- Comment #2 from Andrew Pinski  ---
See also the discussion starting at bug 104215 comment #2.

[Bug middle-end/109990] [12/13/14 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

--- Comment #1 from Andrew Pinski  ---
```

  char *new_pool = (char *) realloc (string_space, new_size);
  if (new_pool == ((void *)0))
    goto out;
  if (__builtin_expect (string_space != new_pool, 0))
    {
      size_t i;
      for (i = 0; i < nmap; i++)
        {
          map[i].alias += new_pool - string_space;
          map[i].value += new_pool - string_space;
        }
    }
  string_space = new_pool;
```

Hmmm

[Bug middle-end/109990] New: [12 Regression] Bogus -Wuse-after-free warning after realloc

2023-05-26 Thread bruno at clisp dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109990

Bug ID: 109990
   Summary: [12 Regression] Bogus -Wuse-after-free warning after
realloc
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bruno at clisp dot org
  Target Milestone: ---

Created attachment 55168
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55168&action=edit
test case bar.c

Compiling the attached file produces a warning that is not justified:

$ gcc -Wall -O2 -S bar.c
bar.c: In function ‘read_alias_file’:
bar.c:122:52: warning: pointer may be used after ‘realloc’ [-Wuse-after-free]
  122 |   map[i].alias += new_pool - string_space;
  |   ~^~
bar.c:114:45: note: call to ‘realloc’ here
  114 |   char *new_pool = (char *) realloc (string_space,
new_size);
  |
^~~~

The warning is not justified because only the pointer 'string_space' is used
here; it is not being dereferenced.

Seen with gcc 12.3.0 and 13.1.0.

[Bug rtl-optimization/60749] combine is overly cautious when operating on volatile memory references

2023-05-26 Thread lis8215 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60749

Siarhei Volkau  changed:

   What|Removed |Added

 CC||lis8215 at gmail dot com

--- Comment #2 from Siarhei Volkau  ---
Created attachment 55167
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55167&action=edit
allow combine ld/st of volatile mem with any_extend op

Is anyone working on this? As an embedded engineer, I'm sad to see this
long-standing issue still open.

I can propose a quick patch which enables combining volatile mem ld/st with
any_extend for most cases. Platform-specific test results appear to remain
the same with it (arm/aarch64/mips were tested).

Posting it in the hope that it helps anyone who needs it.

[Bug tree-optimization/109989] New: RISC-V: Missing sign extension with int to float conversion with 64bit soft floats

2023-05-26 Thread joseph.faulls at imgtec dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109989

Bug ID: 109989
   Summary: RISC-V: Missing sign extension with int to float
conversion with 64bit soft floats
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: joseph.faulls at imgtec dot com
  Target Milestone: ---

Created attachment 55166
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55166&action=edit
Preprocessed test

Hi,

This bug was discovered when running test
gcc.target/riscv/promote-type-for-libcall.c on O1 for march rv64imac.

There are a few moving parts to this, and I haven't been able to track down
where the bug lies due to not being at all familiar with gcc. But I've managed
to reduce the test to the following criteria:

Compilation flags:
-march=rv64imac -mabi=lp64 -O1 -ftree-slp-vectorize -funroll-loops

march can be any 64bit without f/d extension.
Removal of any of the other flags (with the given test case) will not cause the
bug.

I have confirmed the only difference between 12.1 and 13.1 is the missing sign
extension before the call to __floatsisf

Inlining the test file here for added comments:

#include <stdio.h>
#include <stdlib.h>
volatile float f[2];

int x[2] ;

int main() {
  int i;
  x[0] = -1;
  x[1] = 2; // Removal of this line avoids the bug

  for (i=0;i<1;++i){
f[i] = x[i]; // Any attempt to printf x[i] here avoids the bug
  }

  if (f[0] != -1.0f) {
abort();
  }
  return 0;
}

Thanks!
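
For context, and assuming the RV64 psABI rule that a 32-bit `int` argument is passed sign-extended to the full 64-bit register, the difference a missing sign extension makes can be modelled as below. The helper names are hypothetical and the functions only model register contents; they are not the code GCC emits.

```c
#include <stdint.h>

/* Register contents the callee may expect for an int argument:
   bits 63..32 are copies of bit 31. */
static uint64_t
reg_sign_extended (int32_t x)
{
  return (uint64_t) (int64_t) x;
}

/* Register contents if the extension is skipped and the upper half
   happens to be zero. */
static uint64_t
reg_zero_extended (int32_t x)
{
  return (uint64_t) (uint32_t) x;
}
```

For `x = -1` the first gives 0xffffffffffffffff and the second 0x00000000ffffffff, so a callee that reads the whole register would see 4294967295 instead of -1.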

[Bug target/100811] Consider not omitting frame pointers by default on targets with many registers

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100811

--- Comment #11 from Jakub Jelinek  ---
DWARF unwinding works properly, just in Linux kernel they decided they don't
want it in the kernel (I think they had some non-perfect implementation in the
past and it got removed).

[Bug target/100811] Consider not omitting frame pointers by default on targets with many registers

2023-05-26 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100811

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #10 from Xi Ruoyao  ---
Frankly I've seen too much "slowing down everyone's system just for some
debugging/profiling/whatever tools" thing.  So I'd say a clear "no".

You may argue 1% performance degradation is acceptable, but the change would be
a bad start to justify other changes and at last we'll accumulate a 9.5%
degradation with 10 such changes one day.

If DWARF unwinding does not work properly, try to fix it or revise the DWARF
specification, instead of making every system slower.

[Bug c++/109988] -iwithprefix doesn't add folder to end of search list

2023-05-26 Thread ivan.lazaric.gcc at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109988

--- Comment #1 from Ivan Lazaric  ---
In `gcc/c-family/c-opts.cc`:
```
case OPT_iwithprefix:
  add_prefixed_path (arg, INC_SYSTEM);
  break;
```

Should `INC_SYSTEM` actually be `INC_AFTER` ?
