[Bug gcov-profile/113765] ICE: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized

2024-02-05 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765

--- Comment #3 from Andi Kleen  ---
-O1 fixes it, so an easy patch would be 

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 63d0c3dc36df..180ed7a8260f 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1758,7 +1758,7 @@ public:
   bool
   gate (function *) final override
   {
-    return flag_auto_profile;
+    return flag_auto_profile && optimize > 0;
   }
   unsigned int
   execute (function *) final override

[Bug gcov-profile/113765] autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized

2024-02-05 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765

--- Comment #1 from Andi Kleen  ---
Seems to be a regression, I tested the same setup on gcc 13 and the test passes
there:

55:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c compilation,  -fprofile-generate -D_PROFILE_GENERATE
59:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c execution,    -fprofile-generate -D_PROFILE_GENERATE
62:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c compilation,  -fprofile-use -D_PROFILE_USE
66:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c execution,    -fprofile-use -D_PROFILE_USE
76:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c compilation,  -g -DFOR_AUTOFDO_TESTING
108:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c execution,    -g -DFOR_AUTOFDO_TESTING
111:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c compilation,  -fauto-profile -DFOR_AUTOFDO_TESTING -fearly-inlining
115:PASS: gcc.dg/tree-prof/val-profiler-threads-1.c execution,    -fauto-profile -DFOR_AUTOFDO_TESTING -fearly-inlining

[Bug gcov-profile/113765] New: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized

2024-02-05 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765

Bug ID: 113765
   Summary: autofdo: val-profiler-threads-1.c compilation,  error:
probability of edge from entry block not initialized
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---

With recent trunk (019dc63819be), when running the test suite on an Intel
system with autofdo installed:

Executing on host: /home/ak/gcc/obj-full/gcc/xgcc -B/home/ak/gcc/obj-full/gcc/
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c
-fdiagnostics-plain-output -O0 -pthread -fprofile-update=atomic
-fauto-profile=/home/ak/gcc/obj-full/gcc/testsuite/gcc20/afdo.val-profiler-threads-1.gcda
-DFOR_AUTOFDO_TESTING -fearly-inlining -dumpbase-ext .x02 -lm -o
/home/ak/gcc/obj-full/gcc/testsuite/gcc20/val-profiler-threads-1.x02
(timeout = 300)
spawn -ignore SIGHUP /home/ak/gcc/obj-full/gcc/xgcc
-B/home/ak/gcc/obj-full/gcc/
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c
-fdiagnostics-plain-output -O0 -pthread -fprofile-update=atomic
-fauto-profile=/home/ak/gcc/obj-full/gcc/testsuite/gcc20/afdo.val-profiler-threads-1.gcda
-DFOR_AUTOFDO_TESTING -fearly-inlining -dumpbase-ext .x02 -lm -o
/home/ak/gcc/obj-full/gcc/testsuite/gcc20/val-profiler-threads-1.x02
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c: In
function 'copy_memory':
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c:13:7:
error: probability of edge from entry block not initialized
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c:13:7:
error: probability of edge 2->4 not initialized
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c:13:7:
error: probability of edge 5->1 not initialized
during GIMPLE pass: fixup_cfg
/home/ak/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/val-profiler-threads-1.c:13:7:
internal compiler error: verify_flow_info failed
0xafb91e verify_flow_info()
../../gcc/gcc/cfghooks.cc:287
0xf0e8a7 execute_function_todo
../../gcc/gcc/passes.cc:2100
0xf0edde execute_todo
../../gcc/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1

I'm not attaching the source because it also needs the autofdo gcov file to
reproduce and the test case is already in tree.

[Bug lto/107779] Support implicit references from inline assembler to compiler symbols

2023-10-15 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107779

--- Comment #4 from Andi Kleen  ---
This whole manual annotation idea (which is equivalent to marking the symbols
global and visible, and that is what a large part of the kernel LTO patchkit
does) is dead on arrival because the kernel people already rejected it. Their
argument is that they don't need it for LLVM, so why should they be forced to
do it for GCC? In LLVM it is just done by the assembler, and it works without
any extra program changes.

Since gcc is not the only game in town anymore they have a point.

It's either heuristics or integrating the assembler.

[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #5 from Andi Kleen  ---

config/i386/i386.h:#define SLOW_BYTE_ACCESS 0

You mean it doesn't define it?

[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #2 from Andi Kleen  ---
Okay, then it doesn't understand that SHL_signed and SHR_unsigned can be
combined when one of the values came from a shorter unsigned.

[Bug middle-end/111743] New: shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Bug ID: 111743
   Summary: shifts in bit field accesses don't combine with other
shifts
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---

(not sure it's the middle-end, picked arbitrarily)

The following code

struct bf {
    unsigned a : 10, b : 20, c : 10;
};
unsigned fbc(struct bf bf) { return bf.b | (bf.c << 20); }


generates:

        movq    %rdi, %rax
        shrq    $10, %rdi
        shrq    $32, %rax
        andl    $1048575, %edi
        andl    $1023, %eax
        sall    $20, %eax
        orl     %edi, %eax
        ret

It doesn't understand that the shift right can be combined with the shift left.
Also not sure why the shift left is arithmetic (this should all be unsigned).

clang does the simplification, which ends up one instruction shorter:

        movl    %edi, %eax
        shrl    $10, %eax
        andl    $1048575, %eax          # imm = 0xFFFFF
        shrq    $12, %rdi
        andl    $1072693248, %edi       # imm = 0x3FF00000
        orl     %edi, %eax
        retq
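
To make the missing combination concrete, here is a small C sketch (not part
of the report) of the two ways to compute the bf.c << 20 term on the raw
64-bit value. It assumes the layout implied by the assembly above (b in bits
10..29, c in bits 32..41), and all function names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, read off the generated code above:
   a = bits 0..9, b = bits 10..29, c = bits 32..41. */

/* Two-step form, as in the GCC output: extract c, then shift it left. */
static uint32_t c_term_two_step(uint64_t raw)
{
    uint32_t c = (uint32_t)(raw >> 32) & 0x3ffu;
    return c << 20;
}

/* Combined form, as in the clang output: one shift by 32 - 20 = 12,
   then a mask that is already in its final position. */
static uint32_t c_term_combined(uint64_t raw)
{
    return (uint32_t)(raw >> 12) & 0x3ff00000u;
}

/* bf.b | (bf.c << 20), computed on the raw bits. */
static uint32_t fbc_raw(uint64_t raw)
{
    uint32_t b = (uint32_t)(raw >> 10) & 0xfffffu;
    return b | c_term_combined(raw);
}

/* Compare both forms on a stream of pseudo-random inputs. */
static void check_forms_agree(void)
{
    uint64_t x = 0x0123456789abcdefull;
    for (int i = 0; i < 1000; i++) {
        assert(c_term_two_step(x) == c_term_combined(x));
        x = x * 6364136223846793005ull + 1442695040888963407ull;
    }
}
```

Both forms select the same ten bits and place them at bits 20..29, so the
right shift by 32 followed by the left shift by 20 collapses into a single
right shift by 12 with an adjusted mask.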

[Bug lto/107779] New: Support implicit references from inline assembler to compiler symbols

2022-11-20 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107779

Bug ID: 107779
   Summary: Support implicit references from inline assembler to
compiler symbols
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
CC: hubicka at gcc dot gnu.org, marxin at gcc dot gnu.org,
mliska at suse dot cz
  Target Milestone: ---

Created attachment 53933
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53933&action=edit
prototype patch

So I looked into the problem the kernel people complained about: a lot of
assembler statements reference C symbols, which need externally_visible and
global for gcc LTO, otherwise they can end up in the wrong asm file and cause
missing symbols.

I came up with the attached (hackish) patch that tries to solve the problem
very partially: it parses the assembler strings and looks for anything that
could be an identifier, and then tries to mark it externally_visible.
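
As a rough illustration of that kind of heuristic, the following C sketch
collects identifier-like tokens from an assembler string. It is not taken
from the attached patch: the function name, the fixed token size and the rule
for skipping digit-led tokens are all assumptions, and a real implementation
would look each token up in the symbol table instead of storing it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy each identifier-like token of an asm string
   into out[], storing at most max tokens. Returns the number stored.
   Tokens of 64 characters or more are skipped. */
static size_t scan_asm_identifiers(const char *s, char out[][64], size_t max)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);
    while (*s && n < max) {
        unsigned char ch = (unsigned char)*s;

        if (isalpha(ch) || ch == '_') {
            /* Identifier: [A-Za-z_][A-Za-z0-9_]* */
            const char *start = s;
            while (isalnum((unsigned char)*s) || *s == '_')
                s++;
            size_t len = (size_t)(s - start);
            if (len < 64) {
                memcpy(out[n], start, len);
                out[n][len] = '\0';
                n++;
            }
        } else if (isdigit(ch)) {
            /* Numbers and local labels such as "1f": skip the whole run. */
            while (isalnum((unsigned char)*s) || *s == '_')
                s++;
        } else {
            s++;
        }
    }
    return n;
}
```

For "call my_func" this reports both "call" and "my_func". Mnemonics and
register names show up as candidates too, which is the false-positive case:
they only matter if a C symbol of the same name exists.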

It has the following open issues:

- The parsing is very approximate and doesn't handle some obscure cases.
With the approximation it's also impossible to give error messages,
but hopefully the linker takes care of that.
It also gives false positives with some assembler syntax,
but in the worst case would just lose some optimization from unnecessary
references.

- It doesn't handle the case (which happens in the kernel) that the C
declaration is after the asm statement. This could be fixed with some
more effort.

- It doesn't work for statics, which can get mangled (that's a lot of
the kernel cases).
static is a difficult problem because there could be conflicting names,
so we cannot just put it all in partition zero.

This would need some special handling in the LTO partitioning code to
create new partitions just for having unique name spaces, and then
avoid mangling. A related problem is PR50676.

It's likely possible to create situations where it's impossible to
solve, there could be circular dependencies etc. But I assume in this
case the non LTO case would fail too.

Or maybe do something with redefining symbols at the assembler level.

This one is somewhat difficult and I don't have a simple solution
currently. Unfortunately, solving the kernel issue would need a
solution for static.

[Bug lto/107014] flatten+lto fails the kernel build

2022-09-25 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Andi Kleen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #9 from Andi Kleen  ---
I suspect what happens is that it hits in some kernel initialization function.
If they don't use initcall, the LTO build can inline them all into each other
(because they are only called once), creating a single big initialization
function. With flatten that will create an extremely large function that takes
a long time to process.

I suspect any use of flatten would be better off using always_inline instead,
since that affects only a single function. This should probably be fixed
upstream in the kernel.
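
A minimal sketch of the difference between the two attributes, using made-up
function names (nothing here is taken from the kernel): flatten applies to
every call in the body of the annotated function, while always_inline applies
only to the one function that carries it.

```c
#include <assert.h>

static int step_a(int x) { return x + 1; }
static int step_b(int x) { return step_a(x) * 2; }

/* flatten: GCC tries to inline every call in this body, including the
   calls made by the callees, so the function can grow without bound
   if the callees are large. */
__attribute__((flatten))
int init_flattened(int x)
{
    return step_b(x) + step_a(x);
}

/* always_inline: only this one helper is force-inlined into its callers. */
static inline __attribute__((always_inline))
int hot_helper(int x)
{
    return x * 3;
}

/* step_b is left to the normal inlining heuristics here. */
int init_selective(int x)
{
    return hot_helper(x) + step_b(x);
}

/* Both variants only change code generation, not results. */
static void sanity_check(void)
{
    assert(init_flattened(1) == 6);
    assert(init_selective(1) == 7);
}
```

With always_inline the growth stays limited to the annotated helpers, whereas
flatten on a caller pulls in whatever LTO has already merged into its callees.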

[Bug preprocessor/45227] libcpp Makefile does not enable instrumentation

2022-01-04 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45227

--- Comment #5 from Andi Kleen  ---
I think it was the method from the info file.

But I can't quite remember. If you cannot reproduce it I guess it's ok to
close. Maybe I made some mistake.

[Bug middle-end/99578] gcc-11 -Warray-bounds or -Wstringop-overread warning when accessing a pointer from integer literal

2021-05-01 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578

Andi Kleen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #12 from Andi Kleen  ---
It looks to me like separate bugs are mixed together here.

For example I looked at the preallocate_pmd warning again and I don't think
there is any union there. Also I noticed that when I replace the *foo[N] with
**foo it disappears. So I think that is something different.

So there seem to be instances where such warnings happen without union members.
Perhaps that one (and perhaps some others) needs to be reanalyzed.

I also looked at the intel_pm.c and I think that one is a real kernel bug,
where the field accessed is really too small. I'll submit a patch for that.

[Bug lto/99828] inlining failed in call to ‘always_inline’ ‘memcpy’: --param max-inline-insns-auto limit reached

2021-03-30 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99828

--- Comment #3 from Andi Kleen  ---
So what do you want to fix in the kernel?

Use a wrapper for taking the address of memcpy?
(I hope nothing in gcc would remove such a wrapper)
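
For reference, a wrapper of that kind could look roughly like the C sketch
below. The names are hypothetical and the noinline attribute is only an
assumption about how one might try to keep the wrapper out of line. Whether
GCC preserves it in all cases is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: an ordinary function whose address can be taken,
   so that no pointer to the always_inline memcpy itself is needed. */
__attribute__((noinline))
static void *memcpy_wrapper(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Callers that need a function pointer get the wrapper's address. */
static copy_fn get_copy_fn(void)
{
    return memcpy_wrapper;
}

static void copy_via_pointer(void *dst, const void *src, size_t n)
{
    copy_fn fn = get_copy_fn();

    assert(fn != NULL);
    fn(dst, src, n);
}
```

Direct calls to memcpy stay as they are. Only the places that take its
address would switch to the wrapper.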