On 10/7/20 1:23 PM, Jan Hubicka wrote:
During WPA we merge 2 functions where one is built with -O3
-fprofile-use while the other one uses -O0. We end up with:
prevailing node:
BitwiseCast (const double aFrom)
{
long unsigned int temp;
long unsigned int D.4528;
<bb 2> :
BitwiseCast (aFrom_2(D), &temp);
_4 = temp;
temp ={v} {CLOBBER};
<bb 3> :
<L0>:
return _4;
}
merged node:
BitwiseCast (const double aFrom)
{
<bb 2> [count: 1509]:
_3 = VIEW_CONVERT_EXPR<long unsigned int>(aFrom_2(D));
return _3;
}
What happens is that we copy dst->count = src->count.ipa() but
we skip merging of BBs (a different CFG). Then we end up in situation
where profile of a ENTRY_BB_FOR_FN == uninitialized while
dst->count.quality is precise.
In usual case there of diverging CFGs both functions have non-trivial
profile, one is ipa and other is guessed (so count.initialized_p returns
false)
In this case one TU is compiled with -O0, so we miss profile at all
(if I see correctly).
In this case we do want to copy the count which will lead in scaling the
local profile by the known entry frequency. This is arranged by
fixup_cfg that does this kind of transitions.
All right, here we end up with
profile_count num = node->count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
(gdb) p num.debug()
1509 (precise)
$7 = void
(gdb) p den.debug()
1509 (precise)
$8 = void
bool scale = num.initialized_p () && !(num == den);
and so scale == false;
Thus no scaling happens and e.g. BB 2 misses profile (due to -O0):
(gdb) p debug_bb(bb)
<bb 2> :
BitwiseCast (aFrom_2(D), &temp);
_4 = temp;
temp ={v} {CLOBBER};
and we later ICE here:
Breakpoint 1, profile_count::to_frequency (this=0x7ffff73f8058,
fun=0x7ffff73ee000) at ../../gcc/profile-count.c:273
273 gcc_assert (REG_BR_PROB_BASE == BB_FREQ_MAX
(gdb) bt
#0 profile_count::to_frequency (this=0x7ffff73f8058, fun=0x7ffff73ee000) at
../../gcc/profile-count.c:273
#1 0x0000000000d72c4d in regstat_bb_compute_ri (bb=0x7ffff73f8000,
live=0x2839f60) at ../../gcc/regstat.c:200
#2 0x0000000000d73005 in regstat_compute_ri () at ../../gcc/regstat.c:253
#3 0x0000000000be1a7d in ira (f=<optimized out>) at ../../gcc/ira.c:5294
#4 (anonymous namespace)::pass_ira::execute (this=<optimized out>) at
../../gcc/ira.c:5666
#5 0x0000000000d23b45 in execute_one_pass (pass=0x26ef850) at
../../gcc/passes.c:2502
#6 0x0000000000d23e6a in execute_pass_list_1 (pass=0x26ef850) at
../../gcc/passes.c:2590
#7 0x0000000000d23e9b in execute_pass_list_1 (pass=0x26ee5d0) at
../../gcc/passes.c:2591
#8 0x0000000000d23ef3 in execute_pass_list (fn=0x7ffff73ee000, pass=0x26ea890)
at ../../gcc/passes.c:2601
#9 0x000000000090a7d7 in cgraph_node::expand (this=0x7ffff73ede10) at
../../gcc/cgraphunit.c:2300
#10 0x000000000090b232 in output_in_order () at ../../gcc/cgraphunit.c:2578
#11 0x000000000090b869 in symbol_table::compile (this=0x7ffff75d6100) at
../../gcc/cgraphunit.c:2819
#12 0x000000000083df7d in lto_main () at ../../gcc/lto/lto.c:653
#13 0x0000000000deda92 in compile_file () at ../../gcc/toplev.c:458
#14 0x0000000000df0c7a in do_compile () at ../../gcc/toplev.c:2278
#15 0x0000000000df0f63 in toplev::main (this=0x7fffffffd9fe, argc=19,
argv=0x26c26f0) at ../../gcc/toplev.c:2417
#16 0x00000000008196af in main (argc=19, argv=0x7fffffffdb08) at
../../gcc/main.c:39
So what do you suggest to fix it :P?
Is this the bug that riggers Firefox ICE?
What goes wrong later?
Honza
Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Ready to be installed?
Thanks,
Martin
gcc/ChangeLog:
PR ipa/97295
* ipa-utils.c (ipa_merge_profiles):
---
gcc/ipa-utils.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 23e7f714306..fe2149069be 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -437,10 +437,7 @@ ipa_merge_profiles (struct cgraph_node *dst,
else if (dst->count.ipa ().initialized_p ())
;
else if (src->count.ipa ().initialized_p ())
- {
- copy_counts = true;
- dst->count = src->count.ipa ();
- }
+ copy_counts = true;
/* If no updating needed return early. */
if (dst->count == orig_count)
@@ -620,6 +617,9 @@ ipa_merge_profiles (struct cgraph_node *dst,
bool dstscale = !copy_counts
&& dstnum.initialized_p () && !(dstnum == dstden);
+ if (copy_counts)
+ dst->count = src->count.ipa ();
+
/* TODO: merge also statement histograms. */
FOR_ALL_BB_FN (srcbb, srccfun)
{
--
2.28.0