[Bug lto/92599] ICE in speculative_call_info, at cgraph.c:1142
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92599 --- Comment #4 from Xiong Hu XS Luo --- (In reply to Xiong Hu XS Luo from comment #3) > (In reply to Martin Liška from comment #2) > > So we ICE at the end of cgraph_edge::speculative_call_info: > > (gdb) p ref > > $4 = > > > > (gdb) p e > > $5 = > "ConvertASEToModelSurfaces.constprop"/113> -> > "NumSurfaces"/115>)> > > (gdb) p e2 > > $6 = > "ConvertASEToModelSurfaces.constprop"/113> -> )> > > > > As seen the edge is within idRenderModelStatic class. > > I bet the problem is the ODR warning message, the class is polymorphic in > > one TU, and normal class in another one. > > Seems side effect in indirect->set_call_stmt. > The "ref" is changed in recursive call line "indirect->set_call_stmt > (new_stmt, false)", it will call resolve_speculation to redirect one of it's > polymorphic call > ConvertASEToModelSurfaces/2 => FindMaterial/45, ref->remove_reference will > the related reference, then ref will be pointed to another reference, then > the followed "ref->stmt = new_stmt" will update wrong stmt to the original > reference. > > p *ref > $378 = {referring = 0x3fffb55f10e0, referred = 0x3fffb55f09d8, stmt = > 0x3fffb7f71440, lto_stmt_uid = 10, referred_index = 1, use = IPA_REF_ADDR, > speculative = 1 > > After returning from indirect->set_call_stmt (new_stmt, false): > > p *ref > $380 = {referring = 0x3fffb55f10e0, referred = 0x3fffb55f0ca8, stmt = > 0x3fffb7f713b0, lto_stmt_uid = 6, referred_index = 1, use = IPA_REF_ADDR, > speculative = 1} > > (gdb) pedge direct > $381 = 0x3fffb5560498 "ConvertASEToModelSurfaces.constprop/113" > $382 = 0x3fffb53d78a0 "FindMaterial/116" > (gdb) ps new_stmt > FindMaterial (_6); > ps ref->stmt > # .MEM = VDEF <.MEM> > OBJ_TYPE_REF(_5;(struct idRenderModelStatic)this_3(D)->1) (this_3(D)); > > > How about patch candidate as below? Or check the ref->speculative is true > after return from indirect->set_call_stmt? 
It should be indirect->speculative rather than "ref->speculative", since the speculative flag of the indirect edge has been cleared by now. Or is lto_stmt_uid more reliable?

(gdb) p *indirect
$385 = {count = {static n_bits = 61, static max_count = 2305843009213693950,
    static uninitialized_count = 2305843009213693951, m_val = 406913472214181285,
    m_quality = AFDO}, caller = 0xa5a5a5a5a5a5a5a5, callee = 0xa5a5a5a5a5a5a5a5,
  prev_caller = 0xa5a5a5a5a5a5a5a5, next_caller = 0xa5a5a5a5a5a5a5a5,
  prev_callee = 0xa5a5a5a5a5a5a5a5, next_callee = 0xa5a5a5a5a5a5a5a5,
  call_stmt = 0xa5a5a5a5a5a5a5a5, indirect_info = 0xa5a5a5a5a5a5a5a5,
  aux = 0xa5a5a5a5a5a5a5a5, inline_failed = 2779096485, lto_stmt_uid = 2779096485,
  indirect_inlining_edge = 1, indirect_unknown_callee = 0,
  call_stmt_cannot_inline_p = 1, can_throw_external = 0, speculative = 0,
  in_polymorphic_cdtor = 1, m_uid = -1515870811, m_summary_id = -1515870811}
[Bug lto/92599] ICE in speculative_call_info, at cgraph.c:1142
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92599 --- Comment #3 from Xiong Hu XS Luo --- (In reply to Martin Liška from comment #2) > So we ICE at the end of cgraph_edge::speculative_call_info: > (gdb) p ref > $4 = > > (gdb) p e > $5 = "ConvertASEToModelSurfaces.constprop"/113> -> "NumSurfaces"/115>)> > (gdb) p e2 > $6 = "ConvertASEToModelSurfaces.constprop"/113> -> )> > > As seen the edge is within idRenderModelStatic class. > I bet the problem is the ODR warning message, the class is polymorphic in > one TU, and normal class in another one. Seems side effect in indirect->set_call_stmt. The "ref" is changed in recursive call line "indirect->set_call_stmt (new_stmt, false)", it will call resolve_speculation to redirect one of it's polymorphic call ConvertASEToModelSurfaces/2 => FindMaterial/45, ref->remove_reference will the related reference, then ref will be pointed to another reference, then the followed "ref->stmt = new_stmt" will update wrong stmt to the original reference. p *ref $378 = {referring = 0x3fffb55f10e0, referred = 0x3fffb55f09d8, stmt = 0x3fffb7f71440, lto_stmt_uid = 10, referred_index = 1, use = IPA_REF_ADDR, speculative = 1 After returning from indirect->set_call_stmt (new_stmt, false): p *ref $380 = {referring = 0x3fffb55f10e0, referred = 0x3fffb55f0ca8, stmt = 0x3fffb7f713b0, lto_stmt_uid = 6, referred_index = 1, use = IPA_REF_ADDR, speculative = 1} (gdb) pedge direct $381 = 0x3fffb5560498 "ConvertASEToModelSurfaces.constprop/113" $382 = 0x3fffb53d78a0 "FindMaterial/116" (gdb) ps new_stmt FindMaterial (_6); ps ref->stmt # .MEM = VDEF <.MEM> OBJ_TYPE_REF(_5;(struct idRenderModelStatic)this_3(D)->1) (this_3(D)); How about patch candidate as below? Or check the ref->speculative is true after return from indirect->set_call_stmt? 
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index b75430f3f3a..65b6f93c3fe 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -793,7 +793,8 @@ cgraph_edge::set_call_stmt (gcall *new_stmt, bool update_speculative)
       speculative_call_info (direct, indirect, ref);
       direct->set_call_stmt (new_stmt, false);
       indirect->set_call_stmt (new_stmt, false);
-      ref->stmt = new_stmt;
+      if (ref->lto_stmt_uid == direct->lto_stmt_uid)
+        ref->stmt = new_stmt;
       return;
     }
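The stale-pointer effect described in comment #3 can be modelled outside GCC. The sketch below is only an illustration under stated assumptions: struct ref, struct ref_list, remove_ref and update_stmt_checked are hypothetical names, not GCC's ipa_ref API, and it assumes that removing a reference moves the last list element into the freed slot, which would explain why the same ref pointer shows lto_stmt_uid 10 before the call and 6 after it.

```c
#include <stddef.h>

/* Simplified stand-in for an IPA reference; NOT the real ipa_ref layout. */
struct ref {
  unsigned lto_stmt_uid;  /* identifies the statement the reference belongs to */
  int stmt;               /* stand-in for the gimple statement pointer */
};

/* Hypothetical container holding the references of one caller. */
struct ref_list {
  struct ref items[8];
  size_t count;
};

/* Remove items[idx] by moving the last element into its slot (assumed
   removal strategy).  A pointer to items[idx] held by a caller stays
   valid as memory, but now names a different reference.  */
static void remove_ref(struct ref_list *list, size_t idx)
{
  list->count--;
  if (idx != list->count)
    list->items[idx] = list->items[list->count];
}

/* Update the statement only if the slot still holds the reference the
   caller expects; this mirrors the lto_stmt_uid check in the candidate
   patch.  Returns 1 if the update happened, 0 otherwise.  */
static int update_stmt_checked(struct ref *r, unsigned expected_uid, int new_stmt)
{
  if (r->lto_stmt_uid != expected_uid)
    return 0;  /* the slot was reused, leave it alone */
  r->stmt = new_stmt;
  return 1;
}
```

With two references whose uids are 10 and 6, a pointer taken to the first one reads uid 6 after remove_ref (&list, 0), and the checked update refuses to overwrite its statement.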
[Bug ipa/92133] Support multi versioning on self recursive function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92133

--- Comment #10 from Xiong Hu XS Luo ---
(In reply to Feng Xue from comment #9)
> Ok. For any followups on this, I'll create new tracker.

It seems that "--param ipa-cp-eval-threshold=0 --param large-unit-insns=2 -fno-inline" is required to get the recursive clones of digits_2?
[Bug testsuite/92398] [10 regression] error in update of gcc.target/powerpc/pr72804.c in r277872
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92398

--- Comment #9 from Xiong Hu XS Luo ---
(In reply to seurer from comment #8)
> The changes in r278890 fix the earlier problems but introduce new ones:
>
> failures in r278889 (not seen in r278890):
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-times not 4
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-times std 2
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-not stxvd2x
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-not xxpermdi
>
> new failures in r278890:
>
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-times \\mnot\\M 2
> FAIL: gcc.target/powerpc/pr72804.c scan-assembler-not \\mlxvd2x\\M
>
> saw this on both power 8 (BE) and power 9 (LE).

Spaces are strictly required in dg directives. When I copied my patch to the svn repo and ran "svn patch", it conflicted with the latest pr72804.c, and the typo was introduced when I copied the line by hand. I will commit the following as obvious:

diff --git a/gcc/testsuite/gcc.target/powerpc/pr72804.c b/gcc/testsuite/gcc.target/powerpc/pr72804.c
index d424bccd5c3..38dff549210 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr72804.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr72804.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { lp64 } } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx"} */
+/* { dg-options "-O2 -mvsx" } */
[Bug middle-end/71509] Bitfield causes load hit store with larger store than load
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71509 --- Comment #10 from Xiong Hu XS Luo --- (In reply to Andrew Pinski from comment #9) > (In reply to rguent...@suse.de from comment #8) > > On Fri, 15 Mar 2019, luoxhu at cn dot ibm.com wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71509 > > > > > > Xiong Hu XS Luo changed: > > > > > >What|Removed |Added > > > > > > CC||guojiufu at gcc dot > > > gnu.org, > > >||rguenth at gcc dot > > > gnu.org > > > > > > --- Comment #7 from Xiong Hu XS Luo --- > > > Hi Richard, trying to figure out the issue recently, but get some > > > questions > > > need your help. How is the status of the "proposed simple lowering of > > > bitfield > > > accesses on GIMPLE", please? > > > > There are finished patches in a few variants but all of them show issues > > in the testsuite, mostly around missed optimizations IIRC. It has been > > quite some time since I last looked at this but IIRC Andrew Pinski said > > he's got sth for GCC 10 so you might want to ask him. > > Yes I do, I am planing on submitting the new pass once GCC 10 has opened up. Hi Pinski, is your patch upstreamed to GCC 10 please? Thanks.
[Bug middle-end/26241] [8/9 Regression] None of the IPA passes are documented in passes.texi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26241 --- Comment #24 from Xiong Hu XS Luo --- Can this be closed, since no backport is needed?
[Bug testsuite/92398] [10 regression] error in update of gcc.target/powerpc/pr72804.c in r277872
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92398

--- Comment #7 from Xiong Hu XS Luo ---
The first broken revision on Power8 BE is r265398:

commit 171920e88fed13ed26336ca884123eff37176c36 (HEAD, refs/bisect/bad)
Author: segher
Date:   Mon Oct 22 20:23:39 2018 +

    combine: Do not combine moves from hard registers

    On most targets every function starts with moves from the parameter
    passing (hard) registers into pseudos.  Similarly, after every call
    there is a move from the return register into a pseudo.  These moves
    usually combine with later instructions (leaving pretty much the same
    instruction, just with a hard reg instead of a pseudo).
[Bug testsuite/92398] [10 regression] error in update of gcc.target/powerpc/pr72804.c in r277872
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92398

--- Comment #6 from Xiong Hu XS Luo ---
Power9 generates different code from Power8 LE because of the register costs in sched1. On P9, r120 is used by insn 8, which is a memory instruction, while on P8 r120 is used by insn 13, which is not. This leads to different register cost values in ira-costs.c:record_operand_costs(). The 2000 below is the BASE_REGS cost of r120 on P9; it is 0 on P8:

p op_costs[1][0]
$453 = {mem_cost = 4000, cost = {2000}}

sched1, for Power9:

P9 r120 costs: BASE_REGS:2000 GENERAL_REGS:2000 FLOAT_REGS:0 ALTIVEC_REGS:0 VSX_REGS:0 GEN_OR_FLOAT_REGS:12000 GEN_OR_VSX_REGS:12000 MEM:8000

;; basic block 2, loop depth 0
;;  pred:       ENTRY
    5: NOTE_INSN_BASIC_BLOCK 2
   11: r121:DI=%3:DI
      REG_DEAD %3:DI
    2: NOTE_INSN_DELETED
   12: r122:TI=%4:TI
      REG_DEAD %4:TI
    3: NOTE_INSN_DELETED
    4: NOTE_INSN_FUNCTION_BEG
    7: r120:TI=~r122:TI
      REG_DEAD r122:TI
    8: [r121:DI]=r120:TI
      REG_DEAD r121:DI
      REG_DEAD r120:TI
;;  succ:       EXIT

sched1, for Power8 LE:

r120 costs: BASE_REGS:0 GENERAL_REGS:0 FLOAT_REGS:0 ALTIVEC_REGS:0 VSX_REGS:0 GEN_OR_FLOAT_REGS:16000 GEN_OR_VSX_REGS:16000 MEM:5000

    1: NOTE_INSN_DELETED
    5: NOTE_INSN_BASIC_BLOCK 2
    2: NOTE_INSN_DELETED
    3: NOTE_INSN_DELETED
    4: NOTE_INSN_FUNCTION_BEG
   12: r122:TI=%4:TI
      REG_DEAD %4:TI
   11: r121:DI=%3:DI
      REG_DEAD %3:DI
    7: r120:TI=~r122:TI
      REG_DEAD r122:TI
   13: r123:TI=r120:TI<-<0x40
      REG_DEAD r120:TI
   14: [r121:DI]=r123:TI<-<0x40
      REG_DEAD r123:TI
      REG_DEAD r121:DI
   15: NOTE_INSN_DELETED
[Bug testsuite/92398] [10 regression] error in update of gcc.target/powerpc/pr72804.c in r277872
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92398

--- Comment #3 from Xiong Hu XS Luo ---
Power8 BE generates:

L.bar:
.LFB1:
        .cfi_startproc
        mtvsrd 0,4
        mtvsrd 1,5
        xxpermdi 12,0,1,0
        xxlnor 0,12,12
        stxvd2x 0,0,3
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .cfi_endproc
.LFE1:

The source code below is the same for all targets:

void
bar (__int128_t *dst, __int128_t src)
{
  *dst = ~src;
}
[Bug testsuite/92398] [10 regression] error in update of gcc.target/powerpc/pr72804.c in r277872
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92398

Xiong Hu XS Luo changed:

           What    |Removed                     |Added
                 CC|                            |luoxhu at cn dot ibm.com

--- Comment #2 from Xiong Hu XS Luo ---
Sorry, I changed the test case pr72804.c by mistake: it has nothing to do with the options -fno-inline-functions --param max-inline-insns-single-O2=200. The test appears in the category "New tests that FAIL (6 tests):".

I checked the test result history:
For Power9, it has failed since r268152 of Jan 22, 2019: https://gcc.gnu.org/ml/gcc-testresults/2019-01/msg02232.html
For Power7, it has failed since r267319 of Dec 21, 2018: https://gcc.gnu.org/ml/gcc-testresults/2018-12/msg02500.html

Peter added the test case pr72804.c in r251153 of Aug 17, 2017 to generate better code for -mvsx-timode, but the peephole is restricted to !BYTES_BIG_ENDIAN and !TARGET_P9_VECTOR:

+;; Peepholes to catch loads and stores for TImode if TImode landed in
+;; GPR registers on a little endian system.
+(define_peephole2
+  [(set (match_operand:VSX_TI 0 "int_reg_operand")
+        (rotate:VSX_TI (match_operand:VSX_TI 1 "memory_operand")
+                       (const_int 64)))
+   (set (match_operand:VSX_TI 2 "int_reg_operand")
+        (rotate:VSX_TI (match_dup 0)
+                       (const_int 64)))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && TARGET_VSX_TIMODE && !TARGET_P9_VECTOR
+   && (rtx_equal_p (operands[0], operands[2])
+       || peep2_reg_dead_p (2, operands[0]))"
+   [(set (match_dup 2) (match_dup 1))])

I am not sure whether these failures should be left as won't-fix, or whether vsx.md needs to be updated to generate better code.

On Power9, the expected code is:

0020 :
  20: f8 20 84 7c   not r4,r4
  24: f8 28 a5 7c   not r5,r5
  28: 00 00 83 f8   std r4,0(r3)
  2c: 08 00 a3 f8   std r5,8(r3)
  30: 20 00 80 4e   blr

But the actual code is:

0020 :
  20: 66 23 05 7c   mtvsrdd vs0,r5,r4
  24: 10 05 00 f0   xxlnor vs0,vs0,vs0
  28: 05 00 03 f4   stxv vs0,0(r3)
  2c: 20 00 80 4e   blr
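To make the expected Power9 sequence (not, not, std, std) more concrete, here is a minimal C sketch of the same computation done on two 64-bit halves. bar is the test function from comment #3; bar_by_halves is a hypothetical helper written only for illustration. It relies on GCC's __int128 extension and says nothing about which sequence the compiler will actually choose.

```c
#include <stdint.h>

/* The test case from the report: store the bitwise complement of a
   128-bit value.  */
void bar(__int128 *dst, __int128 src)
{
  *dst = ~src;
}

/* Hypothetical equivalent working on two 64-bit halves, which is what
   the expected GPR sequence does: complement each half, then store
   both halves.  */
void bar_by_halves(__int128 *dst, __int128 src)
{
  unsigned __int128 u = (unsigned __int128) src;
  uint64_t lo = ~(uint64_t) u;          /* complement of the low 64 bits */
  uint64_t hi = ~(uint64_t) (u >> 64);  /* complement of the high 64 bits */
  *dst = (__int128) (((unsigned __int128) hi << 64) | lo);
}
```

Both functions store the same value for any input, so the difference between the expected and the actual assembly is purely about whether the value travels through GPRs or through a VSX register.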
[Bug other/92090] [10 regression] ICE in gcc.dg/atomic/c11-atomic-exec-5.c starting with r276469
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92090 --- Comment #10 from Xiong Hu XS Luo --- (In reply to Xiong Hu XS Luo from comment #9) > (In reply to Segher Boessenkool from comment #7) > > LRA creates > > > > ;; Insn is not within a basic block > > (insn 7037 0 0 (set (reg:PTI 3703) > > (const_wide_int 0x3ff0)) -1 > > (nil)) > > > > but that is not a valid insn. > > > > This started as > > > > (insn 3756 3755 3757 363 (set (reg:TI 2388) > > (const_wide_int 0x3ff0)) > > "c11-atomic-exec-5.c":406:1 1179 {vsx_movti_64bit} > > (expr_list:REG_EQUIV (const_wide_int > > 0x3ff0) > > (nil))) > > > > (insn 3758 3757 3759 363 (set (reg:PTI 2389) > > (subreg:PTI (reg:TI 2388) 0)) "c11-atomic-exec-5.c":406:1 623 > > {*movpti_ppc64} > > (expr_list:REG_EQUIV (const_wide_int > > 0x3ff0) > > (nil))) > > > > which is fine. But we have no insns (in the md) to load an immediate into > > a PTI reg. > > > This instruction is generated in function curr_insn_transform of > lra-constraints.c, maybe need add a target hook to avoid it if subst is > immediate and old is PTI register? 
>
>   for (i = 0; i < n_operands; i++)
>     {
>       rtx op, subst, old;
>       bool op_change_p = false;
>
>       if (curr_static_id->operand[i].is_operator)
>         continue;
>
>       old = op = *curr_id->operand_loc[i];
>       if (GET_CODE (old) == SUBREG)
>         old = SUBREG_REG (old);
>       subst = get_equiv_with_elimination (old, curr_insn);
>       original_subreg_reg_mode[i] = VOIDmode;
>       equiv_substition_p[i] = false;
>       if (subst != old)
>         {
>           equiv_substition_p[i] = true;
>           subst = copy_rtx (subst);
>           lra_assert (REG_P (old));
>           if (GET_CODE (op) != SUBREG)
>             *curr_id->operand_loc[i] = subst;
>           else
>             {
>               SUBREG_REG (op) = subst;
>               if (GET_MODE (subst) == VOIDmode)
>                 original_subreg_reg_mode[i] = GET_MODE (old);
>             }
>           if (lra_dump_file != NULL)
>             {
>               fprintf (lra_dump_file,
>                        "Changing pseudo %d in operand %i of insn %u on equiv ",
>                        REGNO (old), i, INSN_UID (curr_insn));
>               dump_value_slim (lra_dump_file, subst, 1);
>               fprintf (lra_dump_file, "\n");
>             }

This could fix the ICE, but I am not sure whether it is reasonable:

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 0db6d3151cd..325904ac473 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3886,7 +3886,9 @@ curr_insn_transform (bool check_only_p)
       subst = get_equiv_with_elimination (old, curr_insn);
       original_subreg_reg_mode[i] = VOIDmode;
       equiv_substition_p[i] = false;
-      if (subst != old)
+      if (subst != old
+          && !(GET_MODE (old) == E_PTImode && GET_CODE (old) == REG
+               && GET_CODE (subst) == CONST_WIDE_INT))
         {
           equiv_substition_p[i] = true;
           subst = copy_rtx (subst);
[Bug other/92090] [10 regression] ICE in gcc.dg/atomic/c11-atomic-exec-5.c starting with r276469
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92090

Xiong Hu XS Luo changed:

           What    |Removed                     |Added
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #9 from Xiong Hu XS Luo ---
(In reply to Segher Boessenkool from comment #7)
> LRA creates
>
> ;; Insn is not within a basic block
> (insn 7037 0 0 (set (reg:PTI 3703)
>         (const_wide_int 0x3ff0)) -1
>      (nil))
>
> but that is not a valid insn.
>
> This started as
>
> (insn 3756 3755 3757 363 (set (reg:TI 2388)
>         (const_wide_int 0x3ff0))
> "c11-atomic-exec-5.c":406:1 1179 {vsx_movti_64bit}
>      (expr_list:REG_EQUIV (const_wide_int 0x3ff0)
>         (nil)))
>
> (insn 3758 3757 3759 363 (set (reg:PTI 2389)
>         (subreg:PTI (reg:TI 2388) 0)) "c11-atomic-exec-5.c":406:1 623
> {*movpti_ppc64}
>      (expr_list:REG_EQUIV (const_wide_int 0x3ff0)
>         (nil)))
>
> which is fine. But we have no insns (in the md) to load an immediate into
> a PTI reg.

This instruction is generated in function curr_insn_transform of lra-constraints.c. Maybe a target hook needs to be added to avoid the substitution when subst is an immediate and old is a PTI register?

  for (i = 0; i < n_operands; i++)
    {
      rtx op, subst, old;
      bool op_change_p = false;

      if (curr_static_id->operand[i].is_operator)
        continue;

      old = op = *curr_id->operand_loc[i];
      if (GET_CODE (old) == SUBREG)
        old = SUBREG_REG (old);
      subst = get_equiv_with_elimination (old, curr_insn);
      original_subreg_reg_mode[i] = VOIDmode;
      equiv_substition_p[i] = false;
      if (subst != old)
        {
          equiv_substition_p[i] = true;
          subst = copy_rtx (subst);
          lra_assert (REG_P (old));
          if (GET_CODE (op) != SUBREG)
            *curr_id->operand_loc[i] = subst;
          else
            {
              SUBREG_REG (op) = subst;
              if (GET_MODE (subst) == VOIDmode)
                original_subreg_reg_mode[i] = GET_MODE (old);
            }
          if (lra_dump_file != NULL)
            {
              fprintf (lra_dump_file,
                       "Changing pseudo %d in operand %i of insn %u on equiv ",
                       REGNO (old), i, INSN_UID (curr_insn));
              dump_value_slim (lra_dump_file, subst, 1);
              fprintf (lra_dump_file, "\n");
            }
[Bug other/92090] [10 regression] ICE in gcc.dg/atomic/c11-atomic-exec-5.c starting with r276469
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92090 Xiong Hu XS Luo changed: What|Removed |Added CC||seurer at gcc dot gnu.org --- Comment #3 from Xiong Hu XS Luo --- (In reply to seurer from comment #0) > Tried 276469 > > make -k check-gcc RUNTESTFLAGS=atomic.exp=gcc.dg/atomic/c11-atomic-exec-5.c > > FAIL: gcc.dg/atomic/c11-atomic-exec-5.c -Os (internal compiler error) > FAIL: gcc.dg/atomic/c11-atomic-exec-5.c -Os (test for excess errors) > > Executing on host: /home/seurer/gcc/build/gcc-test2/gcc/xgcc > -B/home/seurer/gcc/build/gcc-test2/gcc/ > /home/seurer/gcc/gcc-test2/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c > -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/ > -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/. > libs -latomic -fno-diagnostics-show-caret > -fno-diagnostics-show-line-numbers -fdiagnostics-color=never-Os > -std=c11 -pedantic-errors -pthread -U_POSIX_C_SOURCE > -D_POSIX_C_SOURCE=200809L -lm -o ./c11-atomic-exec-5.exe(timeout = 600) > spawn -ignore SIGHUP /home/seurer/gcc/build/gcc-test2/gcc/xgcc > -B/home/seurer/gcc/build/gcc-test2/gcc/ > /home/seurer/gcc/gcc-test2/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c > -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/ > -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/. 
> libs -latomic -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -Os -std=c11 -pedantic-errors -pthread
> -U_POSIX_C_SOURCE -D_POSIX_C_SOURCE=200809L -lm -o ./c11-atomic-exec-5.exe
> during RTL pass: reload
> /home/seurer/gcc/gcc-test2/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c:
> In function 'main':
> /home/seurer/gcc/gcc-test2/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c:
> 549:1: internal compiler error: in lra_set_insn_recog_data, at lra.c:995
> 0x108655af lra_set_insn_recog_data(rtx_insn*)
> /home/seurer/gcc/gcc-test2/gcc/lra.c:993
> 0x10869117 lra_get_insn_recog_data
> /home/seurer/gcc/gcc-test2/gcc/lra-int.h:488
> 0x10869117 remove_scratches_1
> /home/seurer/gcc/gcc-test2/gcc/lra.c:2053
> 0x1086921b lra_emit_move(rtx_def*, rtx_def*)
> /home/seurer/gcc/gcc-test2/gcc/lra.c:503
> 0x108861f7 curr_insn_transform
> /home/seurer/gcc/gcc-test2/gcc/lra-constraints.c:4397
> 0x1088845f lra_constraints(bool)
> /home/seurer/gcc/gcc-test2/gcc/lra-constraints.c:4994
> 0x1086992f lra(_IO_FILE*)
> /home/seurer/gcc/gcc-test2/gcc/lra.c:2432
> 0x10804d6b do_reload
> /home/seurer/gcc/gcc-test2/gcc/ira.c:5511
> 0x10804d6b execute
> /home/seurer/gcc/gcc-test2/gcc/ira.c:5697

The ICE does not reproduce on P8 LE or P9, but the pr79439-1.c and vsx-builtin-7.c failures do. They are caused by r276469 enabling inline-functions at O2 by default: small functions are now inlined, so the test cases need to be updated for the different instruction counts. I will send a patch if @seurer confirms that the ICE no longer occurs.
[Bug ipa/92074] [10 regression] 26% performance regression on Spec2017 548.exchange2_r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074

--- Comment #3 from Xiong Hu XS Luo ---
(In reply to Jan Hubicka from comment #2)
> The regression is because we now inline covered into digits2:
>
> IPA function summary for digits_2/29 inlinable
> global time: 1553.078985
> self size: 1295
> global size: 1295
> min size: 0
> self stack: 261
> global stack:261
> size:981.00, time:1505.442572
> size:3.00, time:1.999121, executed if:(not inlined)
> size:0.50, time:0.50, executed if:(not inlined), nonconst
> if:(op0[ref offset: 0] changed) && (not inlined)
> size:210.50, time:27.456610, nonconst if:(op0[ref offset: 0]
> changed)
> size:21.00, time:3.795164, executed if:(op0[ref offset: 0] == 5)
> size:6.00, time:0.334389, executed if:(op0[ref offset: 0] != 8)
> size:1.00, time:0.033237, executed if:(op0[ref offset: 0] != 8),
> nonconst if:(op0[ref offset: 0] changed) && (op0[ref offset: 0] != 8)
> size:66.00, time:13.130882, executed if:(op0[ref offset: 0] == 8)
> loop iterations:(op0[ref offset: 0] changed)
> calls:
> digits_2/29 function not considered for inlining
> loop depth: 9 freq:0.03 size: 2 time: 11callee size:647 stack:261
> predicate: (op0[ref offset: 0] != 8)
> op0 is compile time invariant
> covered.constprop/93 function not considered for inlining
> loop depth: 9 freq:0.00 size: 4 time: 13callee size:214 stack:1472
> predicate: (op0[ref offset: 0] == 8)
> op0 is compile time invariant
> op1 is compile time invariant
>
> digits_2 is quite deeply recursive and inlining quite expensive function
> "covered" does not help.

Hi Honza, I am analyzing the recursive call digits_2(int k) in exchange2. This is not related to the current PR, sorry for the distraction.

In Fortran, k is passed by reference instead of by value. The new IPA-SRA could do the SRA and convert it to pass by value with some workaround, but ipa-sra runs after ipa-cp, so ipa-cp is not able to make use of the SRA results in the WPA stage.
digits_2 consumes most of the run time, and its input parameter k increases from 1 to 9. If the recursive call is manually converted into non-recursive calls like:

case(1)
  call digits_2_1();
...
case(9)
  call digits_2_9();

the performance goes up by about 60%. So there may be ways to do this kind of optimization automatically:

1. Enable profiling with value ranges and probabilities, recording that the input parameter k is in [1, 9] 90% of the time and in ~[1, 9] 10% of the time. Then ipa-cp and ipa-sra could do recursive constant propagation for digits_2 to generate digits_2.constprop1, digits_2.constprop2, etc. This would be a combined optimization of ipa-profile, ipa-cp and ipa-sra. It would be complicated, as ipa-cp does not yet support recursive constant propagation, or propagation of pass-by-reference values with operations (like *(&k)+1).

2. Or use an independent pass (I am not sure whether one already exists in current GCC) to convert HOT recursive calls into non-recursive calls, as in the manual approach, so that ipa-cp could then do constant propagation as usual.

Any suggestions about this, please? Thanks.

>
> This can be solved by --param inline-heuristics-hint-percent=600
> the current default of 1600 is way too high and I scheduled some benchmarks
> to tune it down but unfortunately our LNT benchmarking is down currently. (I
> would like to see it reduced to even lower value if polyhedron and SPEC
> testing is happy about that)
>
> Generally it would be nice if inliner understood that inlining into self
> recursive functions on the path that is not going to recursion may be
> harmful. This we do not model and thus this works/does not work sort of
> randomly.
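The manual conversion of digits_2 described in this comment can be sketched in C. This is only a toy model under stated assumptions: walk, walk_1 to walk_4, walk_end and walk_dispatch are hypothetical names, the recursion depth is 4 instead of 9, and the loop body is a placeholder rather than the exchange2 code. The generic version takes k by reference, as Fortran does, and each clone has its depth baked in as a constant.

```c
#define MAX_DEPTH 4

/* Generic recursive version: k is passed by reference, so the callee
   sees *k rather than a compile-time constant.  */
static long walk(const int *k)
{
  if (*k > MAX_DEPTH)
    return 1;
  long total = 0;
  int next = *k + 1;               /* the *(&k)+1 pattern from the comment */
  for (int i = 0; i < *k; i++)
    total += walk(&next);
  return total;
}

/* Manually specialized clones: the depth is a constant in each clone
   and every call goes to a different, non-recursive function.  */
static long walk_end(void) { return 1; }

#define DEFINE_WALK(K, NEXT)                  \
  static long walk_##K(void)                  \
  {                                           \
    long total = 0;                           \
    for (int i = 0; i < K; i++)               \
      total += walk_##NEXT();                 \
    return total;                             \
  }

DEFINE_WALK(4, end)
DEFINE_WALK(3, 4)
DEFINE_WALK(2, 3)
DEFINE_WALK(1, 2)

/* Dispatch on the value of k, like the case(1) ... case(9) construct;
   values outside the expected range fall back to the generic version.  */
static long walk_dispatch(int k)
{
  switch (k)
    {
    case 1: return walk_1();
    case 2: return walk_2();
    case 3: return walk_3();
    case 4: return walk_4();
    default: return walk(&k);
    }
}
```

The dispatcher returns the same result as the generic function for every k, while each clone exposes a constant trip count and a fixed callee to the optimizer.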
[Bug target/91518] [9/10 Regression] segfault when run CPU2006 465.tonto since r263875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518

--- Comment #3 from Xiong Hu XS Luo ---
(In reply to Richard Biener from comment #2)
> Not seen on x86_64. Given you bisected to r263875 it should appear with GCC
> 9 as well - are the actual GCC 9 releases also affected?
>
> I assume this is ppc64le.
>
> Unless we know more I assume this is a target issue. Please build with debug
> info and see where exactly and why it segfaults.

Yes. It still fails on both power8 and power9, even with GCC 10 (gcc version 10.0.0 20190823 (experimental) (GCC)).

After resetting to r263875, the register contents are as shown below. A wrong address is used by the lwzx instruction ($r8 is expected to hold a valid address):

140│   0x101a5718 <+552>: ld r12,888(r31)
141│   0x101a571c <+556>: ld r0,856(r31)
142│   0x101a5720 <+560>: ld r17,880(r31)
143│   0x101a5724 <+564>: ld r8,848(r31)
144│   0x101a5728 <+568>: addi r21,r21,1
145│   0x101a572c <+572>: cmpw cr7,r21,r30
146│   0x101a5730 <+576>: mulld r4,r3,r12
147│   0x101a5734 <+580>: add r18,r4,r0
148│   0x101a5738 <+584>: mulld r11,r18,r17
149├> 0x101a573c <+588>: lwzx r3,r8,r11

44: /x $r3 = 0x1
45: /x $r8 = 0x77
46: /x $r11 = 0x1770
47: /x $r18 = 0x7d
48: /x $r17 = 0x30
49: /x $r4 = 0x1
50: /x $r0 = 0x7c
51: /x $r3 = 0x1
52: /x $r12 = 0x1
53: /x $r21 = 0x2
54: /x $r8 = 0x77
55: /x $r17 = 0x30
56: /x $r0 = 0x7c
57: /x $r12 = 0x1

I am not sure whether this is the debug info you need. The function call stack is already pasted in comment #0. As the source code may not be pasted here, I can only say that the segmentation fault occurs at line 9375 of file mol.fppized.f90, in function make_image_of_shell. Thanks.
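The address arithmetic leading up to the faulting lwzx can be replayed with the register values from the dump. The following C sketch mirrors the mulld/add/mulld sequence and the indexed load address; lwzx_effective_address is a hypothetical helper name, and the sketch assumes the displayed register values are the ones live at the faulting instruction.

```c
#include <stdint.h>

/* Replays:
     mulld r4,r3,r12
     add   r18,r4,r0
     mulld r11,r18,r17
     lwzx  r3,r8,r11     (loads from r8 + r11)
   and returns the effective address of the load.  */
static uint64_t lwzx_effective_address(uint64_t r3, uint64_t r12, uint64_t r0,
                                       uint64_t r17, uint64_t r8)
{
  uint64_t r4 = r3 * r12;    /* 0x1 * 0x1 = 0x1 in the dump */
  uint64_t r18 = r4 + r0;    /* 0x1 + 0x7c = 0x7d */
  uint64_t r11 = r18 * r17;  /* 0x7d * 0x30 = 0x1770 */
  return r8 + r11;           /* base register plus index register */
}
```

With r3 = 0x1, r12 = 0x1, r0 = 0x7c, r17 = 0x30 and r8 = 0x77, the intermediate values match the dump (r4 = 0x1, r18 = 0x7d, r11 = 0x1770) and the load address comes out as 0x17e7. The index computation is therefore self-consistent, and the suspicious input is the base value 0x77 loaded into r8 from 848(r31).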
[Bug lto/91518] segfault when run CPU2006 465.tonto since r263875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518 --- Comment #1 from Xiong Hu XS Luo --- 51e85e64e125803502fde94b9e22037c0ccaa8b2 is the first bad commit commit 51e85e64e125803502fde94b9e22037c0ccaa8b2 Author: rguenth rguenth@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Mon Aug 27 10:55:53 2018 + 2018-08-27 Richard Biener * cfganal.h (rev_post_order_and_mark_dfs_back_seme): Declare. * cfganal.c (rev_post_order_and_mark_dfs_back_seme): New function. * tree-ssa-sccvn.h (struct vn_pval): New structure. (struct vn_nary_op_s): Add unwind_to member. Add predicated_values flag and put result into a union together with a linked list of vn_pval. (struct vn_ssa_aux): Add name member to make maintaining a map of SSA name to vn_ssa_aux possible. Remove no longer needed info, dfsnum, low, visited, on_sccstack, use_processed and range_info_anti_range_p members. (run_scc_vn, vn_eliminate, free_scc_vn, vn_valueize): Remove. (do_rpo_vn, run_rpo_vn, eliminate_with_rpo_vn, free_rpo_vn): New functions. (vn_valueize): New global. (vn_context_bb): Likewise. (VN_INFO_RANGE_INFO, VN_INFO_ANTI_RANGE_P, VN_INFO_RANGE_TYPE, VN_INFO_PTR_INFO): Remove. * tree-ssa-sccvn.c: ... (rewrite) (pass_fre::execute): For -O2+ initialize loops and run RPO VN in optimistic mode (iterating). For -O1 and -Og run RPO VN in non-optimistic mode. * params.def (PARAM_SCCVN_MAX_SCC_SIZE): Remove. (PARAM_RPO_VN_MAX_LOOP_DEPTH): Add. * doc/invoke.texi (sccvn-max-scc-size): Remove. (rpo-vn-max-loop-depth): Document. * tree-ssa-alias.c (walk_non_aliased_vuses): Stop walking when valuezing the VUSE signals we walked out of the region. * tree-ssa-pre.c (phi_translate_1): Ignore predicated values. (phi_translate): Set VN context block to use for availability lookup. (compute_avail): Likewise. (pre_valueize): New function. (pass_pre::execute): Adjust to the RPO VN API. * tree-ssa-loop-ivcanon.c: Include tree-ssa-sccvn.h. (propagate_constants_for_unrolling): Remove. 
(tree_unroll_loops_completely): Perform value-numbering on the unrolled bodies loop parent.
[Bug lto/91518] New: segfault when run CPU2006 465.tonto since r263875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518 Bug ID: 91518 Summary: segfault when run CPU2006 465.tonto since r263875 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at cn dot ibm.com CC: marxin at gcc dot gnu.org Target Milestone: --- build option: OPTIMIZE = -Ofast -mcpu=power8 -mrecip=all -funroll-loops -flto Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: #0 0x1008bc2f in ??? #1 0x1008a3ff in ??? #2 0x10050477 in ??? #3 0x101aabfc in ??? #4 0x101aff3f in ??? #5 0x10133e7b in ??? #6 0x101e7023 in ??? #7 0x101e76ab in ??? #8 0x101e48a7 in ??? #9 0x1021fc5f in ??? #10 0x101e6503 in ??? #11 0x1013c7eb in ??? #12 0x1013583f in ??? #13 0x10154c4f in ??? #14 0x1000210b in ??? #15 0x106750ff in ??? Error: 1x465.tonto (gdb) bt #0 0x101a573c in __mol_module_MOD_make_image_of_shell () #1 0x101aaac0 in __mol_module_MOD_symmetrise_r () #2 0x10130b4c in __mol_module_MOD_symmetrise.constprop.16 () #3 0x101e1ba4 in __mol_module_MOD_make_atom_density () #4 0x101e222c in __mol_module_MOD_get_atom_density () #5 0x101df428 in __mol_module_MOD_get_initial_guess () #6 0x1021a7e0 in __mol_module_MOD_constrained_scf () #7 0x101e1084 in __mol_module_MOD_scf () #8 0x101394bc in __mol_main_module_MOD_process_keyword.constprop.1 () #9 0x10132510 in __mol_main_module_MOD_read_keywords.constprop.2 () #10 0x10151920 in __mol_main_module_MOD_main.constprop.0 () #11 0x100020ec in main ()
[Bug lto/91273] [7/8/9/10 Regression] ICE in warn_types_mismatch at ipa-devirt.c:995
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91273

--- Comment #8 from Xiong Hu XS Luo ---
The SPEC2017 benchmark 507.cactuBSSN_r also hits this ICE, starting from r273571:

lto1: internal compiler error: in warn_types_mismatch, at ipa-devirt.c:995
0x105dad1f warn_types_mismatch(tree_node*, tree_node*, unsigned int, unsigned int)
        /home/gcc/gcc-10-trunk-2019-08-16/gcc/ipa-devirt.c:995
0x101dd147 lto_symtab_merge_decls_2
        /home/gcc/gcc-10-trunk-2019-08-16/gcc/lto/lto-symtab.c:723
0x101dd147 lto_symtab_merge_decls_1
        /home/gcc/gcc-10-trunk-2019-08-16/gcc/lto/lto-symtab.c:862
0x101dd147 lto_symtab_merge_decls()
        /home/gcc/gcc-10-trunk-2019-08-16/gcc/lto/lto-symtab.c:888
0x101f2e47 read_cgraph_and_symbols(unsigned int, char const**)
        /home/gcc/gcc-10-trunk-2019-08-16/gcc/lto/lto-common.c:2839
0x101c7def lto_main()

Compiler options used: "-O3 -mcpu=power9 -flto"
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #34 from Xiong Hu XS Luo --- (In reply to rguent...@suse.de from comment #32) > On Mon, 5 Aug 2019, luoxhu at cn dot ibm.com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > > > --- Comment #31 from Xiong Hu XS Luo --- > > (In reply to rguent...@suse.de from comment #30) > > > On Fri, 2 Aug 2019, luoxhu at cn dot ibm.com wrote: > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > > > > > > > --- Comment #28 from Xiong Hu XS Luo --- > > > > (In reply to Richard Biener from comment #24) > > > > > Btw, this is controlled by symtab_node::output_to_lto_symbol_table_p > > > > > which > > > > > has > > > > > > > > > > /* FIXME: Builtins corresponding to real functions probably should > > > > > have > > > > > symbol table entries. */ > > > > > if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > > > > return false; > > > > > > > > > > we could try to do sth like > > > > > > > > > > if (TREE_CODE (decl) == FUNCTION_DECL > > > > > && (fndecl_built_in_p (decl, BUILT_IN_MD) > > > > > || (fndecl_built_in_p (decl, BUILT_IN_NORMAL) > > > > > && !associated_internal_fn (decl > > > > > return false; > > > > > > > > > > but that would still leave us with too many undefineds I guess > > > > > (gcc_unreachable for one). > > > > > > > > > > We do not currently track builtins that do have a library > > > > > implementation > > > > > (whether that it is used in the end is another thing, but less > > > > > important). > > > > > > > > > > What we definitely can do is put a whitelist above like via the > > > > > following > > > > > which also catches the case of definitions of builtins. > > > > > > > > > > Index: gcc/symtab.c > > > > > === > > > > > --- gcc/symtab.c(revision 273968) > > > > > +++ gcc/symtab.c(working copy) > > > > > @@ -2375,10 +2375,24 @@ symtab_node::output_to_lto_symbol_table_ > > > > > first place. 
*/ > > > > >if (VAR_P (decl) && DECL_HARD_REGISTER (decl)) > > > > > return false; > > > > > + > > > > >/* FIXME: Builtins corresponding to real functions probably should > > > > > have > > > > > symbol table entries. */ > > > > > - if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > > > > -return false; > > > > > + if (TREE_CODE (decl) == FUNCTION_DECL > > > > > + && !definition > > > > > + && fndecl_built_in_p (decl)) > > > > > +{ > > > > > + if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) > > > > > + switch (DECL_FUNCTION_CODE (decl)) > > > > > + { > > > > > + CASE_FLT_FN (BUILT_IN_ATAN2): > > > > > + CASE_FLT_FN (BUILT_IN_SIN): > > > > > + return true; > > > > > + default: > > > > > + break; > > > > > + } > > > > > + return false; > > > > > +} > > > > > > > > > >/* We have real symbol that should be in symbol table. However > > > > > try to > > > > > trim > > > > > down the refernces to libraries bit more because linker will > > > > > otherwise > > > > > > > > Hi Richard, no undefineds generated with below code, what's your > > > > opinion about > > > > the updated code, please? Thanks. > > > > > > It will break code calling __builtin_unreachable for example since > > > we'll emit an UNDEF that cannot be satisfied. > > > > Thanks. I tried to add __builtin_unreachable() in the test case, it can also > > works. As BUILT_IN_UNREACHABLE is defined in buitins.def instead of > > internal-fn.def, so associated_internal_fn will return IFN_LAST for it, > > then no >
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #31 from Xiong Hu XS Luo --- (In reply to rguent...@suse.de from comment #30) > On Fri, 2 Aug 2019, luoxhu at cn dot ibm.com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > > > --- Comment #28 from Xiong Hu XS Luo --- > > (In reply to Richard Biener from comment #24) > > > Btw, this is controlled by symtab_node::output_to_lto_symbol_table_p which > > > has > > > > > > /* FIXME: Builtins corresponding to real functions probably should have > > > symbol table entries. */ > > > if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > > return false; > > > > > > we could try to do sth like > > > > > > if (TREE_CODE (decl) == FUNCTION_DECL > > > && (fndecl_built_in_p (decl, BUILT_IN_MD) > > > || (fndecl_built_in_p (decl, BUILT_IN_NORMAL) > > > && !associated_internal_fn (decl > > > return false; > > > > > > but that would still leave us with too many undefineds I guess > > > (gcc_unreachable for one). > > > > > > We do not currently track builtins that do have a library implementation > > > (whether that it is used in the end is another thing, but less important). > > > > > > What we definitely can do is put a whitelist above like via the following > > > which also catches the case of definitions of builtins. > > > > > > Index: gcc/symtab.c > > > === > > > --- gcc/symtab.c(revision 273968) > > > +++ gcc/symtab.c(working copy) > > > @@ -2375,10 +2375,24 @@ symtab_node::output_to_lto_symbol_table_ > > > first place. */ > > >if (VAR_P (decl) && DECL_HARD_REGISTER (decl)) > > > return false; > > > + > > >/* FIXME: Builtins corresponding to real functions probably should have > > > symbol table entries. 
*/ > > > - if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > > -return false; > > > + if (TREE_CODE (decl) == FUNCTION_DECL > > > + && !definition > > > + && fndecl_built_in_p (decl)) > > > +{ > > > + if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) > > > + switch (DECL_FUNCTION_CODE (decl)) > > > + { > > > + CASE_FLT_FN (BUILT_IN_ATAN2): > > > + CASE_FLT_FN (BUILT_IN_SIN): > > > + return true; > > > + default: > > > + break; > > > + } > > > + return false; > > > +} > > > > > >/* We have real symbol that should be in symbol table. However try to > > > trim > > > down the refernces to libraries bit more because linker will > > > otherwise > > > > Hi Richard, no undefineds generated with below code, what's your opinion > > about > > the updated code, please? Thanks. > > It will break code calling __builtin_unreachable for example since > we'll emit an UNDEF that cannot be satisfied.

Thanks. I tried adding __builtin_unreachable() to the test case, and it also works. Since BUILT_IN_UNREACHABLE is defined in builtins.def rather than internal-fn.def, associated_internal_fn returns IFN_LAST for it, so no UNDEF for __builtin_unreachable is emitted to the object file. Most of the functions in internal-fn.def are math functions; I am not sure whether you mean BUILT_IN_NOP or something else?
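
The rule being discussed can be sketched in isolation. The following is only a toy model in C, not GCC code: the enum, the struct and both function names are hypothetical, and the boolean field stands in for "associated_internal_fn (decl) != IFN_LAST". It shows why a builtin with an internal function (the thread names math functions as the typical case) would get a symbol table entry under the proposed condition, while one without (as described above for BUILT_IN_UNREACHABLE) would not.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; none of these are real GCC types.  */
enum builtin_class { CLASS_NOT_BUILT_IN, CLASS_NORMAL, CLASS_MD };

struct fn_decl_model {
  enum builtin_class cls;
  /* Models "associated_internal_fn (decl) != IFN_LAST".  */
  bool has_internal_fn;
};

/* Current rule: no builtin gets an LTO symbol table entry.  */
bool in_lto_symtab_current(const struct fn_decl_model *d)
{
  return d->cls == CLASS_NOT_BUILT_IN;
}

/* Proposed rule: skip machine-dependent builtins and normal builtins
   that have no associated internal function; keep everything else.  */
bool in_lto_symtab_proposed(const struct fn_decl_model *d)
{
  if (d->cls == CLASS_MD)
    return false;
  if (d->cls == CLASS_NORMAL && !d->has_internal_fn)
    return false;
  return true;
}
```

Under this model a normal builtin with an internal function changes from "not in the symbol table" to "in the symbol table", which is the effect the patch is after; whether the real set of such builtins is small enough is exactly the open question in the thread.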
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #29 from Xiong Hu XS Luo --- (In reply to Xiong Hu XS Luo from comment #28) > (In reply to Richard Biener from comment #24) > > Btw, this is controlled by symtab_node::output_to_lto_symbol_table_p which > > has > > > > /* FIXME: Builtins corresponding to real functions probably should have > > symbol table entries. */ > > if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > return false; > > > > we could try to do sth like > > > > if (TREE_CODE (decl) == FUNCTION_DECL > > && (fndecl_built_in_p (decl, BUILT_IN_MD) > > || (fndecl_built_in_p (decl, BUILT_IN_NORMAL) > > && !associated_internal_fn (decl > > return false; > > > > but that would still leave us with too many undefineds I guess > > (gcc_unreachable for one). > > > > We do not currently track builtins that do have a library implementation > > (whether that it is used in the end is another thing, but less important). > > > > What we definitely can do is put a whitelist above like via the following > > which also catches the case of definitions of builtins. > > > > Index: gcc/symtab.c > > === > > --- gcc/symtab.c(revision 273968) > > +++ gcc/symtab.c(working copy) > > @@ -2375,10 +2375,24 @@ symtab_node::output_to_lto_symbol_table_ > > first place. */ > >if (VAR_P (decl) && DECL_HARD_REGISTER (decl)) > > return false; > > + > >/* FIXME: Builtins corresponding to real functions probably should have > > symbol table entries. 
*/ > > - if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > > -return false; > > + if (TREE_CODE (decl) == FUNCTION_DECL > > + && !definition > > + && fndecl_built_in_p (decl)) > > +{ > > + if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) > > + switch (DECL_FUNCTION_CODE (decl)) > > + { > > + CASE_FLT_FN (BUILT_IN_ATAN2): > > + CASE_FLT_FN (BUILT_IN_SIN): > > + return true; > > + default: > > + break; > > + } > > + return false; > > +} > > > >/* We have real symbol that should be in symbol table. However try to > > trim > > down the refernces to libraries bit more because linker will otherwise > > Hi Richard, no undefineds generated with below code, what's your opinion > about the updated code, please? Thanks.

I mean the "too many undefineds" here. With the modification below, the symbols are output to the object file, so the linker can link the static library as expected.

> > diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c > index 47a9143ae26..9d42a57b4b6 100644 > --- a/gcc/lto-streamer-out.c > +++ b/gcc/lto-streamer-out.c > @@ -2644,7 +2644,10 @@ write_symbol (struct streamer_tree_cache_d *cache, > >gcc_checking_assert (TREE_PUBLIC (t) >&& (TREE_CODE (t) != FUNCTION_DECL > - || !fndecl_built_in_p (t)) > + || !fndecl_built_in_p (t, BUILT_IN_MD)) > + && (TREE_CODE (t) != FUNCTION_DECL > + || !fndecl_built_in_p (t, BUILT_IN_NORMAL) > + || associated_internal_fn (t) != IFN_LAST) >&& !DECL_ABSTRACT_P (t) >&& (!VAR_P (t) || !DECL_HARD_REGISTER (t))); > > diff --git a/gcc/symtab.c b/gcc/symtab.c > index 63e2820eb93..ce74589b291 100644 > --- a/gcc/symtab.c > +++ b/gcc/symtab.c > @@ -2377,7 +2377,10 @@ symtab_node::output_to_lto_symbol_table_p (void) > return false; >/* FIXME: Builtins corresponding to real functions probably should have > symbol table entries.
*/ > - if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > + if (TREE_CODE (decl) == FUNCTION_DECL > + && (fndecl_built_in_p (decl, BUILT_IN_MD) > + || (fndecl_built_in_p (decl, BUILT_IN_NORMAL) > + && IFN_LAST == associated_internal_fn (decl > return false; > >/* We have real symbol that should be in symbol table. However try to > trim
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #28 from Xiong Hu XS Luo --- (In reply to Richard Biener from comment #24) > Btw, this is controlled by symtab_node::output_to_lto_symbol_table_p which > has > > /* FIXME: Builtins corresponding to real functions probably should have > symbol table entries. */ > if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > return false; > > we could try to do sth like > > if (TREE_CODE (decl) == FUNCTION_DECL > && (fndecl_built_in_p (decl, BUILT_IN_MD) > || (fndecl_built_in_p (decl, BUILT_IN_NORMAL) > && !associated_internal_fn (decl > return false; > > but that would still leave us with too many undefineds I guess > (gcc_unreachable for one). > > We do not currently track builtins that do have a library implementation > (whether that it is used in the end is another thing, but less important). > > What we definitely can do is put a whitelist above like via the following > which also catches the case of definitions of builtins. > > Index: gcc/symtab.c > === > --- gcc/symtab.c(revision 273968) > +++ gcc/symtab.c(working copy) > @@ -2375,10 +2375,24 @@ symtab_node::output_to_lto_symbol_table_ > first place. */ >if (VAR_P (decl) && DECL_HARD_REGISTER (decl)) > return false; > + >/* FIXME: Builtins corresponding to real functions probably should have > symbol table entries. */ > - if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl)) > -return false; > + if (TREE_CODE (decl) == FUNCTION_DECL > + && !definition > + && fndecl_built_in_p (decl)) > +{ > + if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) > + switch (DECL_FUNCTION_CODE (decl)) > + { > + CASE_FLT_FN (BUILT_IN_ATAN2): > + CASE_FLT_FN (BUILT_IN_SIN): > + return true; > + default: > + break; > + } > + return false; > +} > >/* We have real symbol that should be in symbol table. 
However try to > trim > down the refernces to libraries bit more because linker will otherwise

Hi Richard, no undefineds are generated with the code below. What is your opinion of the updated code? Thanks.

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 47a9143ae26..9d42a57b4b6 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -2644,7 +2644,10 @@ write_symbol (struct streamer_tree_cache_d *cache,

   gcc_checking_assert (TREE_PUBLIC (t)
                        && (TREE_CODE (t) != FUNCTION_DECL
-                           || !fndecl_built_in_p (t))
+                           || !fndecl_built_in_p (t, BUILT_IN_MD))
+                       && (TREE_CODE (t) != FUNCTION_DECL
+                           || !fndecl_built_in_p (t, BUILT_IN_NORMAL)
+                           || associated_internal_fn (t) != IFN_LAST)
                        && !DECL_ABSTRACT_P (t)
                        && (!VAR_P (t) || !DECL_HARD_REGISTER (t)));

diff --git a/gcc/symtab.c b/gcc/symtab.c
index 63e2820eb93..ce74589b291 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -2377,7 +2377,10 @@ symtab_node::output_to_lto_symbol_table_p (void)
     return false;
   /* FIXME: Builtins corresponding to real functions probably should have
      symbol table entries.  */
-  if (TREE_CODE (decl) == FUNCTION_DECL && fndecl_built_in_p (decl))
+  if (TREE_CODE (decl) == FUNCTION_DECL
+      && (fndecl_built_in_p (decl, BUILT_IN_MD)
+          || (fndecl_built_in_p (decl, BUILT_IN_NORMAL)
+              && IFN_LAST == associated_internal_fn (decl))))
     return false;

   /* We have real symbol that should be in symbol table.  However try to trim
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #22 from Xiong Hu XS Luo --- (In reply to Xiong Hu XS Luo from comment #21) > (In reply to H.J. Lu from comment #19) > > (In reply to Richard Biener from comment #17) > > > (In reply to Richard Biener from comment #16) > > > > (In reply to Richard Biener from comment #15) > > > > > Honza probably knows where we output the LTO symtab and why we do not > > > > > put > > > > > undefs for builtins there. > > > > > > > > #include > > > > double y, z; > > > > void foo (); > > > > int main() > > > > { > > > > volatile double x = atan2 (y, z); > > > > foo (); > > > > } > > > > > > > > > gcc-8 -c t.c -flto > > > > > gcc-nm t.o > > > > U foo > > > > T main > > > > C y > > > > C z > > > > > > > > where's the > > > > > > > > U atan2 > > > > > > > > ? > > > > > > For > > > > > > double atan2 (double x, double y) { return x + y; } > > > > > > it doesn't appear either, this CU has an empty symbol table... > > > > > > I do remember quite some "funs" with builtin handling though, so the > > > current handling may be the least bad of all choices... 
> > > > [hjl@gnu-cfl-1 pr91287]$ cat foo1.c > > #include > > > > float > > atan2f (float x, float y) > > { > > abort (); > > return x * y; > > } > > [hjl@gnu-cfl-1 pr91287]$ cat foo.c > > float > > atan2f (float x, float y) > > { > > return x * y; > > } > > [hjl@gnu-cfl-1 pr91287]$ cat bar.c > > #include > > > > extern float x, y, z; > > > > void > > bar (void) > > { > > x = atan2f (y, z); > > } > > [hjl@gnu-cfl-1 pr91287]$ cat main.c > > #include > > > > extern void bar (void); > > > > float x, y = 1, z =1; > > > > int > > main (void) > > { > > x = atan2f (y, z); > > bar (); > > return 0; > > } > > [hjl@gnu-cfl-1 pr91287]$ make > > cc -O3 -c -o foo.o foo.c > > ar rc libfoo.a foo.o > > cc -O3 -fpic -c -o bar.o bar.c > > cc -O3 -fpic -c -o foo1.o foo1.c > > ld -shared -o libfoo1.so foo1.o # --version-script foo1.v > > ld -shared -o libbar.so bar.o libfoo1.so > > cc -flto -O3 -o x main.c libfoo.a libbar.so libfoo1.so -Wl,-R,. > > cc -O3 -o y main.c libfoo.a libbar.so libfoo1.so -Wl,-R,. > > ./y > > ./x > > make: *** [Makefile:9: all] Aborted > > [hjl@gnu-cfl-1 pr91287]$ > > > > Since atan2f isn't referenced in IR, linker doesn't extract atan2f from > > libfoo.a. atan2f is resolved to definition in libfoo1.so later. > Thanks for your test case. > If remove the libfoo1.so when build x, it will link libfoo.a. And run x will > not abort. > > luoxhu@genoa pr91287 $ ~/local/gcc_t/bin/gcc -flto -O3 -o x main.c libfoo.a > libbar.so -Wl,-R,. > luoxhu@genoa pr91287 $ ./x > > Since x can link the libfoo.a if libfoo1.so not exists, not quite understand > why "atan2f isn't referenced in IR"? > > Acutually my test case shows that binary built with LTO can link libmass.a > first when use gcc, but failed to link libmass.a if use gfortran(link > libm.so finally). > > PS: My test case (gcc LTO link libmass.a first) doesn't match with your case > result(gcc LTO link libfoo1.so instead of libfoo.a). 
BTW, the link order is quite important: if libbar.so and libfoo.a are swapped, x links atan2f from libfoo.a instead of libfoo1.so, which matches my test case:

~/local/gcc_t/bin/gcc -flto -O3 -o x main.c libbar.so libfoo.a libfoo1.so -Wl,-R,.
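
The order dependence reported in this comment and in the quoted test case can be mimicked with a toy model. The C sketch below is an assumption-laden illustration, not a description of how ld or the LTO plugin is implemented: the struct, the enum and resolve_atan2f are hypothetical names, inputs are processed left to right, an archive member is only pulled in if an undefined reference has already been seen, and a reference that only becomes visible after LTO code generation (late_reference) triggers one rescan of the archives when nothing has defined the symbol yet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical.  */
enum input_kind { INPUT_OBJECT, INPUT_ARCHIVE, INPUT_SHARED };

struct link_input {
  const char *name;
  enum input_kind kind;
  bool defines_atan2f;
  /* True if the reference is visible when the input is first read.
     For an LTO object whose symbol table omits the builtin, this is false.  */
  bool references_atan2f;
};

/* Returns the name of the input that ends up providing atan2f, or NULL.  */
const char *resolve_atan2f(const struct link_input *in, size_t n,
                           bool late_reference)
{
  bool wanted = false;              /* undefined reference seen so far?  */
  const char *archive_hit = NULL;   /* archive whose member was extracted  */
  const char *shared_hit = NULL;    /* first shared library defining it  */

  for (size_t i = 0; i < n; i++) {
    if (in[i].kind == INPUT_ARCHIVE) {
      /* A member is extracted only to satisfy an already-seen reference.  */
      if (wanted && in[i].defines_atan2f && archive_hit == NULL)
        archive_hit = in[i].name;
    } else {
      if (in[i].references_atan2f)
        wanted = true;
      if (in[i].kind == INPUT_SHARED && in[i].defines_atan2f
          && shared_hit == NULL)
        shared_hit = in[i].name;
    }
  }

  /* Reference that appears only after LTO code generation: rescan the
     archives, but only if nothing has defined the symbol yet.  */
  if (late_reference && archive_hit == NULL && shared_hit == NULL)
    for (size_t i = 0; i < n; i++)
      if (in[i].kind == INPUT_ARCHIVE && in[i].defines_atan2f)
        return in[i].name;

  /* A statically linked definition takes precedence over a shared one.  */
  return archive_hit != NULL ? archive_hit : shared_hit;
}
```

With the inputs ordered as main, libfoo.a, libbar.so, libfoo1.so and the reference hidden from the first pass, the model picks libfoo1.so; swapping libbar.so and libfoo.a, or making the reference visible up front as in the non-LTO build, makes it pick libfoo.a, in line with the results reported in the thread.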
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #21 from Xiong Hu XS Luo --- (In reply to H.J. Lu from comment #19) > (In reply to Richard Biener from comment #17) > > (In reply to Richard Biener from comment #16) > > > (In reply to Richard Biener from comment #15) > > > > Honza probably knows where we output the LTO symtab and why we do not > > > > put > > > > undefs for builtins there. > > > > > > #include > > > double y, z; > > > void foo (); > > > int main() > > > { > > > volatile double x = atan2 (y, z); > > > foo (); > > > } > > > > > > > gcc-8 -c t.c -flto > > > > gcc-nm t.o > > > U foo > > > T main > > > C y > > > C z > > > > > > where's the > > > > > > U atan2 > > > > > > ? > > > > For > > > > double atan2 (double x, double y) { return x + y; } > > > > it doesn't appear either, this CU has an empty symbol table... > > > > I do remember quite some "funs" with builtin handling though, so the > > current handling may be the least bad of all choices... > > [hjl@gnu-cfl-1 pr91287]$ cat foo1.c > #include > > float > atan2f (float x, float y) > { > abort (); > return x * y; > } > [hjl@gnu-cfl-1 pr91287]$ cat foo.c > float > atan2f (float x, float y) > { > return x * y; > } > [hjl@gnu-cfl-1 pr91287]$ cat bar.c > #include > > extern float x, y, z; > > void > bar (void) > { > x = atan2f (y, z); > } > [hjl@gnu-cfl-1 pr91287]$ cat main.c > #include > > extern void bar (void); > > float x, y = 1, z =1; > > int > main (void) > { > x = atan2f (y, z); > bar (); > return 0; > } > [hjl@gnu-cfl-1 pr91287]$ make > cc -O3 -c -o foo.o foo.c > ar rc libfoo.a foo.o > cc -O3 -fpic -c -o bar.o bar.c > cc -O3 -fpic -c -o foo1.o foo1.c > ld -shared -o libfoo1.so foo1.o # --version-script foo1.v > ld -shared -o libbar.so bar.o libfoo1.so > cc -flto -O3 -o x main.c libfoo.a libbar.so libfoo1.so -Wl,-R,. > cc -O3 -o y main.c libfoo.a libbar.so libfoo1.so -Wl,-R,. 
> ./y > ./x > make: *** [Makefile:9: all] Aborted > [hjl@gnu-cfl-1 pr91287]$ > > Since atan2f isn't referenced in IR, linker doesn't extract atan2f from > libfoo.a. atan2f is resolved to definition in libfoo1.so later.

Thanks for your test case. If libfoo1.so is removed when building x, x links against libfoo.a, and running x does not abort.

luoxhu@genoa pr91287 $ ~/local/gcc_t/bin/gcc -flto -O3 -o x main.c libfoo.a libbar.so -Wl,-R,.
luoxhu@genoa pr91287 $ ./x

Since x can link libfoo.a when libfoo1.so does not exist, I don't quite understand why "atan2f isn't referenced in IR".

Actually, my test case shows that a binary built with LTO links libmass.a first when using gcc, but fails to link libmass.a when using gfortran (it ends up linking libm.so).

PS: My test case (gcc LTO links libmass.a first) doesn't match the result of your case (gcc LTO links libfoo1.so instead of libfoo.a).
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #20 from Xiong Hu XS Luo --- (In reply to rguent...@suse.de from comment #11) > On Wed, 31 Jul 2019, wschmidt at linux dot ibm.com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > > > --- Comment #10 from wschmidt at linux dot ibm.com --- > > On 7/31/19 2:25 AM, rguenther at suse dot de wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > > > > > --- Comment #9 from rguenther at suse dot de > > > --- > > > On Wed, 31 Jul 2019, luoxhu at cn dot ibm.com wrote: > > > > > >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 > > >> > > >> --- Comment #8 from Xiong Hu XS Luo --- > > >> (In reply to Thomas Koenig from comment #6) > > >>> (In reply to Xiong Hu XS Luo from comment #4) > > >>> > > >>>> /tmp/cctrpu2h.ltrans0.ltrans.o: In function `MAIN__': > > >>>> :(.text+0x114): undefined reference to `_gfortran_st_write' > > >>>> :(.text+0x12c): undefined reference to > > >>>> `_gfortran_transfer_character_write' > > >>> You're not linkging against libgfortran. > > >>> > > >>> Either use gfortran as command for compiling or linking, or > > >>> add the appropriate libraries (-lgfortran -lquadmath) to > > >>> the linking step. > > >> Thanks Thomas and Richard. Sorry that I am not familiar with fortran. > > >> The > > >> regression was fixed by Martin's new change. > > >> > > >> The c code included math.h actually. 
> > >> > > >> cat atan2bashzowie.c > > >> #include > > >> #include > > >> #include > > >> > > >> double __attribute__((noinline)) zowie (double x, double y, double z) > > >> { > > >> return atan2 (x * y, z); > > >> } > > >> > > >> double __attribute__((noinline)) rand_finite_double (void) > > >> { > > >> union { > > >> double d; > > >> unsigned char uc[sizeof(double)]; > > >> } u; > > >> do { > > >> for (unsigned i = 0; i < sizeof u.uc; i++) { > > >> u.uc[i] = (unsigned char) rand(); > > >> } > > >> } while (!isfinite(u.d)); > > >> return u.d; > > >> } > > >> > > >> int main () > > >> { > > >> double a = rand_finite_double (); > > >> printf ("%lf\n", zowie (a, 4.5, 2.2)); > > >> return 0; > > >> } > > >> cat build.sh > > >> ~/local/gcc_t/bin/gcc -O3 -mcpu=power9 atan2bashzowie.c -mveclibabi=mass > > >> -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -lmass_simdp8 -lmassv -lmassvp8 > > >> -o a.out > > >> nm a.out | grep atan2 > > >> ~/local/gcc_t/bin/gcc -O3 -mcpu=power9 atan2bashzowie.c -mveclibabi=mass > > >> -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -flto -lmass_simdp8 -lmassv > > >> -lmassvp8 -o > > >> a.out > > >> nm a.out | grep atan2 > > >> ./build.sh > > >> 1700 T atan2 > > >> 1700 T _atan2 > > >> 17e0 T atan2 > > >> 17e0 T _atan2 > > > Err, but [_]atan2 are surely not vector variants. Also is massv a static > > > library here? It looks more like you are not getting the code vectorized > > > with -flto but without and both variants end up using massv (the -flto > > > variant using the scalar atan2)? > > > > > > That said, you have to do more detailed analysis of what actually > > > happens and what you _want_ to happen. The bugreport summary > > > doesn't really match what you show. > > > > > Agree that there's some unnecessary confusion here. I think the > > temporary ICE and the build issues obscured the original intent of the bug. > > > > There are two libraries provided with the MASS project. 
libmass > > provides scalar replacements for corresponding libm scalar math > > functions. libmassv provides the vectorized versions of those > > functions. For this bug we are only concerned about libmass and scalar > > math functions. > > OK, so -mveclibabi=mass isn't needed to reproduce the issue, nor is > linking -lmassv or -lmass_smidp8 then I guess. > > > With the C version of the code, we correctl
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287

--- Comment #8 from Xiong Hu XS Luo ---
(In reply to Thomas Koenig from comment #6)
> (In reply to Xiong Hu XS Luo from comment #4)
>
> > /tmp/cctrpu2h.ltrans0.ltrans.o: In function `MAIN__':
> > :(.text+0x114): undefined reference to `_gfortran_st_write'
> > :(.text+0x12c): undefined reference to
> > `_gfortran_transfer_character_write'
>
> You're not linkging against libgfortran.
>
> Either use gfortran as command for compiling or linking, or
> add the appropriate libraries (-lgfortran -lquadmath) to
> the linking step.

Thanks Thomas and Richard. Sorry, I am not familiar with Fortran. The regression was fixed by Martin's new change.

The C code actually included math.h.

cat atan2bashzowie.c
#include
#include
#include

double __attribute__((noinline)) zowie (double x, double y, double z)
{
  return atan2 (x * y, z);
}

double __attribute__((noinline)) rand_finite_double (void)
{
  union {
    double d;
    unsigned char uc[sizeof(double)];
  } u;
  do {
    for (unsigned i = 0; i < sizeof u.uc; i++) {
      u.uc[i] = (unsigned char) rand();
    }
  } while (!isfinite(u.d));
  return u.d;
}

int main ()
{
  double a = rand_finite_double ();
  printf ("%lf\n", zowie (a, 4.5, 2.2));
  return 0;
}

cat build.sh
~/local/gcc_t/bin/gcc -O3 -mcpu=power9 atan2bashzowie.c -mveclibabi=mass -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -lmass_simdp8 -lmassv -lmassvp8 -o a.out
nm a.out | grep atan2
~/local/gcc_t/bin/gcc -O3 -mcpu=power9 atan2bashzowie.c -mveclibabi=mass -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -flto -lmass_simdp8 -lmassv -lmassvp8 -o a.out
nm a.out | grep atan2

./build.sh
1700 T atan2
1700 T _atan2
17e0 T atan2
17e0 T _atan2
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287 --- Comment #7 from Xiong Hu XS Luo --- Created attachment 46647 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46647&action=edit fortran_lto_verbose log
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287

--- Comment #4 from Xiong Hu XS Luo ---
(In reply to Martin Liška from comment #3)
> (In reply to Xiong Hu XS Luo from comment #1)
> > Martin's commit 4ee64e30659a9125a47eeea882d8044e690ce334 will cause ICE.
> >
> > It's a REGRESSION not related to this current issue.
> >
> > ~/local/gcc_t/bin/gfortran -O3 -mcpu=power9 hellofortran.f90
> > -mveclibabi=mass -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -flto -lmass_simdp8
> > -lmassv -lmassvp8
> > lto1: internal compiler error: bytecode stream: expected tag identifier_node
> > instead of LTO_UNKNOWN
> >
> > It's not fixed even updated to commit
> > cf474017fbb8fbb71d69b0ca4b4b34260cfe5ab3 (Mon Jul 29, Fix ICE seen in
> > tree-ssa-dce.c for new/delete pair.).
> >
> >
> > commit 4ee64e30659a9125a47eeea882d8044e690ce334
> > Author: marxin
> > Date: Thu Jul 25 09:36:38 2019 +
> >
> > Extend DCE to remove unnecessary new/delete-pairs (PR c++/23383).
> >
> > 2019-07-25 Martin Liska
>
> Which should be hopefully fixed by:
> https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01761.html

It still fails to build after applying your changes, but with a different error from before:

~/workspace/gcc-git/gcc-master_build/gcc/xgcc -B/home/luoxhu/workspace/gcc-git/gcc-master_build/gcc/ -O3 -mcpu=power9 hellofortran.f90 -mveclibabi=mass -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -flto -lmass_simdp8 -lmassv -lmassvp8
/tmp/cctrpu2h.ltrans0.ltrans.o: In function `MAIN__':
:(.text+0x114): undefined reference to `_gfortran_st_write'
:(.text+0x12c): undefined reference to `_gfortran_transfer_character_write'
:(.text+0x140): undefined reference to `_gfortran_transfer_integer_write'
:(.text+0x154): undefined reference to `_gfortran_transfer_integer_write'
:(.text+0x168): undefined reference to `_gfortran_transfer_integer_write'
:(.text+0x174): undefined reference to `_gfortran_st_write_done'
/tmp/cctrpu2h.ltrans0.ltrans.o: In function `main':
:(.text.startup+0x14): undefined reference to `_gfortran_set_args'
:(.text.startup+0x28): undefined reference to `_gfortran_set_options'
collect2: error: ld returned 1 exit status
[Bug lto/91287] LTO disables linking with scalar MASS library (Fortran only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91287

--- Comment #1 from Xiong Hu XS Luo ---
Martin's commit 4ee64e30659a9125a47eeea882d8044e690ce334 causes an ICE. It is a regression unrelated to the current issue.

~/local/gcc_t/bin/gfortran -O3 -mcpu=power9 hellofortran.f90 -mveclibabi=mass -L/opt/mass/8.1.3/Linux_LE/lib/ -lmass -flto -lmass_simdp8 -lmassv -lmassvp8
lto1: internal compiler error: bytecode stream: expected tag identifier_node instead of LTO_UNKNOWN

It is still not fixed after updating to commit cf474017fbb8fbb71d69b0ca4b4b34260cfe5ab3 (Mon Jul 29, Fix ICE seen in tree-ssa-dce.c for new/delete pair.).

commit 4ee64e30659a9125a47eeea882d8044e690ce334
Author: marxin
Date: Thu Jul 25 09:36:38 2019 +

Extend DCE to remove unnecessary new/delete-pairs (PR c++/23383).

2019-07-25 Martin Liska
[Bug middle-end/71509] Bitfield causes load hit store with larger store than load
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71509

Xiong Hu XS Luo changed:

   What|Removed |Added
     CC|        |guojiufu at gcc dot gnu.org,
       |        |rguenth at gcc dot gnu.org

--- Comment #7 from Xiong Hu XS Luo ---
Hi Richard, I have been trying to figure out this issue recently and have some questions I need your help with. What is the status of the "proposed simple lowering of bitfield accesses on GIMPLE"? By "less conservative about DECL_BIT_FIELD_REPRESENTATIVE", do you mean that we choose a large mode at the GIMPLE stage and make the final decision in the target? Thanks.

PS: As a newbie, could you tell me how you went about "Widening the representative" :)? I am a bit confused about the best mode and where to process it; sometimes a bigger mode is better and sometimes a smaller mode is better (from Segher's comments).
[Bug c/43673] Incorrect warning: use of 'D' length modifier with 'a' type character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43673

--- Comment #5 from Xiong Hu XS Luo ---
Ben's reply regarding testing DFP on other targets:
"
> I suggest to test it on a platform where dfp is not supported as well,
At this stage, the patches on the trunk don't identify any targets as supporting DFP, so powerpc64 is as good as any other. I will double check on x86, though, for good measure.
Thanks,
"
[Bug c/43673] Incorrect warning: use of 'D' length modifier with 'a' type character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43673

--- Comment #4 from Xiong Hu XS Luo ---
Hi Joseph, I recently submitted a quick fix for this issue in https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01949.html. The problem was actually introduced by the initial patch https://gcc.gnu.org/ml/gcc-patches/2005-12/msg00330.html, committed in 2005. All of the decimal floating-point print conversions are supported except Da/DA, even though these differ little from De/Df/Dg/DE/DF/DG, whose entries in the decimal printf table are all filled with TEX_* instead of BADLEN. From a consistency point of view, this patch fixes the issue most easily.

But there are open questions, as Ryan said: full DFP support requires the macro __STDC_DEC_FP__ to be set in the DFP library and checked by the compiler, and this mechanism is not implemented yet. Without it, the change may fail on other platforms that don't support DFP. Also, that implementation may need a lot of changes to the C front end and to libdfp. What do you suggest? Thanks.
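
The inconsistency described above can be pictured with a very small model. The C sketch below is not the format-checking table from GCC; both function names are hypothetical, and each function simply answers whether the 'D' length modifier would be accepted with a given conversion character, before and after treating a/A like e/f/g/E/F/G.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the behaviour described in this comment:
   'D' is accepted with e, f, g (either case) but rejected with a/A.  */
bool d_modifier_accepted_before(char conv)
{
  return conv != '\0' && strchr("eEfFgG", conv) != NULL;
}

/* Hypothetical model of the behaviour after the quick fix:
   a/A are treated the same way as the other conversions.  */
bool d_modifier_accepted_after(char conv)
{
  return conv != '\0' && strchr("aAeEfFgG", conv) != NULL;
}
```

In this model a format such as "%Da" is rejected by the first function and accepted by the second, while "%De" is accepted by both; the open question in the comment, whether the target actually supports DFP, is deliberately left out of the sketch.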
[Bug c/43673] Incorrect warning: use of 'D' length modifier with 'a' type character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43673 Xiong Hu XS Luo changed: What|Removed |Added CC||joseph at codesourcery dot com, ||luoxhu at cn dot ibm.com --- Comment #3 from Xiong Hu XS Luo --- Hi