[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-07-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #16 from luoxhu at gcc dot gnu.org --- The attached files are all built with -mcpu=power8 and the case also fails on P8LE. Also I verified the code produces expected output on P8BE. ('Aborted' is caused by BE returns 0x41 instead

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #15 from luoxhu at gcc dot gnu.org --- In combine: vec_select(vec_concat and the following vec_select are combined into a single extract instruction, which seems reasonable for both LE and BE? R146: 0 1 2 3 R141: 4 5 6 7 R150: 2

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #14 from luoxhu at gcc dot gnu.org --- Created attachment 53354 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53354&action=edit split2

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #13 from luoxhu at gcc dot gnu.org --- Created attachment 53353 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53353&action=edit after combine

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #12 from luoxhu at gcc dot gnu.org --- Created attachment 53352 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53352&action=edit combine

[Bug tree-optimization/106293] [13 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 --- Comment #5 from luoxhu at gcc dot gnu.org --- r12-6086

[Bug tree-optimization/106293] [13 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2022-07-25 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 --- Comment #4 from luoxhu at gcc dot gnu.org --- Could you try revert (In reply to Richard Biener from comment #2) > I can reproduce a regression with -Ofast -march=znver2 running on Haswell as > well. -fopt-info doesn't reveal an

[Bug tree-optimization/105740] missed optimization switch transformation for conditions with duplicate conditions

2022-06-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740 --- Comment #10 from luoxhu at gcc dot gnu.org --- (In reply to Martin Liška from comment #9) > (In reply to luoxhu from comment #8) > > (In reply to rguent...@suse.de from comment #6) > > > On Tue, 21 Jun 2022, jakub at gcc

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-06-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #8 from luoxhu at gcc dot gnu.org --- init-regs: (insn 13 8 17 2 (set (reg:V4SI 141) (vec_select:V4SI (vec_concat:V8SI (reg/v:V4SI 135 [ R2 ]) (reg/v:V4SI 133 [ R0 ])) (parallel

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-06-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #5 from luoxhu at gcc dot gnu.org --- Seems combine wrongly merged two vec_select instructions: Trying 188 -> 199: 188: r343:V4SI=vec_select(vec_concat(r168:V4SI,r338:V4SI),parallel) REG_DEAD r338:V4SI REG_DEAD r

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-06-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #4 from luoxhu at gcc dot gnu.org --- Reduced to: #include extern "C" void *memcpy(void *, const void *, unsigned long); typedef __attribute__((altivec(vector__))) unsigned native_simd_type; union { native_s

[Bug tree-optimization/106126] [12 Regression] tree check fail in useless_type_conversion_p, at gimple-expr.cc:87 since r13-1184-g57424087e82db140

2022-06-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126 --- Comment #13 from luoxhu at gcc dot gnu.org --- Otherwise we need to record first_bb when conditions_in_bbs->is_empty, then check that in is_beneficial, and ordered_remove the info entry if that bb is not the first "if condition" wit

[Bug tree-optimization/106126] [12 Regression] tree check fail in useless_type_conversion_p, at gimple-expr.cc:87 since r13-1184-g57424087e82db140

2022-06-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126 --- Comment #12 from luoxhu at gcc dot gnu.org --- conditions_in_bbs->is_empty doesn't mean that the range is at the start of the switch condition :(, so we can't assume that the no_side_effect_bb check can be ignored?

[Bug tree-optimization/106126] [12 Regression] tree check fail in useless_type_conversion_p, at gimple-expr.cc:87 since r13-1184-g57424087e82db140

2022-06-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126 --- Comment #11 from luoxhu at gcc dot gnu.org --- Sorry for breaking, my bugzilla account is luo...@gcc.gnu.org. The patch seems reasonable to fold 65-90 ('A'-'Z') to switch statement, 4,6c4,6 < ;; Canonical GIMPLE case clusters: 33 60

[Bug tree-optimization/105903] Missed optimization for __synth3way

2022-06-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105903 --- Comment #2 from luoxhu at gcc dot gnu.org --- diff --git a/gcc/match.pd b/gcc/match.pd index 4a570894b2e..f6b5415a351 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -5718,6 +5718,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (bit_xor

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-06-23 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069 --- Comment #2 from luoxhu at gcc dot gnu.org --- Could you also paste the ASM difference please? (I don't have an environment at hand so far.)

[Bug tree-optimization/105740] missed optimization switch transformation for conditions with duplicate conditions

2022-06-22 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740 --- Comment #8 from luoxhu at gcc dot gnu.org --- (In reply to rguent...@suse.de from comment #6) > On Tue, 21 Jun 2022, jakub at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740 > > >

[Bug tree-optimization/105740] missed optimization switch transformation for conditions with duplicate conditions

2022-06-20 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740 --- Comment #2 from luoxhu at gcc dot gnu.org --- Running if_to_switch and convert_switch again after copyprop2 could remove the redundant statement and expose the opportunity for if-to-switch again; is this reasonable, or just move if-to-switch/switch

[Bug ipa/100034] missed optimization for dead code elimination at -O3 (vs. -O1, -Os, -O2)

2022-06-08 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100034 --- Comment #2 from luoxhu at gcc dot gnu.org --- (In reply to Richard Biener from comment #1) > Looks related to PR1 - we do an IPA SRA clone but fail to inline it and > thus we end up with > > void d.isra () > { > int

[Bug ipa/93318] [10 regression] Firefox LTO+FDO ICEs in speculative_call_info

2022-05-13 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93318 --- Comment #10 from luoxhu at gcc dot gnu.org --- And the Profile id of that node is streamed to many objects after lto partition: grep -- "19598949" ** db_server.ltrans0.000i.cgraph: Profile id: 19598949 db_server.ltrans0.0

[Bug ipa/93318] [10 regression] Firefox LTO+FDO ICEs in speculative_call_info

2022-05-13 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93318 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug lto/105133] lto/gold: lto failed to link --start-lib/--end-lib in gold for duplicate libraries

2022-04-05 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105133 --- Comment #2 from luoxhu at gcc dot gnu.org --- (In reply to Richard Biener from comment #1) > (In reply to luoxhu from comment #0) > > > > cat hellow.res > > 3 > > hello.o 2 > > 192 ccb9165e037

[Bug lto/105133] New: lto/gold: lto failed to link --start-lib/--end-lib in gold

2022-04-01 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org CC: marxin at gcc dot gnu.org Target Milestone: --- Hi, linker gold supports --start-lib and --end-lib to "mimics the semantics of static libr

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2022-01-11 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug tree-optimization/103802] [12 regression] recip-3.c fails after r12-6087 on Power m32

2022-01-11 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077

2022-01-11 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug tree-optimization/103802] [12 regression] recip-3.c fails after r12-6087 on Power m32

2022-01-06 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802 --- Comment #6 from luoxhu at gcc dot gnu.org --- (In reply to Richard Biener from comment #5) > So the point is that P is invariant but we do not hoist it because it's > computed in a (estimated) cold block? I notice that the con

[Bug tree-optimization/103802] [12 regression] recip-3.c fails after r12-6087 on Power m32

2021-12-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802 --- Comment #4 from luoxhu at gcc dot gnu.org --- Or restore the previous recip count check by commenting out the if condition, to avoid the bb in the loop turning cold? diff --git a/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c b/gcc/testsuite/gcc.dg/tree-ssa

[Bug tree-optimization/103793] [12 Regression] ICE: in to_reg_br_prob_base, at profile-count.h:277 with -O3 -fno-guess-branch-probability since r12-6086-gcd5ae148c47c6dee

2021-12-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103793 luoxhu at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status

[Bug rtl-optimization/94790] Failure to use andn in specific pattern in which it is available

2021-12-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790 --- Comment #4 from luoxhu at gcc dot gnu.org --- Just noticed they are different cases, scalar vs. vector...

[Bug rtl-optimization/94790] Failure to use andn in specific pattern in which it is available

2021-12-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug tree-optimization/103802] [12 regression] recip-3.c fails after r12-6087 on Power m32

2021-12-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802 --- Comment #2 from luoxhu at gcc dot gnu.org --- -funroll-loops could work around this; is this reasonable?

[Bug tree-optimization/103802] [12 regression] recip-3.c fails after r12-6087 on Power m32

2021-12-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802 --- Comment #1 from luoxhu at gcc dot gnu.org --- MOVE_MAX_PIECES is 4 on m32 but 8 on m64, so estimate_move_cost differs between them: 2 vs. 1 for "((size + MOVE_MAX_PIECES - 1) / MOVE_MAX_PIECES)". recip-3.m32.c.172t.cunroll:
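
The cost expression quoted in the comment above is a ceiling division. The following is only a minimal sketch of that arithmetic, not GCC's estimate_move_cost itself; the helper name pieces_needed and the 8-byte object size are assumptions chosen because they reproduce the "2 vs. 1" difference mentioned in the comment.

```python
def pieces_needed(size: int, move_max_pieces: int) -> int:
    """Ceiling division: how many moves of at most move_max_pieces bytes
    are needed to copy size bytes."""
    return (size + move_max_pieces - 1) // move_max_pieces


# Hypothetical object size in bytes (an assumption, not taken from the bug).
SIZE = 8

cost_m32 = pieces_needed(SIZE, 4)  # MOVE_MAX_PIECES is 4 on m32
cost_m64 = pieces_needed(SIZE, 8)  # MOVE_MAX_PIECES is 8 on m64

if __name__ == "__main__":
    print("m32 cost:", cost_m32)
    print("m64 cost:", cost_m64)
```

Any size from 5 to 8 bytes gives the same split, so a cost-based heuristic can make different decisions on the two targets for the same source.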

[Bug middle-end/103802] New: [12 regression] recip-3.c fails after r12-6087 on Power m32

2021-12-22 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org Target Milestone: --- Invoking the compiler as /home/luoxhu/workspace/gcc-master_build/gcc/xgcc -B/home/luoxhu/workspace/gcc-master_build/gcc/ /home/luoxhu

[Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-12-21 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug tree-optimization/103793] [12 Regression] ICE: in to_reg_br_prob_base, at profile-count.h:277 with -O3 -fno-guess-branch-probability since r12-6086-gcd5ae148c47c6dee

2021-12-21 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103793 luoxhu at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |luoxhu at gcc dot

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-12-14 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #6 from luoxhu at gcc dot gnu.org --- Fortran's modulo is floor_mod as documented here: https://gcc.gnu.org/onlinedocs/gfortran/MODULO.html? Syntax: RESULT = MODULO(A, P) Return value: The type and kind of the result are those
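
To illustrate the distinction the comment above relies on, here is a small sketch contrasting a truncating remainder (sign follows the dividend, as with C's % or Fortran's MOD) with a floored modulo (sign follows the divisor, as with Fortran's MODULO). The function names trunc_rem and floor_mod are illustrative only and are not GCC or gfortran identifiers.

```python
def trunc_rem(a: int, p: int) -> int:
    """Remainder of truncating division; the result has the sign of a."""
    q = abs(a) // abs(p)
    if (a < 0) != (p < 0):
        q = -q  # truncate toward zero
    return a - q * p


def floor_mod(a: int, p: int) -> int:
    """Remainder of floored division; the result has the sign of p."""
    return a - (a // p) * p  # Python's // rounds toward negative infinity


if __name__ == "__main__":
    for a, p in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        print(a, p, trunc_rem(a, p), floor_mod(a, p))
```

The two only differ when the operands have opposite signs, which is why a transformation that treats one as the other can go unnoticed on non-negative test data.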

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-12-14 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 --- Comment #11 from luoxhu at gcc dot gnu.org --- +(define_insn_and_split "*anddi3_insn_dot" + [(set (pc) +(if_then_else (eq (and:DI (match_operand:DI 1 "gpc_reg_operand" "%r,r") +

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 --- Comment #9 from luoxhu at gcc dot gnu.org --- (In reply to Segher Boessenkool from comment #8) > (In reply to luoxhu from comment #6) > > > > foo: > > > > .LFB0: > > > > .cfi_sta

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 --- Comment #7 from luoxhu at gcc dot gnu.org --- 1| Dump of assembler code for function foo: 2|0x15e0 <+0>: rldicr. r3,r3,29,1 3+> 0x15e4 <+4>: beq 0x15f0 4|0x15

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 --- Comment #6 from luoxhu at gcc dot gnu.org --- (In reply to Segher Boessenkool from comment #5) > (In reply to luoxhu from comment #4) > > Simply adjust the sequence of dot instruction could produce expected code, > >

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 --- Comment #4 from luoxhu at gcc dot gnu.org --- Simply adjusting the sequence of the dot instructions could produce the expected code; is this correct? foo: .LFB0: .cfi_startproc rldicr. 3,3,29,1 beq 0,.L2 #APP # 10 "pr102
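
As a rough illustration of why a single "rldicr. 3,3,29,1" can test a run of contiguous bits, the sketch below models the instruction in Python: rotate the 64-bit source left by SH, then keep IBM-numbered bits 0 through ME (bit 0 is the most significant bit). The helper names are assumptions, and the tested mask is derived only from the operands shown in the comment, not from the bug's test case.

```python
M64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & M64


def mask_ibm(mb: int, me: int) -> int:
    """Ones in IBM-numbered bits mb..me inclusive (bit 0 = MSB), mb <= me."""
    return ((1 << (me - mb + 1)) - 1) << (63 - me)


def rldicr(rs: int, sh: int, me: int) -> int:
    """Rotate left doubleword immediate, then clear everything right of bit me."""
    return rotl64(rs & M64, sh) & mask_ibm(0, me)


# With SH=29 and ME=1 the result is nonzero exactly when one of the two
# source bits that rotate into the top two positions is set.
TESTED_BITS = 3 << 33

if __name__ == "__main__":
    print(hex(rldicr(TESTED_BITS, 29, 1)))
    print(hex(rldicr(M64 & ~TESTED_BITS, 29, 1)))
```

The record form (the trailing dot) additionally compares the result with zero, so the following beq can branch without a separate compare.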

[Bug target/102239] powerpc suboptimal boolean test of contiguous bits

2021-11-23 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-11-22 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270 --- Comment #5 from luoxhu at gcc dot gnu.org --- ;; Loop 0 ;; header 0, latch 1 ;; depth 0, outer -1 ;; nodes: 0 1 2 3 4 5 6 11 7 8 10 9 ;; ;; Loop 1 ;; header 8, latch 7 ;; depth 1, outer 0 ;; nodes: 8 7 6 10 5 4 11 3 ;; ;; Loop 2

[Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-11-22 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270 --- Comment #4 from luoxhu at gcc dot gnu.org --- Created attachment 51851 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51851&action=edit Fix incorrect loop exit edge probability

[Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-11-22 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270 --- Comment #3 from luoxhu at gcc dot gnu.org --- The profile count is correct but something is wrong with the edge probability, and it turns out that r12-4526 exposes a long-existing issue in profile_estimate:predict_extra_loop_exits, when searching

[Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-11-16 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270 --- Comment #2 from luoxhu at gcc dot gnu.org --- (In reply to Richard Biener from comment #1) > So you say this is a problem with loop header copying, that would mean the > issue is really latent and general, no? Header copyin

[Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526

2021-11-15 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org Target Milestone: --- For the testcase gcc.dg/vect/pr96698.c, the inner loop was hot (preheader count < loop co

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-11-08 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-11-04 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 --- Comment #7 from luoxhu at gcc dot gnu.org --- Fixed, will backport to gcc-11 in a week.

[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||ro at gcc dot gnu.org

[Bug tree-optimization/103041] [12 regression] gcc.dg/vect/slp-reduc-10a.c etc. FAIL

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103041 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug tree-optimization/103041] [12 regression] gcc.dg/vect/slp-reduc-10a.c etc. FAIL

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103041 --- Comment #1 from luoxhu at gcc dot gnu.org --- Could you please verify whether it is caused by r12-4818 instead of r12-4819? r12-4819 is an NFC patch, which seems less likely, and r12-4818 also ICEs in PR103029, so it is possibly a duplicate

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 --- Comment #5 from luoxhu at gcc dot gnu.org --- P9: .L149: lxvx %vs32,%r8,%r10 vadduwm %v12,%v12,%v1 mfvsrd %r5,%vs43 mfvsrld %r4,%vs43 vadduwm %v11,%v11,%v9 stxv %vs44,112(%r1) xxperm

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 --- Comment #4 from luoxhu at gcc dot gnu.org --- vect-simd-17.p10.c.335r.final: 3379: %v1:V16QI=unspec[%v1:V16QI,%v1:V16QI,%v9:V16QI] 254 3372: {%v11:V4SI=~%v0:V4SI&%v13:V4SI|%v11:V4SI;clobber %r10:V4SI;} // wrong code. REG_DEAD %v0:

[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818

2021-11-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 --- Comment #3 from luoxhu at gcc dot gnu.org --- This hack could restore the previous phi order to put nondfs phi args before dfs_edge args. But I am not sure whether this is the correct direction. At least it proves that the phi order

[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818

2021-11-01 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-10-31 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 --- Comment #3 from luoxhu at gcc dot gnu.org --- (In reply to Kewen Lin from comment #2) > (In reply to luoxhu from comment #1) > > Couldn't reproduce on rain6p1 (P10): > > > > It's weird, I can reproduce this on rain6

[Bug target/102991] [12 regression] gcc.dg/vect/vect-simd-17.c fails after r12-4757

2021-10-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991 --- Comment #1 from luoxhu at gcc dot gnu.org --- Couldn't reproduce on rain6p1 (P10): Test run by luoxhu on Fri Oct 29 04:08:49 2021 Native configuration is powerpc64le-unknown-linux-gnu === gcc tests === Schedule

[Bug target/102868] Missed optimization with __builtin_shuffle and zero vector on ppc

2021-10-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102868 luoxhu at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status

[Bug target/94613] S/390, powerpc: Wrong code generated for vec_sel builtin

2021-10-28 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94613 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution

[Bug target/102868] Missed optimization with __builtin_shuffle and zero vector on ppc

2021-10-24 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102868 --- Comment #1 from luoxhu at gcc dot gnu.org --- Patch submitted: https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582452.html

[Bug target/102868] New: Missed optimization with __builtin_shuffle and zero vector on ppc

2021-10-21 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org Target Milestone: --- Similar to PR94680 and PR100165, PPC currently generates inefficient instructions for below case: typedef float V __attribute__

[Bug target/97142] __builtin_fmod not optimized on POWER

2021-09-13 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug tree-optimization/102075] fill_always_executed_in_1 incomplete computation

2021-09-13 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102075 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug tree-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2021-09-06 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178 --- Comment #2 from luoxhu at gcc dot gnu.org --- Verified 470.lbm doesn't show a regression on Power8 with -Ofast. Runtime is 141 sec for r12-897; without that patch it is 142 sec.

[Bug rtl-optimization/102008] [12 Regression] no cmov generated for loads next to each other

2021-09-06 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102008 --- Comment #3 from luoxhu at gcc dot gnu.org --- phiopt4 and sink2 are doing reverse optimizations: pr102008.c.200t.phiopt4: Hoisting adjacent loads from 3 and 4 into 2: _6 = foo_4(D)->a; _5 = foo_4(D)->b; pr102008.c.202t

[Bug rtl-optimization/102008] [12 Regression] no cmov generated for loads next to each other

2021-09-06 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102008 --- Comment #2 from luoxhu at gcc dot gnu.org --- Confirmed that moving the sink2 pass before phiopt4 could restore the previous instructions for this case: test: .LFB0: .cfi_startproc cmp w0, 1 ldp w0, w1, [x1

[Bug target/97142] __builtin_fmod not optimized on POWER

2021-09-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142 --- Comment #15 from luoxhu at gcc dot gnu.org --- Patch updated: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578740.html

[Bug middle-end/102075] New: fill_always_executed_in_1 incomplete computation

2021-08-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org Target Milestone: --- ALWAYS_EXECUTED_IN is not computed completely for nested loops. Current design will exit if an inner loop doesn't dominate outer loop's latch or exit after exiting

[Bug tree-optimization/101250] adjust_iv_update_pos update the iv statement unexpectedly cause memory address offset mismatch

2021-07-06 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101250 --- Comment #1 from luoxhu at gcc dot gnu.org --- Patch posted: [PATCH] ivopts: Don't adjust IV update statement if both operands use the IV in COND [PR101250] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573894.html

[Bug middle-end/101250] New: adjust_iv_update_pos update the iv statement unexpectedly cause memory address offset mismatch

2021-06-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: luoxhu at gcc dot gnu.org Target Milestone: --- Test case: unsigned int foo (unsigned char *ip, unsigned char *ref, unsigned int maxlen

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-21 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 --- Comment #13 from luoxhu at gcc dot gnu.org --- It is not visible in combine because the constant data is in *.LC0 and UNSPEC_VPERM. Will shelve this and switch to other high-priority issues. pr100866.c.277r.combine: (note 4 0 20 2 [bb 2

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-20 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 --- Comment #8 from luoxhu at gcc dot gnu.org --- (In reply to Jens Seifert from comment #7) > Regarding vec_revb for vector unsigned int. I agree that > revb: > .LFB0: > .cfi_startproc > vspltish %v1,8 >

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-17 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 --- Comment #6 from luoxhu at gcc dot gnu.org --- For V4SI, it is also better to use vector splat and vector rotate operations. revb: .LFB0: .cfi_startproc vspltish %v1,8 vspltisw %v0,-16 vrlh %v2,%v2,%v1

[Bug target/93571] PPC: fmr gets used instead of faster xxlor

2021-06-16 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93571 --- Comment #3 from luoxhu at gcc dot gnu.org --- BTW, I didn't see a performance difference between fmr and xxlor within a small benchmark. Max Ops Per Cycle | Latency (Min) | Latency (Max) fmr

[Bug target/93571] PPC: fmr gets used instead of faster xxlor

2021-06-16 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93571 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-15 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 --- Comment #5 from luoxhu at gcc dot gnu.org --- (In reply to Segher Boessenkool from comment #4) > This PR is specifically about the vec_revb builtin. But yes, we should > look at what is generated for all other code (having only the b

[Bug testsuite/101020] [12 regression] Several test case failures after r12-1316

2021-06-15 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101020 luoxhu at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-15 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 --- Comment #3 from luoxhu at gcc dot gnu.org --- diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 097a127be07..35b3f1a0e1a 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -1932,7

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-15 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug testsuite/101020] [12 regression] Several test case failures after r12-1316

2021-06-10 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101020 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-06-08 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #10 from luoxhu at gcc dot gnu.org --- float128 to vector __int128 is fixed by: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-06-02 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #9 from luoxhu at gcc dot gnu.org --- Patch sent, it could fix the __float128 to vector __int128 issue, https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571689.html But for __float128 to __int128 mentioned in #c4, need hack

[Bug target/94613] S/390, powerpc: Wrong code generated for vec_sel builtin

2021-05-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94613 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug target/97142] __builtin_fmod not optimized on POWER

2021-05-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142 --- Comment #12 from luoxhu at gcc dot gnu.org --- Patch submitted: https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-05-24 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #17 from luoxhu at gcc dot gnu.org --- If the constant limitation is removed, it could be combined successfully with my new patch for PR94613. https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569255.html And what do you mean

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-29 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #16 from luoxhu at gcc dot gnu.org --- > +2016-11-09 Segher Boessenkool > + > + * simplify-rtx.c (simplify_binary_operation_1): Simplify > + (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)) and >
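The simplification quoted in the ChangeLog entry above, (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)), is a bitwise select: each result bit comes from A where C is set and from B where C is clear. The following is a minimal C sketch of that identity on plain integers, not code from GCC; the function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Unsimplified form: (xor (and (xor A B) C) B). */
static uint64_t bitsel_xor_form(uint64_t a, uint64_t b, uint64_t c)
{
    return ((a ^ b) & c) ^ b;
}

/* Simplified form: (ior (and A C) (and B ~C)).
   Takes bits of A where C is 1 and bits of B where C is 0. */
static uint64_t bitsel_ior_form(uint64_t a, uint64_t b, uint64_t c)
{
    return (a & c) | (b & ~c);
}

/* Returns 1 if the two forms agree for every 8-bit operand triple.
   The operations are purely bitwise, so agreement on all 8-bit
   inputs implies agreement at any width. */
static int bitsel_forms_agree_8bit(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            for (unsigned c = 0; c < 256; c++)
                if (bitsel_xor_form(a, b, c) != bitsel_ior_form(a, b, c))
                    return 0;
    return 1;
}
```

For example, with A = 0xF0, B = 0x0F and C = 0xCC both forms yield 0xC3. This is the same per-bit selection that vec_sel performs on vectors, which is why the bug asks for such sequences to be recognized.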

[Bug target/97142] __builtin_fmod not optimized on POWER

2021-04-13 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142 --- Comment #10 from luoxhu at gcc dot gnu.org --- If not built with fast-math, gimple_has_side_effects returns true, which causes expand_call_stmt to fail to expand "_1 = fmod (x_2(D), y_3(D));" to an internal function. X86 also pr

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-12 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #15 from luoxhu at gcc dot gnu.org --- (In reply to Segher Boessenkool from comment #14) > (In reply to luoxhu from comment #12) > > That code was called by the combine pass but failed to match. > > > > > pr new

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-09 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #12 from luoxhu at gcc dot gnu.org --- That code was called by the combine pass but failed to match. pr newpat (set (reg:DI 125 [ l ]) (xor:DI (and:DI (xor:DI (reg/v:DI 120 [ l ]) (reg:DI 127)) (const_int

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-08 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #11 from luoxhu at gcc dot gnu.org --- I noticed that you added the below optimization with commit a62436c0a505155fc8becac07a8c0abe2c265bfe. But it doesn't even handle this case, cse1 pass will call simplify_binary_operation_1, both

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-07 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 --- Comment #9 from luoxhu at gcc dot gnu.org --- Then we could optimize it in match.pd diff --git a/gcc/match.pd b/gcc/match.pd index 036f92fa959..8944312c153 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3711,6 +3711,17

[Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()

2021-04-07 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org

[Bug target/99718] [11 regression] ICE in new test case gcc.target/powerpc/pr98914.c for 32 bits

2021-03-30 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718 luoxhu at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution

[Bug target/99718] [11 regression] ICE in new test case gcc.target/powerpc/pr98914.c for 32 bits

2021-03-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718 --- Comment #19 from luoxhu at gcc dot gnu.org --- https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567395.html This patch extends variable vec_insert to all 32bit VSX targets including Power7{BE} {32,64}, Power8{BE}{32, 64}, Power8{LE}{64

[Bug target/99718] [11 regression] ICE in new test case gcc.target/powerpc/pr98914.c for 32 bits

2021-03-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718 --- Comment #15 from luoxhu at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #14) > You still have: > if (VECTOR_MEM_VSX_P (mode)) > { > if (!CONST_INT_P (elt_rtx)) > { > if

[Bug target/99718] [11 regression] ICE in new test case gcc.target/powerpc/pr98914.c for 32 bits

2021-03-26 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718 --- Comment #13 from luoxhu at gcc dot gnu.org --- Performance data in #c11 is for int variable vec_insert in 32-bit mode; the float variable vec_insert in 32-bit mode is a bit slower but much better than the original (extra stfs+lwz of insn #17 and insn 18
