[Bug tree-optimization/96129] [11 regression] gcc.dg/vect/vect-alias-check.c etc. FAIL

2020-10-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96129 --- Comment #4 from Kewen Lin --- As for the regressed failures, this is highly suspected to be a duplicate of PR96376.

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #32 from Kewen Lin --- (In reply to Richard Biener from comment #31) > (In reply to Kewen Lin from comment #29) > > (In reply to Hongtao.liu from comment #28) > > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the

[Bug tree-optimization/97075] [11 regression] powerpc64 vector tests fails after r11-3230

2020-09-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97075 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #18 from Kewen Lin --- (In reply to Richard Biener from comment #10) > (In reply to Kewen Lin from comment #9) > > (In reply to Richard Biener from comment #8) > > > (In reply to Kewen Lin from comment #7) > > > > Two questions in

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #19 from Kewen Lin --- (In reply to rguent...@suse.de from comment #17) > On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 > > > > --- Comment #15 from Kewen Lin --- > >

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #29 from Kewen Lin --- (In reply to Hongtao.liu from comment #28) > > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement > > Yes, it's the place. > > > is UB to UH conversion statement, further check if the

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #25 from Kewen Lin --- > > > > Got it! For > > > > else if (vect_nop_conversion_p (stmt_info)) > > continue; > > > > Is it a good idea to change it to call record_stmt_cost like the others? > > 1) introduce one

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #26 from Kewen Lin --- > > By following this idea, to release the restriction on loop_outer > > (loop_father) when setting the father_bbs, I can see FRE works as > > expectedly. But it actually does the rpo_vn from cfun's entry to

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 --- Comment #27 from Kewen Lin --- (In reply to Hongtao.liu from comment #22) > > One of my workmates found that if we disable vectorization for SPEC2017 > > 525.x264_r function sub4x4_dct in source file x264_src/common/dct.c with > > explicit

[Bug tree-optimization/98138] BB vect fail to SLP one case

2021-01-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #7 from Kewen Lin --- (In reply to Richard Biener from comment #6) > Starting from the loads is not how SLP discovery works so there will be > zero re-use of code. Sure - the only important thing is you end up > with a valid SLP

[Bug tree-optimization/98138] BB vect fail to SLP one case

2021-01-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #8 from Kewen Lin --- Created attachment 49942 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49942&action=edit vectorized with altivec built-in functions

[Bug tree-optimization/98138] BB vect fail to SLP one case

2020-12-06 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #3 from Kewen Lin --- (In reply to Richard Biener from comment #2) > So the expected vectorization builds vectors > > { tmp[0][0], tmp[1][0], tmp[2][0], tmp[3][0] } > > that's not SLP, SLP tries to build the > > { tmp[i][0],

[Bug tree-optimization/98113] [11 Regression] popcnt is not vectorized on s390 since f5e18dd9c7da

2020-12-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98113 Kewen Lin changed: What|Removed |Added CC||rguenther at suse dot de Last

[Bug tree-optimization/98113] [11 Regression] popcnt is not vectorized on s390 since f5e18dd9c7da

2020-12-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98113 --- Comment #2 from Kewen Lin --- (In reply to Kewen Lin from comment #1) > (In reply to Ilya Leoshkevich from comment #0) > > s390's vxe/popcount-1.c began to fail after PR96789 fix. > > Sorry to see this regression. > > ... > > > > > that

[Bug tree-optimization/98138] New: BB vect fail to SLP one case

2020-12-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 Bug ID: 98138 Summary: BB vect fail to SLP one case Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization

[Bug tree-optimization/98138] BB vect fail to SLP one case

2020-12-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #1 from Kewen Lin --- A similar case is x264_pixel_satd_8x4 in x264: https://github.com/mirror/x264/blob/4121277b40a667665d4eea1726aefdc55d12d110/common/pixel.c#L288

[Bug other/98437] New: confusing wording in the description of option -fsanitize=address

2020-12-24 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98437 Bug ID: 98437 Summary: confusing wording in the description of option -fsanitize=address Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/97744] [11 regression] 32 bit floating point result errors after r11-4637

2020-11-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97744 --- Comment #4 from Kewen Lin --- The additional fre4 pass run triggers this: disabling fre4 makes it pass (disabling dse3 alone does not, so dse3 is unrelated). Further narrowing down shows that fre4 on the function MG3XDEMO is responsible.

[Bug tree-optimization/97744] [11 regression] 32 bit floating point result errors after r11-4637

2020-11-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97744 --- Comment #5 from Kewen Lin --- Btw, this is power7 specific; I found it passes with -mcpu=power8.

[Bug tree-optimization/97744] [11 regression] 32 bit floating point result errors after r11-4637

2020-11-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97744 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug rtl-optimization/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/97744] [11 regression] 32 bit floating point result errors after r11-4637

2020-11-06 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97744 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Last

[Bug other/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0

[Bug tree-optimization/96376] [11 regression] vect/vect-alias-check.c and vect/vect-live-5.c fail on armeb

2020-10-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96376 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #5

[Bug tree-optimization/96129] [11 regression] gcc.dg/vect/vect-alias-check.c etc. FAIL

2020-10-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96129 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/96376] [11 regression] vect/vect-alias-check.c and vect/vect-live-5.c fail on armeb

2020-10-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96376 Kewen Lin changed: What|Removed |Added CC||ro at gcc dot gnu.org --- Comment #4 from

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2020-11-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 96789, which changed state. Bug 96789 Summary: x264: sub4x4_dct() improves when vectorization is disabled https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 What|Removed |Added

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-11-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug testsuite/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705 --- Comment #4 from Kewen Lin --- I think my commit just exposed one bug in ira. The newly introduced function remove_scratches can bump the max_regno, then the data structures regstat_n_sets_and_refs and reg_info_p which are allocated according
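The failure mode described in this comment, a per-register table sized from an earlier register count and then indexed after the count has grown, can be illustrated with a minimal C sketch. Everything below is hypothetical and is not GCC's code: `struct reg_table`, `reg_table_init`, `reg_table_ensure` and `reg_table_refs` are made-up names that only stand in for the idea of data allocated according to max_regno.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-register table sized by the register
   count at allocation time.  Not GCC's actual data structure.  */
struct reg_table {
  int *n_refs; /* one counter per register number */
  int size;    /* number of entries currently allocated */
};

/* Allocate a zero-initialized table covering registers 0..max_regno-1.
   Returns 1 on success, 0 on allocation failure.  */
static int reg_table_init(struct reg_table *t, int max_regno) {
  t->n_refs = calloc((size_t) max_regno, sizeof *t->n_refs);
  t->size = t->n_refs ? max_regno : 0;
  return t->n_refs != NULL;
}

/* Grow the table after the register count has been bumped, zeroing the
   new entries.  Returns 1 on success, 0 on allocation failure.  */
static int reg_table_ensure(struct reg_table *t, int max_regno) {
  if (max_regno <= t->size)
    return 1;
  int *p = realloc(t->n_refs, (size_t) max_regno * sizeof *p);
  if (!p)
    return 0;
  memset(p + t->size, 0, (size_t) (max_regno - t->size) * sizeof *p);
  t->n_refs = p;
  t->size = max_regno;
  return 1;
}

/* Bounds-checked read: returns -1 when regno is not covered by the table,
   which is the situation a stale table would otherwise read out of bounds.  */
static int reg_table_refs(const struct reg_table *t, int regno) {
  return (regno >= 0 && regno < t->size) ? t->n_refs[regno] : -1;
}
```

In this sketch, a table initialized for 282 registers does not cover register number 282 until `reg_table_ensure` is called with the new, larger count.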

[Bug target/96933] rs6000: inefficient code for char/short vec CTOR

2020-11-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug gcov-profile/97594] [11 Regression] new test case gcc.dg/tree-prof/pr97461.c execution failure

2020-11-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97594 --- Comment #3 from Kewen Lin --- (In reply to Martin Liška from comment #2) > (In reply to Martin Liška from comment #1) > > Mine, I see a strange error: > > > > $ Program received signal SIGBUS, Bus error. > > 0x3fffb7ceddbc in

[Bug testsuite/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705 --- Comment #3 from Kewen Lin --- The "-DMASK=2" dumping has more lines for register 282, which is introduced in ira. Something weird causes ira to dump more contexts. $ diff dump1/dump-noaddr.c.289r.ira dump2/dump-noaddr.c.289r.ira 107a108 >

[Bug tree-optimization/98464] [11 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in tree_nop_conversion_p, at tree.c:12825 by r11-4637

2020-12-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED CC|

[Bug tree-optimization/98464] [11 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in tree_nop_conversion_p, at tree.c:12825 by r11-4637

2020-12-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment

[Bug tree-optimization/98138] BB vect fail to SLP one case

2021-01-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #4 from Kewen Lin --- (In reply to Kewen Lin from comment #3) > > IIUC, in current implementation, we get four grouped stores: > { tmp[i][0], tmp[i][1], tmp[i][2], tmp[i][3] } /i=0,1,2,3/ independently > > When all these tryings

[Bug c/89126] missing -Wtype-limits for int variables

2021-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89126 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #4

[Bug tree-optimization/98138] BB vect fail to SLP one case

2021-01-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98138 --- Comment #5 from Kewen Lin --- (In reply to Kewen Lin from comment #4) > One rough idea seems: > 1) Relax this condition all_uniform_p somehow to get SLP instance building > to go deeper and get those p1/p2 loads as SLP nodes. > 2)

[Bug tree-optimization/98464] [11 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in tree_nop_conversion_p, at tree.c:12825 by r11-4637

2021-01-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 Kewen Lin changed: What|Removed |Added Assignee|linkw at gcc dot gnu.org |rguenth at gcc dot gnu.org ---

[Bug tree-optimization/98464] [11 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in tree_nop_conversion_p, at tree.c:12825 by r11-4637

2021-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 --- Comment #8 from Kewen Lin --- (In reply to Richard Biener from comment #5) > But this > > sprime = eliminate_avail (gimple_bb (SSA_NAME_DEF_STMT (use)), use); > > should make it more conservative (compared to the more desirable use

[Bug tree-optimization/98464] [11 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in tree_nop_conversion_p, at tree.c:12825 by r11-4637

2021-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 --- Comment #10 from Kewen Lin --- (In reply to rguent...@suse.de from comment #9) > On Mon, 4 Jan 2021, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98464 > > > > --- Comment #8 from Kewen Lin --- > >

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-06-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-06-24 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 --- Comment #3 from Kewen Lin --- (In reply to Vladimir Makarov from comment #2) > (In reply to Kewen Lin from comment #1) > > Created attachment 50715 [details] > > ira:consider matching cstr in all alternatives > > > > With little

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-06-24 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at

[Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST

2021-05-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #2 from Kewen Lin --- (In reply to Richard Biener from comment #1) Thanks for the comments! > There's predictive commoning which can do similar transforms and runs after > vectorization. It might be it doesn't handle these

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #4 from Kewen Lin --- (In reply to rguent...@suse.de from comment #3) > On Fri, 28 May 2021, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 > > > > --- Comment #2 from Kewen Lin --- > >

[Bug tree-optimization/100794] New: suboptimal code due to missing pre2 when vectorization fails

2021-05-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 Bug ID: 100794 Summary: suboptimal code due to missing pre2 when vectorization fails Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #6 from Kewen Lin --- Created attachment 50894 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50894&action=edit Method 1: implicitly enable pcom without unrolling once loop vectorization is enabled but pcom isn't set explicitly.

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #8 from Kewen Lin --- Created attachment 50896 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50896&action=edit M1 M2 SPEC2017 P9 eval result

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #7 from Kewen Lin --- Created attachment 50895 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50895&action=edit Method 2: let pre generate loop carried dependence for very cheap and cheap cost model.
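For readers unfamiliar with what a loop-carried reuse looks like, here is a minimal source-level sketch in C. It is not the test case from the PR and not what either attached patch emits; the function names `sum_pairs_plain` and `sum_pairs_carried` are hypothetical. It only shows the general idea that a value loaded in one iteration can be kept in a scalar and reused in the next iteration instead of being reloaded.

```c
#include <stddef.h>

/* Straightforward form: every iteration loads both a[i] and a[i + 1],
   so each element except the first and last is loaded twice.  */
static void sum_pairs_plain(const int *a, int *out, size_t n) {
  for (size_t i = 0; i + 1 < n; i++)
    out[i] = a[i] + a[i + 1];
}

/* Hand-written loop-carried form: the value loaded as a[i + 1] is carried
   into the next iteration in PREV, leaving one load per iteration.  */
static void sum_pairs_carried(const int *a, int *out, size_t n) {
  if (n < 2)
    return;
  int prev = a[0];
  for (size_t i = 0; i + 1 < n; i++) {
    int next = a[i + 1];
    out[i] = prev + next;
    prev = next; /* dependence carried across iterations */
  }
}
```

Both functions write the same n - 1 results. The carried form trades a load for a cross-iteration dependence, which is why such a transform is usually weighed against later loop vectorization.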

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 --- Comment #9 from Kewen Lin --- (In reply to rguent...@suse.de from comment #5) > On Fri, 28 May 2021, linkw at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 > > > > --- Comment #4 from Kewen Lin --- > >

[Bug tree-optimization/100794] suboptimal code due to missing pre2 when vectorization fails

2021-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100794 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Ever

[Bug tree-optimization/101291] turns infinite loop into finite

2021-07-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101291 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #1

[Bug tree-optimization/101291] turns infinite loop into finite

2021-07-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101291 --- Comment #2 from Kewen Lin --- (In reply to Kewen Lin from comment #1) > Hi Jeff, what's the option and stanza? The reason I asked is that I can't simply reproduce it locally at O2; with the C compiler it likely runs forever. I guess what

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-07-01 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 --- Comment #8 from Kewen Lin --- (In reply to rsand...@gcc.gnu.org from comment #7) > (In reply to Kewen Lin from comment #6) > > Created attachment 51066 [details] > > aarch64 XPASS failure list > > > > The patch v3 bootstrapped and

[Bug target/101235] [11/12 Regression] Fails to bootstrap with binutils 2.32

2021-06-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101235 --- Comment #3 from Kewen Lin --- I will backport the fix after July 7th, 2021 (two weeks after it went into trunk), provided this isn't urgent and the backport has been approved in the meantime.

[Bug target/101235] [11/12 Regression] Fails to bootstrap with binutils 2.32

2021-06-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101235 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-07-06 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-06-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 --- Comment #6 from Kewen Lin --- Created attachment 51066 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51066&action=edit aarch64 XPASS failure list The patch v3 bootstrapped and regression-tested on x86_64-redhat-linux and

[Bug rtl-optimization/100328] IRA doesn't model matching constraint well

2021-06-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 --- Comment #5 from Kewen Lin --- Created attachment 51065 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51065&action=edit ira: Consider matching constraint heavily with some parameter v3 The mentioned only one aarch64-linux-gnu "PASS->FAIL"

[Bug target/101235] [11/12 Regression] Fails to bootstrap with binutils 2.32

2021-06-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101235 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0

[Bug rtl-optimization/100328] IRA doesn't model dup num constraint well

2021-04-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 --- Comment #1 from Kewen Lin --- Created attachment 50715 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50715&action=edit ira:consider matching cstr in all alternatives With little understanding on ira, I am not quite sure this patch is on

[Bug rtl-optimization/100328] New: IRA doesn't model dup num constraint well

2021-04-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328 Bug ID: 100328 Summary: IRA doesn't model dup num constraint well Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST

2021-03-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398 Bug ID: 99398 Summary: Miss to optimize vector permutation fed by CTOR and CTOR/CST Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST

2021-03-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0

[Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST

2021-03-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398 --- Comment #2 from Kewen Lin --- Created attachment 50329 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50329&action=edit tested patch

[Bug tree-optimization/101944] New: suboptimal SLP for reduced case from namd_r

2021-08-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101944 Bug ID: 101944 Summary: suboptimal SLP for reduced case from namd_r Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug tree-optimization/101944] suboptimal SLP for reduced case from namd_r

2021-08-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101944 --- Comment #2 from Kewen Lin --- Back to the optimized IR, I think the problem is that the vectorized version has a longer critical path for the reduc_plus result (latency in total). For the vectorized version, _51 = diffa_41(D) *

[Bug tree-optimization/101944] suboptimal SLP for reduced case from namd_r

2021-08-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101944 --- Comment #1 from Kewen Lin --- The original costing shows the vectorized version wins. Checking the costings shows that it fails to model the cost of lane extraction; the patch was posted in:

[Bug tree-optimization/101944] suboptimal SLP for reduced case from namd_r

2021-08-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101944 --- Comment #5 from Kewen Lin --- (In reply to Richard Biener from comment #3) > On x86 we even have > > Vector cost: 136 > Scalar cost: 196 > > note that we seem to vectorize the reduction but that only happens with > -ffast-math, not

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-09-01 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #20 from Kewen Lin --- Thanks for the detailed explanation, Mike! The fusion related flags have been considered in the posted patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html. One RFC/Patch

[Bug tree-optimization/102054] New: slightly worse code as PRE on some code got disabled for loop vectorization

2021-08-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102054 Bug ID: 102054 Summary: slightly worse code as PRE on some code got disabled for loop vectorization Product: gcc Version: 12.0 Status: UNCONFIRMED Severity:

[Bug tree-optimization/102054] slightly worse code as PRE on some code got disabled for loop vectorization

2021-08-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102054 Kewen Lin changed: What|Removed |Added CC||crazylht at gmail dot com,

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #9 from Kewen Lin --- One more reduced test case: fail cmd: gcc -c -O2 -flto -mcpu=power8 pass cmd: gcc -c -O2 -flto -mcpu=power8 -mno-htm -mno-power8-fusion -- __attribute__((always_inline)) int foo(int *b) {

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #13 from Kewen Lin --- (In reply to Richard Biener from comment #10) > OPTION_MASK_P8_FUSION is purely optimization and shouldn't prevent inlining, > no? > > As of HTM it would make the testcase a user error - when using
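The point quoted above, that an optimization-only flag should not block inlining while a real ISA requirement should, can be sketched as a bitmask check in C. This is only an illustration under stated assumptions: the flag names and values below are hypothetical and are not GCC's OPTION_MASK_* definitions, `can_inline` is not rs6000_can_inline_p, and treating HTM as a hard requirement is an assumption of this sketch rather than a conclusion of the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only; these are NOT the real
   OPTION_MASK_* values used by the rs6000 port.  */
enum {
  FLAG_VSX = 1 << 0,    /* ISA feature the callee's code may rely on */
  FLAG_HTM = 1 << 1,    /* treated as an ISA requirement in this sketch */
  FLAG_FUSION = 1 << 2  /* tuning only: does not change which insns are legal */
};

/* Flags that only affect optimization and are ignored for inlining.  */
#define OPTIMIZATION_ONLY_FLAGS ((uint32_t) FLAG_FUSION)

/* Allow inlining when every non-tuning flag required by the callee is
   also enabled in the caller (callee flags are a subset of caller flags).  */
static bool can_inline(uint32_t caller_flags, uint32_t callee_flags) {
  uint32_t required = callee_flags & ~OPTIMIZATION_ONLY_FLAGS;
  return (required & ~caller_flags) == 0;
}
```

With this rule, a callee built with the fusion flag can still be inlined into a caller without it, whereas a callee that needs a feature the caller lacks is rejected.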

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #14 from Kewen Lin --- (In reply to Richard Biener from comment #11) > Note that x86 uses for example > > else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath >/* If the calle doesn't use FP expressions

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #15 from Kewen Lin --- (In reply to Florian Weimer from comment #12) > (In reply to Richard Biener from comment #10) > > As of HTM it would make the testcase a user error - when using -mcpu=power10 > > it would require building with

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #17 from Kewen Lin --- Created attachment 51357 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51357&action=edit Fix some issues in rs6000_can_inline_p As Martin pointed out, currently function rs6000_can_inline_p just returns true

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #18 from Kewen Lin --- (In reply to Martin Liška from comment #16) > > > > Thanks for the example, it looks useful! Now the field fp_expressions is > > generic, one target specific summary class seems required then. And not sure >

[Bug c/102062] powerpc suboptimal unrolling simple array sum

2021-08-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102062 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #8

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-08-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org

[Bug tree-optimization/102054] slightly worse code as PRE on some code got disabled for loop vectorization

2021-09-13 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102054 --- Comment #2 from Kewen Lin --- Yet another reduced test case from 526.blender_r. #include typedef struct QMCSampler { struct QMCSampler *next, *prev; int type; int tot; int used; double *samp2d; double offs[1][2]; }

[Bug lto/102347] "fatal error: target specific builtin not available" with MMA and LTO

2021-09-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org

[Bug target/102347] "fatal error: target specific builtin not available" with MMA and LTO

2021-09-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347 --- Comment #4 from Kewen Lin --- I found that the i386 port doesn't seem to have this issue. #include #include typedef union { __m128 x; float a[4]; } union128; #pragma GCC target("sse") int main() { union128 u; __m128 a = _mm_set_ps

[Bug tree-optimization/102383] Missing optimization for PRE after enable O2 vectorization

2021-09-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102383 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #4

[Bug target/102347] "fatal error: target specific builtin not available" with MMA and LTO

2021-09-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347 --- Comment #3 from Kewen Lin --- This doesn't seem to be a target-specific issue. I noticed the target_option tree node is created as expected when the target pragma is seen, which explains why it works well without LTO. When LTO does streaming out, it does

[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")

2021-09-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059 --- Comment #23 from Kewen Lin --- (In reply to Chip Kerchner from comment #22) > (In reply to Chip Kerchner from comment #21) - Forgot one line of code > > -- > > #pragma GCC target "cpu=power10" > > int main() { > >

[Bug target/102347] "fatal error: target specific builtin not available" with MMA and LTO

2021-09-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347 --- Comment #9 from Kewen Lin --- (In reply to Peter Bergner from comment #7) > (In reply to Martin Liška from comment #6) > > Quickly looking at the rs6000 code, it fails here: > > > > #1 0x11a0993c in rs6000_invalid_builtin > >

[Bug target/102347] "fatal error: target specific builtin not available" with MMA and LTO

2021-09-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347 Kewen Lin changed: What|Removed |Added CC||ktkachov at gcc dot gnu.org,

[Bug other/102440] Uinteger Opt/Param but the underlying type is signed

2021-09-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102440 --- Comment #3 from Kewen Lin --- (In reply to Andrew Pinski from comment #2) > The other option handling bug report I saw dealing with the awk script was > recorded as other. Thanks Andrew! I just found there is a "other", how blind I am!

[Bug testsuite/102658] [12 regression] Many test case failures after r12-4240

2021-10-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102658 --- Comment #7 from Kewen Lin --- r12-4273 caused some new expected failures: FAIL: c-c++-common/Wstringop-overflow-2.c -Wc++-compat (test for excess errors) FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for excess errors)

[Bug other/102713] [12 regression] Several failures after r12-3273

2021-10-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102713 Kewen Lin changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED

[Bug testsuite/102658] [12 regression] Many test case failures after r12-4240

2021-10-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102658 --- Comment #6 from Kewen Lin --- *** Bug 102713 has been marked as a duplicate of this bug. ***

[Bug testsuite/102658] [12 regression] Many test case failures after r12-4240

2021-10-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102658 --- Comment #9 from Kewen Lin --- There are some discussions [1] to improve the fixing way for the test cases in g++.dg and c-c++-common. So I hold the changes adding powerpc*-*-* onto them, just updated the testcases under gcc.target/powerpc/.

[Bug testsuite/102658] [12 regression] Many test case failures after r12-4240

2021-10-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102658 --- Comment #11 from Kewen Lin --- > > For the failure: > > FAIL: libgomp.graphite/force-parallel-8.c scan-tree-dump-times graphite "5 > > loops carried no dependency" 1 > > > > It's not a target specific failure, Hongtao already posted one

[Bug testsuite/102658] [12 regression] Many test case failures after r12-4240

2021-10-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102658 --- Comment #4 from Kewen Lin --- Created attachment 51576 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51576&action=edit rs6000-test-Adjust-test-cases-due-to-O2-vect Tested successfully on P9LE, note that it relies on r12-4273. Still

[Bug target/102847] [12 regression] r12-4504 breaks powerpc64 build on power 7

2021-10-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102847 Kewen Lin changed: What|Removed |Added Status|RESOLVED|REOPENED Last reconfirmed|

[Bug target/102847] [12 regression] r12-4504 breaks powerpc64 build on power 7

2021-10-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102847 --- Comment #8 from Kewen Lin --- (In reply to Richard Biener from comment #5) > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index 9cbc1af4cc9..8f527452bd0 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@

[Bug target/102789] [12 regression] libgomp.c++/simd-3.C fails after r12-4340 for 32 bits

2021-10-19 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102789 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org

[Bug target/102789] [12 regression] libgomp.c++/simd-3.C fails after r12-4340 for 32 bits

2021-10-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102789 Kewen Lin changed: What|Removed |Added CC||bergner at gcc dot gnu.org,
