[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-09-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #7 from cuilili --- (In reply to Martin Jambor from comment #6) > I believe this has been fixed? Yes.

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #3 from cuilili --- I reproduced S1244 regression on znver3. Src code: for (int i = 0; i < LEN_1D-1; i++) { a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i]; d[i] = a[i] + a[i+1]; }

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-09 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 cuilili changed: What|Removed |Added CC||lili.cui at intel dot com --- Comment #2

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #14 from cuilili --- This regression has been fixed with the commit below and we can close this ticket. https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #5 from cuilili --- (In reply to Martin Jambor from comment #4) > So is this now fixed? Yes, the attachment case has been fixed.

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-05-30 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Probably best to limit the values to reassoc-width by adding the > appropriate IntegerRange attribute in params.opt > > IntegerRange(0, 256) > > maybe?

[Bug target/104271] [12/13 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-11-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #12 from cuilili --- This regression caused by the store forwarding issue, we eliminate the redundant two pairs of loads and stores which have store forwarding issue by inlining. This regression has been fixed by

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 105493, which changed state. Bug 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 cuilili changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Martin is currently re-benchmarking GCC 12 on AMD, so let's see if there's > anything left on those. AMD may not have this issue, Richard fixed AMD regression with

[Bug target/105493] New: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 Bug ID: 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718 Product: gcc

[Bug target/104723] [12 regression] Redundant usage of stack

2022-04-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #11 from cuilili --- (In reply to Jakub Jelinek from comment #10) > And for the backend, the question is how big the penalty for the overlapping > store is compared to doing multiple non-overlapping stores. Say for those > 49

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-04-15 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #9 from cuilili --- Really appreciate for your reply, I debugged SRA pass with the small testcase and found that SRA dose not handle this situation. SRA cannot split callee's first parameter for "Do not decompose non-BLKmode

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-29 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #7 from cuilili --- Created attachment 52706 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706=edit Add a heuristic for eliminate redundant load and store in inline pass. Hi Richard, Could you help take a look? This is my

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #6 from cuilili --- I created a patch to fix this regression. The patch is under performance testing. Will sent it out later.

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-02 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #9 from cuilili --- (In reply to cuilili from comment #3) > (In reply to Hongtao.liu from comment #1) > > STF issue here? > correct comment #3 I used perf to collect the "ld_blocks.store_forward" event for those two test cases,

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-01 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #3 from cuilili --- (In reply to Hongtao.liu from comment #1) > STF issue here? Yes, Since "YMMWORD PTR [rsp-72]" across the cache line, it has STLF issue here. vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #28 from cuilili --- (In reply to H.J. Lu from comment #25) > Can this be mitigated by removing redundant load and store? Yes, inlining say_sphere can remove redundant loads and stores, O3 does inlining, but O2 is more sensitive to

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #24 from cuilili --- (In reply to cuilili from comment #23) > (In reply to Richard Biener from comment #17) > > I do wonder though how CLX is fine with such access pattern ;) (did you > > test > > with just -O2?) > Sorry, correct

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 cuilili changed: What|Removed |Added CC||lili.cui at intel dot com --- Comment #23