[Bug tree-optimization/56223] New: Integer ABS is not recognized for more complicated pattern

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223 Bug #: 56223 Summary: Integer ABS is not recognized for more complicated pattern Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRME

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223 --- Comment #1 from Yuri Rumyantsev 2013-02-06 11:41:21 UTC --- Created attachment 29367 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29367 testcase to reproduce

[Bug target/56200] queens benchmark is faster with -O0 than with any other optimization level

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56200 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #5 from Yuri Rumyantsev 2013-02-11 13:42:49 UTC --- This pattern is already recognized by simplify_bitwise_binary but only for usual int type, i.e. if we change all short types to the ordinary int (or unsigned) this simplificat

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #7 from Yuri Rumyantsev 2013-02-12 13:05:16 UTC --- (In reply to comment #6) > (In reply to comment #5) > > This pattern is already recognized by simplify_bitwise_binary but only for > > usual int type, i.e. if we change all s

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #9 from Yuri Rumyantsev 2013-02-12 14:43:53 UTC --- (In reply to comment #8) > (In reply to comment #7) > > (In reply to comment #6) > > > (In reply to comment #5) > > > > This pattern is already recognized by simplify_bitwis

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #11 from Yuri Rumyantsev 2013-02-14 12:03:37 UTC --- I did measurements of 3 possible fixes: 1. Comment out 2 patterns related to type sinking. 2. Comment out 1st pattern only. 3. Prohibit type sinking if source type (of def

[Bug target/56309] -O3 optimizer generates conditional moves instead of compare and branch resulting in almost 2x slower code

2013-02-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com

[Bug tree-optimization/56415] New: Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 Bug #: 56415 Summary: Performance regression after fix for 56273 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56415] Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 --- Comment #1 from Yuri Rumyantsev 2013-02-21 08:59:33 UTC --- Created attachment 29515 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29515 testcase Test must be compiled with "-O3 -funroll-loops" options.

[Bug tree-optimization/56415] Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 --- Comment #2 from Yuri Rumyantsev 2013-02-21 09:11:21 UTC --- This bug was introduced by the following fix: r195940 is guilty: Author: rguenth Date: Mon Feb 11 13:33:19 2013 New Revision: 195940 URL: http://gcc.gnu.org/viewcvs?r

[Bug lto/56483] New: LTO issue with expanding GIMPLE_COND

2013-02-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56483 Bug #: 56483 Summary: LTO issue with expanding GIMPLE_COND Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Pr

[Bug lto/56483] LTO issue with expanding GIMPLE_COND

2013-02-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56483 --- Comment #1 from Yuri Rumyantsev 2013-02-28 14:43:22 UTC --- Created attachment 29551 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29551 test-case to reproduce Test must be compiled with -O2 -flto -fno-inline options

[Bug tree-optimization/56595] New: Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization.

2013-03-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595 Bug #: 56595 Summary: Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization. Classification: Unclassified Product: gcc Version: 4.8.0

[Bug tree-optimization/56595] Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization.

2013-03-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595 --- Comment #1 from Yuri Rumyantsev 2013-03-11 13:38:25 UTC --- Created attachment 29636 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29636 testcase This test must be compiled with the following options for x86: -ffree-line-le

[Bug tree-optimization/56688] New: Fortran save statement prevents loop vectorization.

2013-03-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56688 Bug #: 56688 Summary: Fortran save statement prevents loop vectorization. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: norma

[Bug tree-optimization/56717] New: Enhance Dot-product pattern recognition to avoid mult widening.

2013-03-25 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56717 Bug #: 56717 Summary: Enhance Dot-product pattern recognition to avoid mult widening. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCON

[Bug tree-optimization/56778] New: ICE on several benchmarks after r196775.

2013-03-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56778 Bug #: 56778 Summary: ICE on several benchmarks after r196775. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56799] New: Runfail after r197060+r197082.

2013-04-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56799 Bug #: 56799 Summary: Runfail after r197060+r197082. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority

[Bug tree-optimization/56799] Runfail after r197060+r197082.

2013-04-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56799 --- Comment #1 from Yuri Rumyantsev 2013-04-01 14:51:28 UTC --- It is sufficient to compile test with '-O2' option.

[Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 Bug #: 56812 Summary: Simple loop is not SLP-vectorized after r196872 Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #1 from Yuri Rumyantsev 2013-04-02 11:22:45 UTC --- Created attachment 29775 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29775 testcase Need to compile with -O3 -funroll-loops options.

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #2 from Yuri Rumyantsev 2013-04-02 11:41:23 UTC --- Sorry, I made a typo in the -march option - it must be -march=corei7 -mavx.

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #4 from Yuri Rumyantsev 2013-04-02 13:27:15 UTC --- Yes, the test-case is correct. If we delete your changes we get the following (with -ftree-vectorizer-verbose-3): t.cc:12: note: vectorizing stmts using SLP.BASIC BLOCK VEC

[Bug tree-optimization/56826] New: Run-fail after r197189.

2013-04-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56826 Bug #: 56826 Summary: Run-fail after r197189. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug tree-optimization/56826] Run-fail after r197189.

2013-04-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56826 --- Comment #1 from Yuri Rumyantsev 2013-04-03 09:48:59 UTC --- Created attachment 29789 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29789 test-case to reproduce To reproduce the failure the test must be compiled on x86 corei7 w

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #12 from Yuri Rumyantsev 2013-04-08 14:03:45 UTC --- Richard, We found out another issue related to your fix (r196872), namely for the attached test-case t1.c function vect_gen_niters_for_prolog_loop() uses non-invariant poi

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #13 from Yuri Rumyantsev 2013-04-08 14:05:26 UTC --- Created attachment 29824 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29824 testcase The following options were used to compile on x86: -O3 -funroll-loops -ffast

[Bug tree-optimization/56878] New: Issue with candidate choice in vect_gen_niters_for_prolog_loop.

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878 Bug #: 56878 Summary: Issue with candidate choice in vect_gen_niters_for_prolog_loop. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCON

[Bug tree-optimization/56878] Issue with candidate choice in vect_gen_niters_for_prolog_loop.

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878 --- Comment #1 from Yuri Rumyantsev 2013-04-08 15:47:05 UTC --- Created attachment 29826 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29826 testcase Need to be compiled on x86 with the following options: -O3 -funroll-loops -ffas

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #15 from Yuri Rumyantsev 2013-04-08 15:48:45 UTC --- New bug has been opened for it: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878

[Bug rtl-optimization/56885] [4.8/4.9 Regression] ICE: in assign_by_spills, at lra-assigns.c:1268 with -O -fschedule-insns -fselective-scheduling

2013-04-09 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56885 --- Comment #5 from Yuri Rumyantsev 2013-04-09 13:33:53 UTC --- I did simple investigation and found out that 1. Test is compiled successfully without selective scheduling, i.e. with '-fschedule-insns' only. 2. The problem is that selecti

[Bug rtl-optimization/56885] [4.8/4.9 Regression] ICE: in assign_by_spills, at lra-assigns.c:1268 with -O -fschedule-insns -fselective-scheduling

2013-04-09 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56885 --- Comment #6 from Yuri Rumyantsev 2013-04-09 14:22:28 UTC --- Forgot to mention that __builtin_memset and function argument are not interchangeable since both use the same register di.

[Bug tree-optimization/56935] New: Basic block is not SLP-vectorizeed after r197635.

2013-04-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 Bug #: 56935 Summary: Basic block is not SLP-vectorizeed after r197635. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #1 from Yuri Rumyantsev 2013-04-12 14:01:50 UTC --- Created attachment 29862 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29862 testcase Need to be compiled with the following options: -O3 -mavx -march=corei7

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #4 from Yuri Rumyantsev 2013-04-15 14:54:50 UTC --- Richard, both subq's access the same cache line, which means that after the 1st store the 2nd load will stall until the data cache update finishes (this is not exact explanatio

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #6 from Yuri Rumyantsev 2013-04-22 14:46:16 UTC --- Richard, Sorry for the trouble: we found out the real cause of the performance degradation - code layout was changed after your fix and it caused ~5% slowdown on 253.perlbmk

[Bug tree-optimization/54240] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 Bug #: 54240 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54241] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241 Bug #: 54241 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54935] New: No way to do if converison

2012-10-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54935 Bug #: 54935 Summary: No way to do if converison Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority:

[Bug tree-optimization/54935] No way to do if converison

2012-10-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54935 Yuri Rumyantsev changed: What|Removed |Added Version|unknown |4.8.0 --- Comment #1 from Yur

[Bug tree-optimization/54939] New: Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 Bug #: 54939 Summary: Very poor vectorization of loops with complex arithmetic Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 --- Comment #2 from Yuri Rumyantsev 2012-10-16 14:54:50 UTC --- Created attachment 28455 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28455 test reproducer

[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 --- Comment #3 from Yuri Rumyantsev 2012-10-16 15:06:19 UTC --- I looked through the list of all issues related to vectorization but could not find a duplicate.

[Bug tree-optimization/57124] 254.gap@spec2000 got miscompare after r198413

2013-05-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57124 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug tree-optimization/57430] New: Redundant move instruction is produced after function inlining

2013-05-27 Thread ysrumyan at gmail dot com
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com This is based on performance analysis of eembc2.0 suite on Atom processor in comparison with clang compiler. I prepared a simple reproducer that

[Bug tree-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-27 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #1 from Yuri Rumyantsev --- Created attachment 30203 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30203&action=edit testcase Need to compile with "-O2 -m32" options on x86.

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #2 from Yuri Rumyantsev --- I don't believe that this is related to rtl optimizations, but rather to inlining phase. To prove it I did small changes in t.c for remove.c (it now has type void): void remove (node ** head, node* elt) {

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #3 from Yuri Rumyantsev --- Created attachment 30213 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30213&action=edit modified testcase This is modified testcase which does not have a problem with redundant move instruction in in

[Bug rtl-optimization/57468] New: [4.9 Regression] 26% performance drop on important benchmark after r199298.

2013-05-30 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We found significant performance drop after changes in lra phase, which can be demonstrated on example from http://gcc.gnu.org/bugzilla

[Bug rtl-optimization/57468] [4.9 Regression] 26% performance drop on important benchmark after r199298.

2013-05-30 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57468 --- Comment #1 from Yuri Rumyantsev --- Created attachment 30224 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30224&action=edit test-case to reproduce It should be compiled on x86 with "-O2 -m32" options.

[Bug tree-optimization/57558] New: Issue with number of iterations calculation

2013-06-07 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com For the following simple test-case extracted from 254.gap (spec2000): typedef unsigned long ul; void foo (ul* __restrict x, ul* __restrict y, ul n) { ul i; for (i=1; i<=n; i++, x++

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-06-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #4 from Yuri Rumyantsev --- I have not seen any comments about my latest note - have you any ideas about this issue? Thanks ahead.

[Bug lto/57602] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-06-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug rtl-optimization/56129] Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom

2013-07-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56129 Yuri Rumyantsev changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/57796] AVX2 gather vectorization: code bloat and reduction of performance

2013-07-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug target/57796] AVX2 gather vectorization: code bloat and reduction of performance

2013-07-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796 --- Comment #4 from Yuri Rumyantsev --- (In reply to Jakub Jelinek from comment #3) > By tuning I've meant the vectorizer cost model. If the desirability of > gathers vs. no vectorization at all doesn't depend only on the insns in the > loop, but

[Bug target/57954] AVX missing vxorps (zeroing) before vcvtsi2s %edx, slow down AVX code

2013-07-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug lto/57602] [4.9 Regression] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-08-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 --- Comment #12 from Yuri Rumyantsev --- Jan, I tried to test your fix and got the following error message while building trunk compiler (with your fix): ../../../../../trunk/libstdc++-v3/src/c++11/fstream-inst.cc:48:1: error: node is alias but

[Bug lto/57602] [4.9 Regression] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-08-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 --- Comment #14 from Yuri Rumyantsev --- Hi Jan, I checked that all benches from spec2000 run successfully with -flto options and the eembc_2_0 suite also ran successfully with lto (for 32-bit mode). So go ahead and commit your fix. Best re

[Bug tree-optimization/58135] New: [x86] Missed opportunities for partial SLP

2013-08-12 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com If we consider the following simple test-case int a[100]; void foo() { a[0] = a[1] = a[2] = a[3] = 0; } SLP vectorization of basic block takes place: gcc -S -O3 -m32 t.c -ftree

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #8 from Yuri Rumyantsev --- Created attachment 30751 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30751&action=edit modified test-case Modified test-case to reproduce sub-optimal register allocation.

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #9 from Yuri Rumyantsev --- The issue still exists in the 4.9 compiler, and we got another 30% degradation after the r202165 fix. It can be reproduced with the attached modified test-case on any 4.9 compiler, namely code produced for inn

[Bug rtl-optimization/58384] New: [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-09-10 Thread ysrumyan at gmail dot com
Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We also assume that arm can have the same problem at given benchmark if -flto is

[Bug rtl-optimization/58384] [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-09-10 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58384 --- Comment #1 from Yuri Rumyantsev --- Created attachment 30791 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30791&action=edit test-case to reproduce This is compile only test which must be compiled with pre-reload scheduler, i.e. with '-

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #10 from Yuri Rumyantsev --- After fix rev. 202468 assembly looks slightly better but we met with another RA inefficiency which can be illustrated on the attached (t1.c) test compiled with options "-march=atom -mtune=atom -m32 -O2" tha

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #11 from Yuri Rumyantsev --- Created attachment 30816 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30816&action=edit test-case to reproduce t1.c must be compiled on x86 with options: -O2 -march=atom -mtune=atom -mfpmath=sse -m

[Bug tree-optimization/58444] New: [4.9 regression] Runfail on spec2006/434.zeusmp after r202516.

2013-09-17 Thread ysrumyan at gmail dot com
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We found out that phase loop distribution is responsible for it, namely wrong cfg is generated (after ldist) for pdv.f if it was compiled with options

[Bug tree-optimization/58459] New: [4.9 regression] Loop invariant is not hoisted out of loop after r202525.

2013-09-18 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We noticed a significant performance regression on an important bench from the eembc2.0 suite, which can be exhibited with the attached test-case. Assembly

[Bug tree-optimization/58459] [4.9 regression] Loop invariant is not hoisted out of loop after r202525.

2013-09-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58459 --- Comment #1 from Yuri Rumyantsev --- Created attachment 30850 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30850&action=edit test-case to reproduce Test must be compiled on x86 with options -Ofast -m32 -march=atom -mtune=atom

[Bug rtl-optimization/58826] New: Runfail on CPU2006 436.cactusADM with after r203739 for core-avx2 target.

2013-10-21 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com To reproduce it is sufficient to compile only one source file - StaggeredLeapfrog2.F which must be preprocessed. In rtl-dump after reload we

[Bug rtl-optimization/58826] Runfail on CPU2006 436.cactusADM with after r203377 for core-avx2 target.

2013-10-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58826 --- Comment #2 from Yuri Rumyantsev --- In fact LRA is responsible for this failure - there is a bug in constant regeneration. LRA correctly regenerates all occurrences of a virtual register which is not allocated (i.e. it does not have a register) bu

[Bug rtl-optimization/58853] New: [4.9 regression] ICE after r203937

2013-10-23 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com After fix for memcpy/memset expansion on x86 target we have met with ICE on the following simple test: void my_memcpy (char *dest, const char *src, int n) { __builtin_memcpy (dest, src, n); } which

[Bug rtl-optimization/58384] [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-11-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58384 --- Comment #4 from Yuri Rumyantsev --- I checked that the fix for PR58831 does not cure the issue, but we can close this bug since 253.perlbmk now passes with the pre-reload scheduler.

[Bug rtl-optimization/59036] New: [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.

2013-11-07 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com After patch to improve register preferencing in IRA and to *remove regmove* pass we noticed performance degradation on several benchmarks

[Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.

2013-11-07 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036 --- Comment #1 from Yuri Rumyantsev --- Created attachment 31178 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31178&action=edit test-case to reproduce The test needs to be compiled with the -m32 option for any x86 target.

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 --- Comment #7 from Yuri Rumyantsev --- Created attachment 31217 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31217&action=edit Additional patch for r203634. See my comments.

[Bug rtl-optimization/59133] New: [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com I attached a simple reproducer to compile with -march=core-avx2 -c -O3 -ffast-math options (-ffast-math is essential) and got the following ICE: t.c: In

[Bug rtl-optimization/59133] [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59133 --- Comment #1 from Yuri Rumyantsev --- Created attachment 31219 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31219&action=edit test-case to reproduce Need to be compiled with -m32 -march=core-avx2 -O3 -ffast-math options to reproduce ICE

[Bug rtl-optimization/59133] [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59133 --- Comment #2 from Yuri Rumyantsev --- It is worth noting that -m32 option is also essential for reproducing.

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 --- Comment #9 from Yuri Rumyantsev --- Hi Uros, I decided that the bug owner should fix it and send my patch (or modified one) for review to GCC community, i.e. I was not planning to fix it. But if I should do it pls let me know and I send it to

[Bug rtl-optimization/58826] [4.9 Regression] Runfail on CPU2006 436.cactusADM with after r203377 for core-avx2 target.

2013-11-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58826 --- Comment #6 from Yuri Rumyantsev --- I assume that this bug should be closed since it is not reproducible after the latest LRA fix.

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-11-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com --- Comment

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #4 from Yuri Rumyantsev --- It turned out that the proposed fix does not help trunk compilers since now another huge routine (read_input) is inlined first, and for perdida we get the following message: not inlinable: iztaccihuatl/17 -> p

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #5 from Yuri Rumyantsev --- Created attachment 31348 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31348&action=edit test-case to reproduce It needs to be compiled with -Ofast -flto options to reproduce.

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #7 from Yuri Rumyantsev --- I saw that on old compiler sources (dated 20130911) with my patch 'perdida' was inlined without any additional inline parameters (using -flto) but now it is not inlined since another large function read

[Bug rtl-optimization/55342] New: [LRA,x86] Non-optimal code for simple loop with LRA

2012-11-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 Bug #: 55342 Summary: [LRA,x86] Non-optimal code for simple loop with LRA Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: norma

[Bug rtl-optimization/55342] [4.8 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2012-11-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #2 from Yuri Rumyantsev 2012-11-19 12:06:20 UTC --- The patched compiler produces better binaries but we still have -6% performance degradation on corei7. The main cause of it is that LRA compiler generates spill of 'pure' byt

[Bug tree-optimization/55731] New: Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 Bug #: 55731 Summary: Issue with complete innermost loop unrolling (cunrolli) Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #1 from Yuri Rumyantsev 2012-12-18 14:23:30 UTC --- Created attachment 28997 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28997 testcase1

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #2 from Yuri Rumyantsev 2012-12-18 14:24:05 UTC --- Created attachment 28998 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28998 testcase2

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #4 from Yuri Rumyantsev 2012-12-19 09:17:40 UTC --- (In reply to comment #3) > The reason is that unrolling early can be harmful to for example vectorization > and thus cunrolli restricts itself to "obviously" profitable cases.

[Bug tree-optimization/55970] New: [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 Bug #: 55970 Summary: [x86] Avoid reverse order of function argument gimplifying Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIR

[Bug tree-optimization/55970] [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 --- Comment #1 from Yuri Rumyantsev 2013-01-14 14:46:48 UTC --- Created attachment 29162 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29162 testcase

[Bug tree-optimization/55970] [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 --- Comment #3 from Yuri Rumyantsev 2013-01-14 15:15:52 UTC --- I pointed out that this code is not C standard compliant but it occurred in customer application that should be ported to x86 platform. This bug is not issued by gcc and very

[Bug tree-optimization/53787] Possible IPA-SRA / IPA-CP improvement

2013-01-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53787 --- Comment #14 from Yuri Rumyantsev 2013-01-22 15:32:06 UTC --- Created attachment 29250 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29250 testcase in F90 Reproducer for IPA_CP

[Bug tree-optimization/53787] Possible IPA-SRA / IPA-CP improvement

2013-01-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53787 Yuri Rumyantsev changed: What|Removed |Added CC||ysrumyan at gmail dot com

[Bug rtl-optimization/56129] New: Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom

2013-01-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56129 Bug #: 56129 Summary: Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom Classification: Unclassified Product: gcc Version: 4.8.0

[Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom.

2013-01-30 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151 Bug #: 56151 Summary: Performance degradation after r194054 on x86 Atom. Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal
