[Bug rtl-optimization/55342] New: [LRA,x86] Non-optimal code for simple loop with LRA

2012-11-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 Bug #: 55342 Summary: [LRA,x86] Non-optimal code for simple loop with LRA Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity:

[Bug rtl-optimization/55342] [4.8 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2012-11-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-11-19 12:06:20 UTC --- The patched compiler produces better binaries but we still have -6% performance degradation on corei7. The main cause of it is that LRA compiler

[Bug tree-optimization/55731] New: Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 Bug #: 55731 Summary: Issue with complete innermost loop unrolling (cunrolli) Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-12-18 14:23:30 UTC --- Created attachment 28997 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28997 testcase1

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-18 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-12-18 14:24:05 UTC --- Created attachment 28998 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28998 testcase2

[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

2012-12-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-12-19 09:17:40 UTC --- (In reply to comment #3) The reason is that unrolling early can be harmful to for example vectorization and thus cunrolli restricts itself

[Bug tree-optimization/55970] New: [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 Bug #: 55970 Summary: [x86] Avoid reverse order of function argument gimplifying Classification: Unclassified Product: gcc Version: unknown Status:

[Bug tree-optimization/55970] [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-01-14 14:46:48 UTC --- Created attachment 29162 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29162 testcase

[Bug tree-optimization/55970] [x86] Avoid reverse order of function argument gimplifying

2013-01-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 --- Comment #3 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-01-14 15:15:52 UTC --- I pointed out that this code is not C standard compliant but it occurred in customer application that should be ported to x86 platform. This bug

[Bug tree-optimization/53787] Possible IPA-SRA / IPA-CP improvement

2013-01-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53787 --- Comment #14 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-01-22 15:32:06 UTC --- Created attachment 29250 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29250 testcase in F90 Reproducer for IPA_CP

[Bug tree-optimization/53787] Possible IPA-SRA / IPA-CP improvement

2013-01-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53787 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC

[Bug rtl-optimization/56129] New: Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom

2013-01-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56129 Bug #: 56129 Summary: Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom Classification: Unclassified Product: gcc Version: 4.8.0

[Bug tree-optimization/56151] New: Performance degradation after r194054 on x86 Atom.

2013-01-30 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151 Bug #: 56151 Summary: Performance degradation after r194054 on x86 Atom. Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity:

[Bug tree-optimization/56151] Performance degradation after r194054 on x86 Atom.

2013-01-30 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56151 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-01-30 15:20:07 UTC --- Created attachment 29306 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29306 testcase to reproduce

[Bug tree-optimization/55970] [x86] Avoid reverse order of function argument gimplifying

2013-02-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55970 --- Comment #5 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-01 13:34:20 UTC --- I sent for review a simple fix that can be considered as workaround for this issue - I simply change macros #define PUSH_ARGS_REVERSED

[Bug rtl-optimization/56175] New: Issue with combine phase on x86.

2013-02-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 Bug #: 56175 Summary: Issue with combine phase on x86. Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug rtl-optimization/56175] Issue with combine phase on x86.

2013-02-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-01 15:58:49 UTC --- Created attachment 29330 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29330 testcase This test must be compiled with the following options

[Bug rtl-optimization/56175] Issue with combine phase on x86.

2013-02-04 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #3 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-04 08:55:12 UTC --- Created attachment 29345 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29345 real test-case Need to be compiled with proposed options.

[Bug tree-optimization/56223] New: Integer ABS is not recognized for more complicated pattern

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223 Bug #: 56223 Summary: Integer ABS is not recognized for more complicated pattern Classification: Unclassified Product: gcc Version: 4.8.0 Status:

[Bug tree-optimization/56223] Integer ABS is not recognized for more complicated pattern

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56223 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-06 11:41:21 UTC --- Created attachment 29367 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29367 testcase to reproduce

[Bug target/56200] queens benchmark is faster with -O0 than with any other optimization level

2013-02-06 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56200 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC

[Bug rtl-optimization/58826] New: Runfail on CPU2006 436.cactusADM with after r203739 for core-avx2 target.

2013-10-21 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com To reproduce it is sufficient to compile only one source file - StaggeredLeapfrog2.F which must be preprocessed. In rtl-dump after reload we

[Bug rtl-optimization/58826] Runfail on CPU2006 436.cactusADM with after r203377 for core-avx2 target.

2013-10-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58826 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com --- In fact LRA is responsible for this failure - there is a bug in constant regeneration. LRA correctly regenerates all occurrences of virtual register which is not allocated (i.e

[Bug rtl-optimization/58853] New: [4.9 regression] ICE after r203937

2013-10-23 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com After fix for memcpy/memset expansion on x86 target we have met with ICE on the following simple test: void my_memcpy (char *dest, const char *src, int n) { __builtin_memcpy (dest, src, n); } which

[Bug rtl-optimization/58384] [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-11-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58384 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com --- I checked that the fix for PR58831 does not cure the issue, but we can close this bug since 253.perlbmk now passes with the pre-reload scheduler.

[Bug rtl-optimization/59036] New: [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.

2013-11-07 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com After patch to improve register preferencing in IRA and to *remove regmove* pass we noticed performance degradation on several benchmarks

[Bug rtl-optimization/59036] [4.9 regression] Performance degradation after r204212 on 32-bit x86 targets.

2013-11-07 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59036 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 31178 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31178&action=edit test-case to reproduce The test needs to be compiled with the -m32 option for any x86 target.

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 --- Comment #7 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 31217 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31217&action=edit Additional patch for r203634. See my comments.

[Bug rtl-optimization/59133] New: [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com I attached a simple reproducer to compile with -march=core-avx2 -c -O3 -ffast-math options (-ffast-math is essential) and got the following ICE: t.c

[Bug rtl-optimization/59133] [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59133 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 31219 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31219&action=edit test-case to reproduce Need to be compiled with -m32 -march=core-avx2 -O3 -ffast-math

[Bug rtl-optimization/59133] [4.9 regression] ICE after r204219 on SPEC2006 435.gromacs.

2013-11-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59133 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com --- It is worth noting that -m32 option is also essential for reproducing.

[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 --- Comment #9 from Yuri Rumyantsev ysrumyan at gmail dot com --- Hi Uros, I decided that the bug owner should fix it and send my patch (or modified one) for review to GCC community, i.e. I was not planning to fix it. But if I should do it pls

[Bug rtl-optimization/58826] [4.9 Regression] Runfail on CPU2006 436.cactusADM with after r203377 for core-avx2 target.

2013-11-19 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58826 --- Comment #6 from Yuri Rumyantsev ysrumyan at gmail dot com --- I assume that this bug should be closed since it is not reproducible after the latest LRA fix.

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-11-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com --- It turned out that the proposed fix does not help trunk compilers since now another huge routine is inlined first (read_input) and for perdida we got the following message

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #5 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 31348 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31348&action=edit test-case to reproduce It needs to be compiled with -Ofast -flto options to reproduce.

[Bug ipa/58721] [4.9 Regression] The subroutine perdida is no longer inlined in fatigue.f90

2013-12-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58721 --- Comment #7 from Yuri Rumyantsev ysrumyan at gmail dot com --- I saw that on old compiler sources (dated 20130911) with my patch 'perdida' was inlined without any additional inline parameters (using -flto) but now it is not inlined since

[Bug tree-optimization/54240] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 Bug #: 54240 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54241] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241 Bug #: 54241 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54935] New: No way to do if converison

2012-10-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54935 Bug #: 54935 Summary: No way to do if converison Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority:

[Bug tree-optimization/54935] No way to do if converison

2012-10-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54935 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added Version|unknown |4.8.0

[Bug tree-optimization/54939] New: Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 Bug #: 54939 Summary: Very poor vectorization of loops with complex arithmetic Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-10-16 14:54:50 UTC --- Created attachment 28455 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28455 test reproducer

[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic

2012-10-16 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 --- Comment #3 from Yuri Rumyantsev ysrumyan at gmail dot com 2012-10-16 15:06:19 UTC --- I looked through the list of all issues related to vectorization but could not find duplicate.

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #5 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-11 13:42:49 UTC --- This pattern is already recognized by simplify_bitwise_binary but only for usual int type, i.e. if we change all short types to the ordinary int

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #7 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-12 13:05:16 UTC --- (In reply to comment #6) (In reply to comment #5) This pattern is already recognized by simplify_bitwise_binary but only for usual int type

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #9 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-12 14:43:53 UTC --- (In reply to comment #8) (In reply to comment #7) (In reply to comment #6) (In reply to comment #5) This pattern is already recognized

[Bug tree-optimization/56175] Issue with combine phase on x86.

2013-02-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175 --- Comment #11 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-14 12:03:37 UTC --- I did measurements of 3 possible fixes: 1. Comment out 2 patterns related to type sinking. 2. Comment out 1st pattern only. 3. Prohibit type

[Bug target/56309] -O3 optimizer generates conditional moves instead of compare and branch resulting in almost 2x slower code

2013-02-14 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC

[Bug tree-optimization/56415] New: Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 Bug #: 56415 Summary: Performance regression after fix for 56273 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56415] Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-21 08:59:33 UTC --- Created attachment 29515 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29515 testcase Test must be compiled with -O3 -funroll-loops options.

[Bug tree-optimization/56415] Performance regression after fix for 56273

2013-02-21 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56415 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-21 09:11:21 UTC --- This bug was introduced by the following fix: r195940 is guilty: Author: rguenth Date: Mon Feb 11 13:33:19 2013 New Revision: 195940 URL

[Bug lto/56483] New: LTO issue with expanding GIMPLE_COND

2013-02-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56483 Bug #: 56483 Summary: LTO issue with expanding GIMPLE_COND Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal

[Bug lto/56483] LTO issue with expanding GIMPLE_COND

2013-02-28 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56483 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-02-28 14:43:22 UTC --- Created attachment 29551 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29551 test-case to reproduce Test must be compiled with -O2 -flto -fno

[Bug tree-optimization/56595] New: Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization.

2013-03-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595 Bug #: 56595 Summary: Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization. Classification: Unclassified Product: gcc Version: 4.8.0

[Bug tree-optimization/56595] Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization.

2013-03-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-03-11 13:38:25 UTC --- Created attachment 29636 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29636 testcase This test must be compiled with the following options

[Bug tree-optimization/56688] New: Fortran save statement prevents loop vectorization.

2013-03-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56688 Bug #: 56688 Summary: Fortran save statement prevents loop vectorization. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity:

[Bug tree-optimization/56717] New: Enhance Dot-product pattern recognition to avoid mult widening.

2013-03-25 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56717 Bug #: 56717 Summary: Enhance Dot-product pattern recognition to avoid mult widening. Classification: Unclassified Product: gcc Version: 4.9.0 Status:

[Bug tree-optimization/56778] New: ICE on several benchmarks after r196775.

2013-03-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56778 Bug #: 56778 Summary: ICE on several benchmarks after r196775. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56799] New: Runfail after r197060+r197082.

2013-04-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56799 Bug #: 56799 Summary: Runfail after r197060+r197082. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56799] Runfail after r197060+r197082.

2013-04-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56799 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-01 14:51:28 UTC --- It is sufficient to compile test with '-O2' option.

[Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 Bug #: 56812 Summary: Simple loop is not SLP-vectorized after r196872 Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-02 11:22:45 UTC --- Created attachment 29775 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29775 testcase Need to compile with -O3 -funroll-loops options.

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-02 11:41:23 UTC --- Sorry, I made a typo in the -march option - it must be -march=corei7 -mavx.

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-02 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-02 13:27:15 UTC --- Yes, the test-case is correct. If we delete your changes we get the following (with -ftree-vectorizer-verbose-3): t.cc:12: note: vectorizing stmts

[Bug tree-optimization/56826] New: Run-fail after r197189.

2013-04-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56826 Bug #: 56826 Summary: Run-fail after r197189. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug tree-optimization/56826] Run-fail after r197189.

2013-04-03 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56826 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-03 09:48:59 UTC --- Created attachment 29789 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29789 test-case to reproduce To reproduce the failure the test must

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #12 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-08 14:03:45 UTC --- Richard, We found out another issue related to your fix (r196872), namely for the attached test-case t1.c function vect_gen_niters_for_prolog_loop

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #13 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-08 14:05:26 UTC --- Created attachment 29824 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29824 testcase The following options were used to compile on x86

[Bug tree-optimization/56878] New: Issue with candidate choice in vect_gen_niters_for_prolog_loop.

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878 Bug #: 56878 Summary: Issue with candidate choice in vect_gen_niters_for_prolog_loop. Classification: Unclassified Product: gcc Version: 4.9.0 Status:

[Bug tree-optimization/56878] Issue with candidate choice in vect_gen_niters_for_prolog_loop.

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-08 15:47:05 UTC --- Created attachment 29826 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29826 testcase Need to be compiled on x86 with the following options

[Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872

2013-04-08 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812 --- Comment #15 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-08 15:48:45 UTC --- New bug has been opened for it: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878

[Bug rtl-optimization/56885] [4.8/4.9 Regression] ICE: in assign_by_spills, at lra-assigns.c:1268 with -O -fschedule-insns -fselective-scheduling

2013-04-09 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56885 --- Comment #5 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-09 13:33:53 UTC --- I did simple investigation and found out that 1. Test is compiled successfully without selective scheduling, i.e. with '-fschedule-insns' only. 2

[Bug rtl-optimization/56885] [4.8/4.9 Regression] ICE: in assign_by_spills, at lra-assigns.c:1268 with -O -fschedule-insns -fselective-scheduling

2013-04-09 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56885 --- Comment #6 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-09 14:22:28 UTC --- Forgot to mention that __builtin_memset and function argument are not interchangeable since both use the same register di.

[Bug tree-optimization/56935] New: Basic block is not SLP-vectorizeed after r197635.

2013-04-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 Bug #: 56935 Summary: Basic block is not SLP-vectorizeed after r197635. Classification: Unclassified Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-12 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-12 14:01:50 UTC --- Created attachment 29862 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29862 testcase Need to be compiled with the following options: -O3

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-15 14:54:50 UTC --- Richard, both subq's access the same cache line and it means that after the 1st store the 2nd load will stall till finish updating data cache

[Bug tree-optimization/56935] Basic block is not SLP-vectorizeed after r197635.

2013-04-22 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56935 --- Comment #6 from Yuri Rumyantsev ysrumyan at gmail dot com 2013-04-22 14:46:16 UTC --- Richard, Sorry for troubles since we found out the real cause of performance degradation - code layout was changed after your fix and it caused ~5

[Bug tree-optimization/57124] 254.gap@spec2000 got miscompare after r198413

2013-05-15 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57124 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug tree-optimization/57430] New: Redundant move instruction is produced after function inlining

2013-05-27 Thread ysrumyan at gmail dot com
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com This is based on performance analysis of eembc2.0 suite on Atom processor in comparison with clang compiler. I prepared a simple reproducer

[Bug tree-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-27 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 30203 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30203&action=edit testcase Need to compile with -O2 -m32 options on x86.

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com --- I don't believe that this is related to rtl optimizations, but rather to inlining phase. To prove it I did small changes in t.c for remove.c (it now has type void): void remove

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-05-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #3 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 30213 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30213&action=edit modified testcase This is modified testcase which does not have a problem

[Bug rtl-optimization/57468] New: [4.9 Regression] 26% performance drop on important benchmark after r199298.

2013-05-30 Thread ysrumyan at gmail dot com
: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We found significant performance drop after changes in lra phase, which can be demonstrated on example from http://gcc.gnu.org/bugzilla

[Bug rtl-optimization/57468] [4.9 Regression] 26% performance drop on important benchmark after r199298.

2013-05-30 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57468 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 30224 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30224&action=edit test-case to reproduce It should be compiled on x86 with -O2 -m32 options.

[Bug tree-optimization/57558] New: Issue with number of iterations calculation

2013-06-07 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com For the following simple test-case extracted from 254.gap (spec2000): typedef unsigned long ul; void foo (ul* __restrict x, ul* __restrict y, ul n) { ul i; for (i=1; i<=n; i++, x++, y

[Bug rtl-optimization/57430] Redundant move instruction is produced after function inlining

2013-06-11 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57430 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com --- I have not seen any comments about my latest note. Do you have any ideas about this issue? Thanks in advance.

[Bug lto/57602] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-06-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug rtl-optimization/56129] Seg fault on 256.bzip2 from spec2000 with -lto and pre-reload scheduler for x86 Atom

2013-07-01 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56129 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED

[Bug target/57796] AVX2 gather vectorization: code bloat and reduction of performance

2013-07-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug target/57796] AVX2 gather vectorization: code bloat and reduction of performance

2013-07-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796 --- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com --- (In reply to Jakub Jelinek from comment #3) By tuning I've meant the vectorizer cost model. If the desirability of gathers vs. no vectorization at all doesn't depend only

[Bug target/57954] AVX missing vxorps (zeroing) before vcvtsi2s %edx, slow down AVX code

2013-07-29 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954 Yuri Rumyantsev ysrumyan at gmail dot com changed: What|Removed |Added CC||ysrumyan

[Bug lto/57602] [4.9 Regression] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-08-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 --- Comment #12 from Yuri Rumyantsev ysrumyan at gmail dot com --- Jan, I tried to test your fix and got the following error message while building trunk compiler (with your fix): ../../../../../trunk/libstdc++-v3/src/c++11/fstream-inst.cc:48:1

[Bug lto/57602] [4.9 Regression] Runfails for several C/C++ benchmarks from spec2000 for i686 with -flto after r199422

2013-08-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57602 --- Comment #14 from Yuri Rumyantsev ysrumyan at gmail dot com --- Hi Jan, I checked that all benchmarks from spec2000 run successfully with -flto options, and the eembc_2_0 suite also ran successfully with LTO (for 32-bit mode). So go ahead

[Bug tree-optimization/58135] New: [x86] Missed opportunities for partial SLP

2013-08-12 Thread ysrumyan at gmail dot com
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com If we consider the following simple test-case int a[100]; void foo() { a[0] = a[1] = a[2] = a[3] = 0; } SLP vectorization of basic block takes place: gcc -S -O3 -m32 t.c -ftree

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #8 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 30751 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30751&action=edit modified test-case Modified test-case to reproduce sub-optimal register allocation.

[Bug rtl-optimization/55342] [4.8/4.9 Regression] [LRA,x86] Non-optimal code for simple loop with LRA

2013-09-05 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55342 --- Comment #9 from Yuri Rumyantsev ysrumyan at gmail dot com --- The issue still exists in the 4.9 compiler, but we got another 30% degradation after the r202165 fix. It can be reproduced with the modified test-case, which is attached, with any 4.9 compiler

[Bug rtl-optimization/58384] New: [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-09-10 Thread ysrumyan at gmail dot com
Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com We also assume that ARM can have the same problem on the given benchmark if -flto

[Bug rtl-optimization/58384] [4.9 regression] Runfail on spec2000/253.perlbmk if lto and pre-reload scheduler is used on x86 after r200133.

2013-09-10 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58384 --- Comment #1 from Yuri Rumyantsev ysrumyan at gmail dot com --- Created attachment 30791 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30791&action=edit test-case to reproduce This is a compile-only test which must be compiled with pre-reload
