https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43182
Andrew Pinski changed:
What    |Removed |Added
Severity|normal  |enhancement
--- Comment #7 from Andrew
--- Comment #4 from changpeng dot fang at amd dot com 2010-02-26 18:53 ---
Here is another similar case but more general. We know that a(j) and a(i)
never access the same memory location. Intel ifort can vectorize this
triangular loop:
do 10 j = 1,n
do 20 i = j+1, n
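The non-overlap claim can be illustrated with a minimal C sketch. The loop body is cut off in the excerpt above, so the update `a[i] += aj` here is a made-up placeholder, and the function name `triangular_update` is hypothetical too. The only point is that the inner index starts at j+1, so the inner loop never writes a[j] and the load of a[j] can be hoisted:

```c
#include <assert.h>
#include <stddef.h>

/* Triangular loop nest, 0-based. The body is a hypothetical stand-in:
   the original Fortran body is not shown in the bug comment. */
void triangular_update(double *a, size_t n)
{
    assert(a != NULL);
    for (size_t j = 0; j < n; j++) {
        /* Safe to load once: the inner loop only writes a[i] with i > j. */
        const double aj = a[j];
        for (size_t i = j + 1; i < n; i++)
            a[i] += aj;
    }
}
```

With the load hoisted by hand, the inner loop has no dependence on its own stores, which is what a vectorizer needs to establish.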
--- Comment #5 from pinskia at gcc dot gnu dot org 2010-02-26 18:55 ---
(In reply to comment #4)
> Here is another similar case but more general.
Actually it is a totally different case. Please file a new bug with that case;
though there might already be a bug about that one.
--- Comment #6 from changpeng dot fang at amd dot com 2010-02-26 19:06 ---
> Actually it is a totally different case. Please file a new bug with that case;
> though there might already be a bug about that one.
I could not see the difference even though j is not a compile-time
--- Comment #2 from pinskia at gcc dot gnu dot org 2010-02-25 23:50 ---
So currently inside LIM (loop invariant motion, which does load motion in general) we have:
D.2724_7 = a_6(D) + D.2723_5;
D.2725_8 = *a_6(D);
*D.2724_7 = D.2725_8;
But LIM/alias oracle does not know that D.2723_5 has a range of [4, n_3*4]
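For illustration only, here is a C sketch of a loop that could plausibly produce GIMPLE of this shape. The original test case is not shown in this excerpt, so the function names, the `int` element type and the inclusive bound are assumptions, read off the byte-offset range [4, n*4] for a 4-byte `int`. The second function does by hand the hoisting that LIM could do if it knew the offset is never 0:

```c
#include <assert.h>
#include <stddef.h>

/* As written: a[0] is reloaded every iteration, because without range
   information the store to a[i] might alias a[0].
   Assumes a has at least n + 1 elements. */
void fill_naive(int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 1; i <= n; i++)
        a[i] = a[0];
}

/* Hand-hoisted: valid because i >= 1, so the store never touches a[0]. */
void fill_hoisted(int *a, size_t n)
{
    assert(a != NULL);
    const int first = a[0];
    for (size_t i = 1; i <= n; i++)
        a[i] = first;
}
```

Both functions leave the array in the same state; the difference is only whether the load of `a[0]` stays inside the loop.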
--- Comment #3 from pinskia at gcc dot gnu dot org 2010-02-25 23:54 ---
Related to PR 29751, but that one only uses a simple method and does not
handle this case, since we need range info here.