Looks good. Thanks.
-Mei
From: Mathew, Pallavi
Sent: Friday, May 13, 2011 10:43 AM
To: Ye, Mei; open64-devel@lists.sourceforge.net
Subject: RE: Code review request for vectorizer patch
Hi Mei,
For comment-1: we cannot factor the two cases into one. If you examine the code
more closely, you will find the const part for sub is -1 instead of 1.
For comment-2: added a safety check at the beginning of 'Simd_Simplify_LB_UB'
to make sure the index variable in the remainder loop is not live at exit,
thereby making sure that such index variables are not renamed. This makes the
code safer in general. In the current implementation, however, the condition
will never be true: the loop being vectorized is "standardized" at a very early
stage and the remainder loop is simply copied from the standardized original
loop, so its index variable will not be live at exit.
Other changes compared to previous patch include fixes to typos and an
assertion check added to lnoutils.cxx.
Thanks.
Pallavi
From: Ye, Mei
Sent: Tuesday, May 10, 2011 1:38 PM
To: Mathew, Pallavi; open64-devel@lists.sourceforge.net
Subject: RE: Code review request for vectorizer patch
Two comments:
- L131~139 can be merged together.
- In the case that the loop index variable is live at exit and
"Upper_Bound_Standardize" returns TRUE, do we still modify the loop?
-Mei
From: Mathew, Pallavi [mailto:pallavi.mat...@amd.com]
Sent: Monday, May 09, 2011 10:55 AM
To: open64-devel@lists.sourceforge.net
Subject: [Open64-devel] Code review request for vectorizer patch
Hi,
Can a gatekeeper please review the attached vectorizer patch?
This change is a kludge in an attempt to reduce the overhead of
auto-vectorization.
The overhead we are trying to reduce is the cost of computing the lower bound
(LB) and upper bound (UB) of the vectorized loop and the remainder loop.
Suppose the original loop is:
for (i = 0; i < n; i++) and the unrolling factor (i.e. vector length) is f.
After SIMD, the vectorized loop and the remainder loop were the following:
vectorized loop: for (i = 0; i <= (n/f)*f - 1; i+=f) {}
remainder loop : for (i = (n/f)*f; i < n; i++) {}
The cost of computing LB/UB is high if:
- the loop being vectorized is in a deep nest, and
- the trip-count is small and it is not known at compile time
With this change, the vectorized loop and remainder loop are now:
vectorized loop: for (i = 0; i <= n-f; i+=f) {}
remainder loop : for (i' = i; i' < n; i'++) {}
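To illustrate why the two forms cover the same iterations, here is a minimal C sketch, assuming n >= 0 and f >= 1. The function names, the MAX_N limit and the inner k loop (which stands in for one vector operation) are hypothetical and are not part of the patch or of Open64.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 64  /* illustrative limit for the bookkeeping arrays */

/* Old scheme: both loops depend on (n/f)*f, recomputed on every entry. */
static void old_scheme(int n, int f, int *visits)
{
    int split = (n / f) * f;              /* divide + multiply */
    int i;
    for (i = 0; i <= split - 1; i += f)   /* vectorized loop */
        for (int k = 0; k < f; k++)       /* stands in for one SIMD op */
            visits[i + k]++;
    for (i = split; i < n; i++)           /* remainder loop */
        visits[i]++;
}

/* New scheme: UB is n-f; the remainder starts wherever i stopped. */
static void new_scheme(int n, int f, int *visits)
{
    int i;
    for (i = 0; i <= n - f; i += f)       /* vectorized loop */
        for (int k = 0; k < f; k++)
            visits[i + k]++;
    for (int r = i; r < n; r++)           /* remainder loop; r plays i' */
        visits[r]++;
}

/* Returns 1 if both schemes visit each of 0..n-1 exactly once
   and touch nothing at or beyond n. */
int schemes_agree(int n, int f)
{
    int a[MAX_N], b[MAX_N];
    assert(n >= 0 && n <= MAX_N && f >= 1);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    old_scheme(n, f, a);
    new_scheme(n, f, b);
    for (int j = 0; j < n; j++)
        if (a[j] != 1 || b[j] != 1)
            return 0;
    for (int j = n; j < MAX_N; j++)
        if (a[j] != 0 || b[j] != 0)
            return 0;
    return 1;
}
```

Calling schemes_agree(10, 4) or schemes_agree(3, 4) exercises the cases with and without a full vector chunk. The new form needs only a subtraction on loop entry, whereas the old form needs a divide and a multiply.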
The change to lnopt_hoistif.cxx is to prevent the LB of the remainder loop
from being changed back to what it was. The "hoist if" phase blindly changes
the code so that the index variable does not live out of the loop. The change
to this file prevents loops from being modified if they are not amenable to
the hoist-if optimization.
Also, this change keeps the index variable private to the loop being vectorized
if the loop is flagged Do_Loop_Is_MP().
Thanks.
Pallavi
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel