Hi Mei,

For comment-1: we cannot factor the two cases into one. If you examine the code more closely, you will find that the const part for sub is -1 instead of 1.

For comment-2: I added a safety check at the beginning of 'Simd_Simplify_LB_UB' to make sure the index variable in the remainder loop is not live at exit, thereby making sure that such index variables are not renamed. This makes the code safer in general, but in the current implementation the condition will never be true. The loop being vectorized is "standardized" at a very early stage and the remainder loop is simply copied from the standardized original loop, so its index variable will not be live at exit.
Other changes compared to the previous patch include typo fixes and an assertion check added to lnoutils.cxx.

Thanks.
Pallavi

From: Ye, Mei
Sent: Tuesday, May 10, 2011 1:38 PM
To: Mathew, Pallavi; open64-devel@lists.sourceforge.net
Subject: RE: Code review request for vectorizer patch

Two comments:
- L131~139 can be merged together.
- In the case that the loop index variable is live at exit and "Upper_Bound_Standardize" returns TRUE, do we still modify the loop?

-Mei

From: Mathew, Pallavi [mailto:pallavi.mat...@amd.com]
Sent: Monday, May 09, 2011 10:55 AM
To: open64-devel@lists.sourceforge.net
Subject: [Open64-devel] Code review request for vectorizer patch

Hi,

Can a gatekeeper please review the attached vectorizer patch?

This change is a kludge in an attempt to reduce the overhead of auto-vectorization. The overhead we are trying to reduce is the cost of computing the lower bound (LB) and upper bound (UB) of the vectorized loop and the remainder loop.

Suppose the original loop is:
    for (i = 0; i < n; i++)
and the unrolling factor (i.e. vector length) is f. After SIMD, the vectorized loop and the remainder loop were the following:
    vectorized loop: for (i = 0; i <= (n/f)*f - 1; i+=f) {}
    remainder loop : for (i = (n/f)*f; i < n; i++) {}

The cost of computing LB/UB is high if:
- the loop being vectorized is in a deep nest, and
- the trip count is small and not known at compile time

With this change, the vectorized loop and remainder loop are now:
    vectorized loop: for (i = 0; i <= n-f; i+=f) {}
    remainder loop : for (i'= i; i < n; i++) {}

The change to lnopt_hoistif.cxx is to prevent the LB of the remainder loop from being changed back to what it was. The "hoist if" phase blindly changes the code so that the index variable is not live out of the loop. The change to this file prevents loops from being modified if they are not amenable to hoist-if optimization. Also, this change keeps the index variable private to the loop being vectorized if the loop is flagged Do_Loop_Is_MP().

Thanks.
Pallavi
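
The bound change described in the original request can be sketched in plain C as follows. This is only an illustration, not code from the patch: the names `chunk_sum`, `sum_old`, and `sum_new` are hypothetical, the "vectorized" body is modeled as a scalar loop over f elements, and the remainder loop is read as starting from the value the index holds when the vectorized loop exits. It assumes n >= 0 and f > 0.

```c
#include <assert.h>

/* Hypothetical helper: stands in for one vectorized iteration that
   handles f consecutive elements starting at a[i]. */
static int chunk_sum(const int *a, int i, int f) {
    int s = 0;
    for (int k = 0; k < f; k++)
        s += a[i + k];
    return s;
}

/* Old scheme: both loops recompute (n/f)*f for their bounds. */
int sum_old(const int *a, int n, int f) {
    assert(n >= 0 && f > 0);
    int sum = 0;
    int i;
    for (i = 0; i <= (n / f) * f - 1; i += f)   /* vectorized loop */
        sum += chunk_sum(a, i, f);
    for (i = (n / f) * f; i < n; i++)           /* remainder loop */
        sum += a[i];
    return sum;
}

/* New scheme: the UB is just n - f, and the remainder loop starts
   from wherever the vectorized loop left its index. */
int sum_new(const int *a, int n, int f) {
    assert(n >= 0 && f > 0);
    int sum = 0;
    int i;
    for (i = 0; i <= n - f; i += f)             /* vectorized loop */
        sum += chunk_sum(a, i, f);
    for (int r = i; r < n; r++)                 /* remainder loop */
        sum += a[r];
    return sum;
}
```

Under these assumptions the vectorized loop in `sum_new` exits with i equal to (n/f)*f, so both versions visit the same elements while the new one avoids the division and multiplication in the bounds.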
vectorizer2.p
_______________________________________________ Open64-devel mailing list Open64-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/open64-devel