Hi Mei,
For comment-1: we cannot factor the two cases into one. If you examine the code 
more closely, you will find the const part for sub is -1 instead of 1.
For comment-2: I added a safety check at the beginning of 'Simd_Simplify_LB_UB' 
to make sure the index variable in the remainder loop is not live at exit, 
thereby making sure that such index variables are not renamed. This makes the 
code safer in general, but in the current implementation the condition will 
never be true: the loop being vectorized is "standardized" at a very early 
stage and the remainder loop is simply copied from the standardized original 
loop, so its index variable will not be live at exit.

Other changes compared to previous patch include fixes to typos and an 
assertion check added to lnoutils.cxx.

Thanks.
Pallavi

From: Ye, Mei
Sent: Tuesday, May 10, 2011 1:38 PM
To: Mathew, Pallavi; open64-devel@lists.sourceforge.net
Subject: RE: Code review request for vectorizer patch

Two comments:
- L131~139 can be merged together.
- In the case that loop index variable is live at exit and 
"Upper_Bound_Standardize" return TRUE, do we still modify the loop?

-Mei

From: Mathew, Pallavi [mailto:pallavi.mat...@amd.com]
Sent: Monday, May 09, 2011 10:55 AM
To: open64-devel@lists.sourceforge.net
Subject: [Open64-devel] Code review request for vectorizer patch

Hi,
Can a gatekeeper please review the attached vectorizer patch?

This change is a kludge that attempts to reduce the overhead of 
auto-vectorization.
The overhead we are trying to reduce is the cost of computing the lower bound 
(LB) and upper bound (UB) of the vectorized loop and the remainder loop.

Suppose the original loop is:
   for (i = 0; i < n; i++)
and the unrolling factor (i.e. vector length) is f.

After SIMD, the vectorized loop and the remainder loop were the following:
   vectorized loop: for (i = 0; i <= (n/f)*f - 1; i+=f) {}
   remainder loop : for (i = (n/f)*f; i < n; i++) {}

The cost of computing LB/UB is high if :
  - the loop being vectorized is in a deep nest, and
  - the trip-count is small and it is not known at compile time

With this change, the vectorized loop and remainder loop are now:
   vectorized loop: for (i = 0; i <= n-f; i+=f) {}
   remainder loop : for (i'= i; i < n; i++) {}
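
To illustrate the two bound schemes above, here is a minimal C sketch (not 
taken from the patch). The function names sum_old, sum_new, sum_plain and 
check_schemes_agree are hypothetical, the inner k-loop merely stands in for 
one vector operation of length f, and the sketch assumes n >= 0 and f >= 1.

```c
#include <assert.h>

/* Old scheme: both loops derive their bounds from (n/f)*f. */
static long sum_old(const int *a, int n, int f)
{
    long s = 0;
    int i;
    for (i = 0; i <= (n / f) * f - 1; i += f)   /* vectorized loop */
        for (int k = 0; k < f; k++)             /* stands in for one SIMD op */
            s += a[i + k];
    for (i = (n / f) * f; i < n; i++)           /* remainder loop */
        s += a[i];
    return s;
}

/* New scheme: UB is n-f, and the remainder loop starts from the value
 * of i left by the vectorized loop, so no division or multiplication
 * is needed to compute the bounds. */
static long sum_new(const int *a, int n, int f)
{
    long s = 0;
    int i;
    for (i = 0; i <= n - f; i += f)             /* vectorized loop */
        for (int k = 0; k < f; k++)
            s += a[i + k];
    for (; i < n; i++)                          /* remainder loop */
        s += a[i];
    return s;
}

/* Reference: the original scalar loop. */
static long sum_plain(const int *a, int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Checks that both schemes match the scalar loop for small n and f. */
static void check_schemes_agree(void)
{
    int a[32];
    for (int i = 0; i < 32; i++)
        a[i] = 3 * i + 1;
    for (int f = 1; f <= 8; f *= 2)
        for (int n = 0; n <= 32; n++) {
            assert(sum_old(a, n, f) == sum_plain(a, n));
            assert(sum_new(a, n, f) == sum_plain(a, n));
        }
}
```

Calling check_schemes_agree() exercises both schemes, including the n < f 
case where the vectorized loop runs zero times and the remainder loop covers 
the whole range.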

The change to lnopt_hoistif.cxx is to prevent the LB of the remainder loop
from being changed back to what it was. The "hoist if" phase blindly changes
the code so that the index variable is not live out of the loop. The change to
this file prevents loops from being modified if they are not amenable to the
hoist-if optimization.

Also, this change keeps the index variable private to the loop being vectorized
if the loop is flagged Do_Loop_Is_MP().

Thanks.
Pallavi

Attachment: vectorizer2.p
Description: vectorizer2.p
