Re: [PATCH] Fix PR53295

2012-05-12 Thread Toon Moene

On 05/11/2012 01:59 PM, Richard Guenther wrote:


This fixes the dependency of vectorization of strided loads on
gather support.  For that to work we need to lift the restriction
in data-ref analysis that requires a constant DR_STEP.  Fortunately
fallout is small.


Would this also vectorize strided loops when the architecture doesn't
have a gather instruction?


If so, it doesn't work for the attached case, which *does* vectorize 
with a gather instruction:


$ /tmp/c/bin/gfortran -g -O3 -ftree-vectorizer-verbose=2 -mavx2 -S verintlin.f


Analyzing loop at verintlin.f:68

Analyzing loop at verintlin.f:69


Vectorizing loop at verintlin.f:69

69: LOOP VECTORIZED.
verintlin.f:1: note: vectorized 1 loops in function.

whereas:

$ /tmp/c/bin/gfortran -g -O3 -ftree-vectorizer-verbose=2 -mavx -S verintlin.f


Analyzing loop at verintlin.f:68

Analyzing loop at verintlin.f:69

69: not vectorized: not suitable for gather load D.2051_74 = *parg_73(D)[D.2050_72];

69: not vectorized: not suitable for gather load D.2051_74 = *parg_73(D)[D.2050_72];


verintlin.f:1: note: vectorized 0 loops in function.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON  NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT  NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV  NUMBER OF VERTICAL LEVELS
C  KINT  TYPE OF INTERPOLATION
C        = 1 - LINEAR
C        = 2 - QUADRATIC
C        = 3 - CUBIC
C        = 4 - MIXED CUBIC/LINEAR
C  KLON1 FIRST GRIDPOINT IN X-DIRECTION
C  KLON2 LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1 FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2 LAST  GRIDPOINT IN Y-DIRECTION
C  KP    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR    ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG  ARRAY OF ARGUMENTS
C  PALFH ALFA HAT
C  PBETH BETA HAT
C  PALFA ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES  INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R     PRES(KLON,KLAT) ,
     R     PALFH(KLON,KLAT) ,  PBETH(KLON,KLAT)  ,
     R     PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4),
     R     PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C+
     +   + PGAMA(JX,JY,2)*(
C+
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      RETURN
      END


Re: [PATCH] Fix PR53295

2012-05-12 Thread Richard Guenther
On Sat, May 12, 2012 at 9:53 AM, Toon Moene t...@moene.org wrote:
 On 05/11/2012 01:59 PM, Richard Guenther wrote:

 This fixes the dependency of vectorization of strided loads on
 gather support.  For that to work we need to lift the restriction
 in data-ref analysis that requires a constant DR_STEP.  Fortunately
 fallout is small.


 Would this also vectorize strided loops when the architecture doesn't have a
 gather instruction?

Gather is different from strided loops.  Gather is a[b[i]], while strided loops
are for (i=0;; i+=stride) ... = a[i], with stride being non-constant.

Your testcase requires gather support.

Richard.
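
To make the distinction concrete, here is a minimal C sketch of the two
access patterns.  The function names gather_copy and strided_copy are made
up for illustration; they do not come from the patch or from the testcase.

```c
#include <stddef.h>

/* Gather: the index of each load comes from another array, a[b[i]]. */
void gather_copy(float *out, const float *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[b[i]];
}

/* Strided load: iterations read a[0], a[stride], a[2*stride], ...
   where stride is loop-invariant but only known at run time. */
void strided_copy(float *out, const float *a, size_t n, size_t stride)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i * stride];
}
```

The PARG(IDX-1,IDY-1,ILEV-1) accesses in the testcase are of the first kind,
because IDX, IDY and ILEV are themselves loaded from KP, KQ and KR.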



Re: [PATCH] Fix PR53295

2012-05-12 Thread Toon Moene

On 05/12/2012 12:36 PM, Richard Guenther wrote:


On Sat, May 12, 2012 at 9:53 AM, Toon Moene t...@moene.org wrote:



On 05/11/2012 01:59 PM, Richard Guenther wrote:


This fixes the dependency of vectorization of strided loads on
gather support.  For that to work we need to lift the restriction
in data-ref analysis that requires a constant DR_STEP.  Fortunately
fallout is small.



Would this also vectorize strided loops when the architecture doesn't have a
gather instruction?


gather is different from strided loops.  Gather is a[b[i]] while strided loops
are for (i=0;; i+=stride) ...= a[i] with stride being non-constant.

Your testcase requires gather support.


Yep, apparently I didn't read your explanation correctly.

On the other hand, I'm wondering if - in the absence of a gather 
*instruction* - one could do a gather-by-hand, i.e., load 8 32-bit 
floating point values in a (temporary) consecutive buffer, then load it 
into a vector register ...
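
A minimal sketch of such a gather-by-hand in plain C, assuming a vector
width of eight floats.  The function scaled_gather and the macro VLEN are
hypothetical names used only for illustration, and this is not what GCC
emits; it only shows the shape of the idea: scalar loads fill a consecutive
buffer, and everything after that touches contiguous memory.

```c
#include <stddef.h>

#define VLEN 8  /* eight 32-bit floats, i.e. one 256-bit vector */

/* Computes out[i] = w[i] * a[idx[i]], doing the indexed loads by hand. */
void scaled_gather(float *out, const float *w, const float *a,
                   const int *idx, size_t n)
{
    size_t i = 0;
    for (; i + VLEN <= n; i += VLEN) {
        float tmp[VLEN];
        /* Scalar loads fill a small consecutive buffer. */
        for (size_t k = 0; k < VLEN; k++)
            tmp[k] = a[idx[i + k]];
        /* Only contiguous accesses from here on, so a compiler may use
           plain vector loads for tmp, w and out. */
        for (size_t k = 0; k < VLEN; k++)
            out[i + k] = w[i + k] * tmp[k];
    }
    /* Scalar tail for the remaining n % VLEN elements. */
    for (; i < n; i++)
        out[i] = w[i] * a[idx[i]];
}
```

Whether this pays off would depend on how much arithmetic follows the
loads, since the indexed loads themselves stay scalar.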




Re: [PATCH] Fix PR53295

2012-05-12 Thread Richard Guenther
On Sat, May 12, 2012 at 1:39 PM, Toon Moene t...@moene.org wrote:
 On 05/12/2012 12:36 PM, Richard Guenther wrote:

 On Sat, May 12, 2012 at 9:53 AM, Toon Moene t...@moene.org wrote:


 On 05/11/2012 01:59 PM, Richard Guenther wrote:

 This fixes the dependency of vectorization of strided loads on
 gather support.  For that to work we need to lift the restriction
 in data-ref analysis that requires a constant DR_STEP.  Fortunately
 fallout is small.



 Would this also vectorize strided loops when the architecture doesn't have a
 gather instruction?


 gather is different from strided loops.  Gather is a[b[i]] while strided loops
 are for (i=0;; i+=stride) ...= a[i] with stride being non-constant.

 Your testcase requires gather support.


 Yep, apparently I didn't read your explanation correctly.

 On the other hand, I'm wondering if - in the absence of a gather
 *instruction* - one could do a gather-by-hand, i.e., load 8 32-bit floating
 point values in a (temporary) consecutive buffer, then load it into a vector
 register ...

Sure - gather and non-constant stride support are somewhat related.
We are also currently missing support for non-power-of-two constant
strides (which can simply use the non-constant stride path as well).

Richard.
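
As a rough illustration of how a non-power-of-two constant stride could go
through the same path as a run-time stride, here is a sketch in C.  The
function strided_sum and the macro VLEN are hypothetical.  The code assumes
that each group of lanes is assembled from individual element loads, which
is an assumption about the general approach and not a description of the
vectorizer's implementation.

```c
#include <stddef.h>

#define VLEN 4  /* lanes per group, chosen arbitrarily for the sketch */

/* Sums a[0] + a[stride] + ... + a[(n-1)*stride]. */
float strided_sum(const float *a, size_t n, size_t stride)
{
    float acc[VLEN] = {0};
    size_t i = 0;
    for (; i + VLEN <= n; i += VLEN) {
        float lanes[VLEN];
        /* Element-wise loads; nothing here depends on the stride being
           constant or a power of two. */
        for (size_t k = 0; k < VLEN; k++)
            lanes[k] = a[(i + k) * stride];
        /* Lane-wise accumulation on contiguous data. */
        for (size_t k = 0; k < VLEN; k++)
            acc[k] += lanes[k];
    }
    float sum = 0.0f;
    for (size_t k = 0; k < VLEN; k++)
        sum += acc[k];
    /* Scalar tail for the remaining n % VLEN elements. */
    for (; i < n; i++)
        sum += a[i * stride];
    return sum;
}
```

Calling strided_sum(a, 5, 3) reads a[0], a[3], a[6], a[9] and a[12]; the
constant 3 is handled no differently from a stride known only at run time.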

