Hi Cesar!

At least several of the issues that I pointed out (see below) were
never fixed on gomp-4_0-branch, and the test cases have now been
merged from gomp-4_0-branch into trunk.  So the regression (PASS -> FAIL
for libgomp.oacc-c-c++-common/reduction-2.c) as well as the other
"oddities" now need to be fixed in trunk.  I re-assigned
<https://gcc.gnu.org/PR68242> from Nathan to Cesar.  (I didn't verify
that the following list of items is complete.)

On Fri, 18 Sep 2015 15:37:58 +0200, I wrote:
> Hi Cesar!
> 
> On Fri, 17 Jul 2015 11:13:59 -0700, Cesar Philippidis 
> <ce...@codesourcery.com> wrote:
> > This patch updates the libgomp OpenACC reduction test cases to check
> > worker, vector and combined gang worker vector reductions. I tried to
> > use some macros to simplify the C test cases a bit. I probably could
> > have made them more generic with an additional header file/macro, but
> > then that makes it too confusing to debug. The Fortran tests are a bit
> > of a lost cause, unless someone knows how to use the preprocessor with
> > !$acc loops.
> 
> > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c
> 
> > +static void
> > +test_reductions (void)
> >  {
> 
> > -  [...]
> > +  const int n = 100;
> >    int i;
> > -  [...]
> > +  float array[n];
> >  
> >    for (i = 0; i < n; i++)
> > -    [...]
> > +    array[i] = i+1;
> >  
> > -  [...]
> > +  /* Gang reductions.  */
> > +  check_reduction_op (float, +, 0, array[i], num_gangs (ng), gang);
> > +  check_reduction_op (float, *, 1, array[i], num_gangs (ng), gang);
> 
> I see this one reproducibly FAIL in the x86_64 -m32 multilib's
> host-fallback testing (there is no nvptx offloading for 32-bit
> configurations).  (The -m32 multilib is configured/enabled by default, so
> fixing this is a prerequisite for trunk integration.)  From a very quick
> glance, might it be that we're overflowing the float data type with the
> "1 * 2 * 3 * [...] * 100" computation?  The OpenACC reduction computes
> "inf" which is then compared against a very high finite reference value
> -- or the other way round (I lost my debugging session).  Instead of
> multiplying these "big" numbers, I guess we should just do a more
> idiomatic floating point computation?
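
To illustrate the suspected overflow, here is a minimal sketch.  The
helper names are hypothetical and not part of the test suite; it only
mirrors the '*' reduction over array[i] = i+1 in single precision, and
assumes IEEE single-precision float.

```c
#include <math.h>

/* Hypothetical helper, not part of the test suite: multiply
   1 * 2 * ... * n in single precision, mirroring the '*' reduction
   over array[i] = i + 1.  The volatile qualifier forces every
   intermediate result to be rounded to float, even on targets that
   evaluate with excess precision.  */
static float
float_product_1_to_n (int n)
{
  volatile float res = 1.0f;
  int i;

  for (i = 0; i < n; i++)
    res = res * (float) (i + 1);
  return res;
}

/* Returns nonzero if the single-precision product overflowed to
   infinity.  */
static int
float_product_overflows (int n)
{
  return isinf (float_product_1_to_n (n)) != 0;
}
```

With n = 100 as in the patch, float_product_overflows (100) is nonzero:
34! (about 2.95e38) is the last factorial below FLT_MAX (about 3.4e38),
so anything from n = 35 on ends up as "inf".
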
> 
> > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c
> 
> >  /* complex reductions.  */
> 
> > +static void
> > +test_reductions (void)
> >  {
> 
> > +  double _Complex array[n];
> > +
> > +  for (i = 0; i < n; i++)
> > +    array[i] = i+1;
> > +
> > +  /* Gang reductions.  */
> > +  check_reduction_op (double, +, 0, creal (array[i]), num_gangs (ng), gang);
> 
> Given that in the check_reduction_op instantiations you're specifying a
> "double" data type (instead of "double _Complex", for example), and
> "creal (array[i])" reduction operands (instead of "array[i]", for
> example), we're not actually testing reductions with complex data types,
> so I guess that should be changed.  :-)
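
A rough sketch of what a reduction on the complex type itself could
look like.  The function name, N, and the gang count are made up for
illustration, and this is not meant as the final shape of the test; it
just reduces over "double _Complex" with "array[i]" operands instead of
"double" and "creal (array[i])".

```c
#include <complex.h>

#define N 10  /* Hypothetical size, small enough that all sums are exact.  */

/* Hypothetical sketch: reduce over double _Complex itself rather than
   over creal (array[i]).  Returns nonzero if the reduction result
   matches the sequential reference value.  */
static int
complex_sum_reduction_ok (void)
{
  double _Complex array[N];
  double _Complex res = 0, vres = 0;
  int i;

  /* Give every element a nonzero imaginary part, so that dropping it
     would change the result.  */
  for (i = 0; i < N; i++)
    array[i] = (i + 1) + (i + 1) * I;

  /* Without -fopenacc these pragmas are ignored and the loop simply
     runs sequentially on the host.  */
#pragma acc parallel num_gangs (8) copy (res)
#pragma acc loop gang reduction (+:res)
  for (i = 0; i < N; i++)
    res = res + array[i];

  /* Sequential reference value.  */
  for (i = 0; i < N; i++)
    vres = vres + array[i];

  /* Exact comparison is fine here only because every partial sum is a
     small integer; general floating-point data needs a tolerance.  */
  return res == vres;
}
```
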
> 
> > --- /dev/null
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h
> > @@ -0,0 +1,43 @@
> > +#ifndef REDUCTION_H
> > +#define REDUCTION_H
> > +
> > +#define DO_PRAGMA(x) _Pragma (#x)
> > +
> > +#define check_reduction_op(type, op, init, b, gwv_par, gwv_loop)   \
> > +  {                                                                \
> > +    type res, vres;                                                \
> > +    res = (init);                                                  \
> > +DO_PRAGMA (acc parallel gwv_par copy (res))                        \
> > +DO_PRAGMA (acc loop gwv_loop reduction (op:res))                   \
> > +    for (i = 0; i < n; i++)                                        \
> > +      res = res op (b);                                            \
> > +                                                                   \
> > +    vres = (init);                                                 \
> > +    for (i = 0; i < n; i++)                                        \
> > +      vres = vres op (b);                                          \
> > +                                                                   \
> > +    if (res != vres)                                               \
> > +      abort ();                                                    \
> > +  }
> 
> That's the right check for integer data types, but for anything floating
> point, we should allow for some small difference (epsilon) between res
> and vres: the (possibly offloaded) OpenACC reduction may combine its
> partial results in a different order than the sequential reference
> computation, so the two values can round differently.
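
One possible shape for such a check, as a sketch only: approx_equal and
its eps parameter are hypothetical names, and a relative tolerance is
just one of several reasonable choices.

```c
#include <math.h>

/* Hypothetical helper: nonzero if A and B agree to within the relative
   tolerance EPS, scaled by the larger of the two magnitudes.  */
static int
approx_equal (double a, double b, double eps)
{
  double fa, fb, scale;

  /* Infinities and NaNs must not be hidden by the tolerance: "inf"
     compared against a large finite value is a mismatch.  */
  if (!isfinite (a) || !isfinite (b))
    return a == b;

  fa = fabs (a);
  fb = fabs (b);
  scale = fa > fb ? fa : fb;

  return fabs (a - b) <= eps * scale;
}
```

The macros could then use "if (!approx_equal (res, vres, eps)) abort ();"
for floating-point types and keep "res != vres" for integer types.  Note
that a purely relative tolerance is strict for values very close to zero;
an additional absolute tolerance may be needed there.
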
> 
> > +#define check_reduction_macro(type, op, init, b, gwv_par, gwv_loop) \
> > +  {                                                                 \
> > +    type res, vres;                                                 \
> > +    res = (init);                                                   \
> > +    DO_PRAGMA (acc parallel gwv_par copy(res))                      \
> > +DO_PRAGMA (acc loop gwv_loop reduction (op:res))                    \
> > +    for (i = 0; i < n; i++)                                         \
> > +      res = op (res, (b));                                          \
> > +                                                                    \
> > +    vres = (init);                                                  \
> > +    for (i = 0; i < n; i++)                                         \
> > +      vres = op (vres, (b));                                        \
> > +                                                                    \
> > +    if (res != vres)                                                \
> > +      abort ();                                                     \
> > +  }
> 
> Likewise.
> 
> > +#define max(a, b) (((a) > (b)) ? (a) : (b))
> > +#define min(a, b) (((a) < (b)) ? (a) : (b))
> > +
> > +#endif
> 
> > --- a/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90
> > +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90
> > @@ -5,50 +5,108 @@
> >  program reduction_4
> >    implicit none
> >  
> > -  integer, parameter    :: n = 10, gangs = 20
> > +  integer, parameter    :: n = 10, ng = 8, nw = 4, vl = 32
> >    integer               :: i
> > -  complex               :: vresult, result
> > +  real                  :: vresult, rg, rw, rv, rc
> >    complex, dimension (n) :: array
> 
> Same problem as in the C test case: not actually testing complex data
> types:
> 
> >    do i = 1, n
> >       array(i) = i
> >    end do
> >  
> > -[...]
> > +  !
> > +  ! '+' reductions
> > +  !
> > +
> > +  rg = 0
> > +  rw = 0
> > +  rv = 0
> > +  rc = 0
> >    vresult = 0
> >  
> > -[...]
> > +  !$acc parallel num_gangs(ng) copy(rg)
> > +  !$acc loop reduction(+:rg) gang
> > +  do i = 1, n
> > +     rg = rg + REAL(array(i))
> > +  end do
> > +  !$acc end parallel


Regards
 Thomas
