On Fri, 17 Nov 2017, Janne Blomqvist wrote:

> On Fri, Nov 17, 2017 at 3:03 PM, Richard Biener <rguent...@suse.de> wrote:
> > On Fri, 17 Nov 2017, Janne Blomqvist wrote:
> >
> >> On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener <rguent...@suse.de> wrote:
> >> > This patch changes the Fortran frontend to annotate DO CONCURRENT
> >> > with parallel instead of ivdep.
> >> >
> >> > The patch is not enough to enable a runtime benefit because of
> >> > some autopar costing issues but for other cases it should enable
> >> > auto-parallelization of all DO CONCURRENT loops.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for the fortran
> >> > part?
> >>
> >> I recall some years ago there was discussion whether DO CONCURRENT
> >> should be handled as "Ok to vectorize" or "Ok to parallelize using
> >> threads", and I believe back then it was decided to play it safe and
> >> just vectorize. Has this consensus changed now? And with this change,
> >> are we now using threads, or both vectors and threads? And is it
> >> enabled always, or only with -fopenmp, or some
> >> -ftree-loop-parallel-whatever? And if the second, how does it interact
> >> with openmp parallelization?
> >
> > It is only parallelized with -ftree-parallelize-loops; OpenMP processing
> > comes first, so I think it doesn't interfere here.  The loops are still
> > marked with ivdep as well and thus should enable vectorization.
> 
> Ok, sounds good then. Ok for my part.
> 
> For OpenMP, the thing that came to mind was that if you had an outer
> loop parallelized with "omp parallel do", then an inner DO CONCURRENT
> loop, hopefully you don't then get N*N threads? That is, does
> -ftree-parallelize-loops use the thread pool that libgomp creates, or
> does it create its own?

It uses the same thread pool.

> > The
> > usual caveats may apply when trying to vectorize outlined loops
> > (loops that have been parallelized).
> >
> > Was the "play safe" for correctness concerns or for optimization
> > concerns?
> 
> IIRC it was an optimization concern, if users assume DO CONCURRENT
> means "vectorize" then they may use it for short loops where the
> overhead of threads isn't worth it. Back then I recall it wasn't clear
> what other compilers were going to do with DO CONCURRENT, thus merely
> vectorizing was a safer option at the time. It seems that nowadays at
> least Intel attempts to use threads for DO CONCURRENT only with
> -parallel or /Qparallel, which more or less matches what you're proposing
> here. Which I guess is good for users in a performance portability
> sense.

Sure.  -ftree-parallelize-loops isn't enabled by default.  The patch
merely adds more precise dependence analysis hints for the middle-end.

Richard.
