On Fri, 17 Nov 2017, Janne Blomqvist wrote:

> On Fri, Nov 17, 2017 at 3:03 PM, Richard Biener <rguent...@suse.de> wrote:
> > On Fri, 17 Nov 2017, Janne Blomqvist wrote:
> >
> >> On Fri, Nov 17, 2017 at 11:13 AM, Richard Biener <rguent...@suse.de> wrote:
> >> > This patch changes the Fortran frontend to annotate DO CONCURRENT
> >> > with parallel instead of ivdep.
> >> >
> >> > The patch is not enough to enable a runtime benefit because of
> >> > some autopar costing issues but for other cases it should enable
> >> > auto-parallelization of all DO CONCURRENT loops.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for the fortran
> >> > part?
> >>
> >> I recall some years ago there was discussion whether DO CONCURRENT
> >> should be handled as "Ok to vectorize" or "Ok to parallelize using
> >> threads", and I believe back then it was decided to play it safe and
> >> just vectorize. Has this consensus changed now? And with this change,
> >> are we now using threads, or both vectors and threads? And is it
> >> enabled always, or only with -fopenmp, or some
> >> -ftree-loop-parallel-whatever? And if the second, how does it interact
> >> with openmp parallelization?
> >
> > It is only parallelized with -ftree-parallelize-loops, OpenMP processing
> > comes first so I think it doesn't interfere here. The loops are still
> > marked with ivdep as well and thus should enable vectorization.
>
> Ok, sounds good then. Ok for my part.
>
> For OpenMP, the thing that came to mind was that if you had an outer
> loop parallelized with "omp parallel do", then an inner DO CONCURRENT
> loop, hopefully you don't then get N*N threads? That is, does
> -ftree-parallelize-loops use the thread pool that libgomp creates, or
> does it create its own?

It uses the same thread pool.

> > The usual caveats may apply when trying to vectorize outlined loops
> > (loops that have been parallelized).
> >
> > Was the "play safe" for correctness concerns or for optimization
> > concern?
>
> IIRC it was an optimization concern, if users assume DO CONCURRENT
> means "vectorize" then they may use it for short loops where the
> overhead of threads isn't worth it. Back then I recall it wasn't clear
> what other compilers were going to do with DO CONCURRENT, thus merely
> vectorizing was a safer option at the time. It seems that nowadays at
> least Intel attempts to use threads for DO CONCURRENT only with
> -parallel or /Qparallel, which more or less matches what you're proposing
> here. Which I guess is good for users in a performance portability
> sense.

Sure. -ftree-parallelize-loops isn't enabled by default. The patch
merely adds more precise dependence analysis hints for the middle-end.

Richard.
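
For readers following the thread, here is a minimal C sketch of the kind of
dependence hint being discussed. It is not taken from the patch: the function
name scale_add is hypothetical, and the sketch only shows the existing ivdep
hint via GCC's "#pragma GCC ivdep". It does not attempt to reproduce the new
"parallel" annotation that the Fortran frontend attaches to DO CONCURRENT.

```c
#include <stddef.h>

/* Hypothetical helper, not from the patch: computes y[i] += a * x[i].
   The pragma asserts that the loop has no loop-carried dependences,
   similar in spirit to the ivdep annotation mentioned in the thread. */
void scale_add(size_t n, float a, const float *x, float *y)
{
#pragma GCC ivdep
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Assuming the file is called scale_add.c, it could be compiled with something
like "gcc -O2 -ftree-parallelize-loops=4 -c scale_add.c". Whether the loop is
actually parallelized still depends on the autopar costing mentioned above, so
a short loop like this one may well stay serial.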