On 18 April 2011 16:01, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> (apologies for top post)

No problem, it means I have to scroll less :)

> This all seems to scream 'disallow' to me, in particular since some OpenMP
> implementations may not support it etc.
>
> At any rate I feel 'parallel/parallel/prange/prange' is going too far; so
> the next step could be to only allow 'parallel/prange/parallel/prange'.
>
> But really, my feeling is that if you really do need this then you can
> always write a separate function for the inner loop (I honestly can't think
> of a use case anyway...). So I'd really drop it; at least until the rest of
> the gsoc project is completed :)

Ok, sure, I'll disallow it. Then the user won't be able to make mistakes and
I don't have to detect the case and issue a warning for inner reductions or
lastprivates :).

> DS
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>
> mark florisson <markflorisso...@gmail.com> wrote:
>>
>> On 16 April 2011 18:42, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no>
>> wrote:
>>> (Moving discussion from http://markflorisson.wordpress.com/, where Mark
>>> said:)
>>
>> Ok, sure, it was just an issue I was wondering about at that moment, but
>> it's a tricky issue, so thanks.
>>
>>> """
>>> Started a new branch
>>> https://github.com/markflorisson88/cython/tree/openmp .
>>>
>>> Now the question is whether sharing attributes should be propagated
>>> outwards. e.g. if you do
>>>
>>> for i in prange(m):
>>>     for j in prange(n):
>>>         sum += i * j
>>>
>>> then ‘sum’ is a reduction for the inner parallel loop, but not for the
>>> outer one. So the user would currently have to rewrite this to
>>>
>>> for i in prange(m):
>>>     for j in prange(n):
>>>         sum += i * j
>>>     sum += 0
>>>
>>> which seems a bit silly. Of course, we could just disable nested
>>> parallelism, or tell the users to use a prange and a ‘for from’ in such
>>> cases.
>>> """
>>>
>>> Dag: Interesting. The first one is definitely the behaviour we want, as
>>> long as it doesn't cause unintended consequences.
>>>
>>> I don't really think it will -- the important thing is that the order of
>>> loop iteration evaluation must be unimportant. And that is still true (for
>>> the outer loop, as well as for the inner) in your first example.
>>>
>>> Question: When you have nested pranges, what will happen is that two
>>> nested OpenMP parallel blocks are used, right? And do you know if there is
>>> complete freedom/"reentrancy" in that variables that are thread-private in
>>> an outer parallel block can be shared in an inner one, and vice versa?
>>
>> An implementation may or may not support it, and if it is supported the
>> behaviour can be configured through omp_set_nested(). So we should consider
>> the case where it is supported and enabled.
>>
>> If you have a lastprivate or reduction, then after the loop these are
>> (reduced and) assigned to the original variable. So if that happens inside
>> a parallel construct which does not declare the variable private to the
>> construct, you actually have a race. So e.g. the nested prange currently
>> races in the outer parallel range.
>>
>>> If so I'd think that this algorithm should work and feel natural:
>>>
>>> - In each prange, for the purposes of variable private/shared/reduction
>>>   inference, consider all internal "prange" just as if they had been
>>>   "range"; no special treatment.
>>>
>>> - Recurse to children pranges.
>>
>> Right, that is most natural. Algorithmically, reductions and lastprivates
>> (as those can have races if placed in inner parallel constructs) propagate
>> outwards towards the outermost parallel block, or up to the first parallel
>> with block, or up to the first construct that already determined the
>> sharing attribute.
>>
>> e.g.
>>
>> with parallel:
>>     with parallel:
>>         for i in prange(n):
>>             for j in prange(n):
>>                 sum += i * j
>>     # sum is well-defined here
>> # sum is undefined here
>>
>> Here 'sum' is a reduction for the two innermost loops.
>> 'sum' is not private for the inner parallel with block, as a prange in a
>> parallel with block is a worksharing loop that binds to that parallel with
>> block. However, the outermost parallel with block declares sum (and i and
>> j) private, so after that block all those variables become undefined.
>>
>> However, in the outermost parallel with block, sum will have to be
>> initialized to 0 before anything else, or be declared firstprivate,
>> otherwise 'sum' is undefined to begin with. Do you think declaring it
>> firstprivate would be the way to go, or should we make it private and issue
>> a warning or perhaps even an error?
>>
>>> DS

_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
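
The outward propagation rule discussed in the quoted text (reductions travel up through enclosing pranges until they reach a parallel with block or a construct that has already decided the variable's sharing attribute) can be modelled roughly as below. This is a minimal sketch in plain Python, not the code in the openmp branch: the `Construct` class and the `infer` function are hypothetical names made up for illustration. It only tracks in-place updates such as `sum += i * j`, and it does not model lastprivates or the privatization performed by an outermost parallel block.

```python
from dataclasses import dataclass, field


@dataclass
class Construct:
    """Toy stand-in for one parallel construct (hypothetical, not Cython's AST)."""
    kind: str                                      # "prange" or "parallel" (a `with parallel` block)
    reduced: set = field(default_factory=set)      # names updated in place directly in this body
    children: list = field(default_factory=list)   # nested constructs
    sharing: dict = field(default_factory=dict)    # inferred: name -> sharing attribute


def infer(node):
    """Infer reductions bottom-up and return the names that keep propagating outwards."""
    candidates = set(node.reduced)
    for child in node.children:
        candidates |= infer(child)

    if node.kind == "parallel":
        # Propagation stops at the first `with parallel` block.
        return set()

    propagated = set()
    for name in candidates:
        if name in node.sharing:
            # This construct already determined the attribute: stop here.
            continue
        node.sharing[name] = "reduction"
        propagated.add(name)
    return propagated


# The nested prange example from the thread: `sum += i * j` in the inner loop.
inner = Construct("prange", reduced={"sum"})
outer = Construct("prange", children=[inner])
infer(outer)
print(inner.sharing, outer.sharing)  # {'sum': 'reduction'} {'sum': 'reduction'}
```

In this model the extra `sum += 0` in the outer loop is no longer needed, because the inner reduction is propagated to the outer prange automatically.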