On 06/19/2015 03:45 PM, Jakub Jelinek wrote:
If the loop remains in the IL (isn't optimized away as unreachable or
isn't removed, e.g. as a non-loop - say if it contains a noreturn call),
the flags on struct loop should be still there. For the loop clauses
(reduction always, and
On Wed, Jun 24, 2015 at 03:11:04PM +0200, Bernd Schmidt wrote:
On 06/19/2015 03:45 PM, Jakub Jelinek wrote:
If the loop remains in the IL (isn't optimized away as unreachable or
isn't removed, e.g. as a non-loop - say if it contains a noreturn call),
the flags on struct loop should be still
On 06/22/15 11:18, Bernd Schmidt wrote:
You can have a hint that it is desirable, but not a hint that it is correct
(because passes in between may invalidate that). The OpenACC directives
guarantee to the compiler that the program can be transformed into a parallel
form. If we lose them early
On 06/22/15 12:20, Jakub Jelinek wrote:
An OpenMP worksharing loop is just coordination between the threads in the
team (which thread takes which subset of the loop's iterations),
optionally followed by a barrier. An OpenMP simd loop is a loop that has
certain properties guaranteed by the user
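
To illustrate the worksharing description in the snippet above (which thread takes which subset of the loop's iterations), here is a minimal C sketch. It assumes a simple static split; the helper name static_chunk and the way the remainder is handed to the lowest-numbered threads are illustrative assumptions, not a description of what libgomp actually does.

```c
#include <assert.h>

/* Hypothetical helper (not a libgomp function): compute the half-open
   iteration range [*lo, *hi) that thread `tid` out of `nthreads` would
   execute when `n` iterations are split statically.  The first
   (n % nthreads) threads each take one extra iteration. */
static void static_chunk(long n, int nthreads, int tid, long *lo, long *hi)
{
    assert(n >= 0 && nthreads > 0 && tid >= 0 && tid < nthreads);

    long base = n / nthreads;   /* iterations every thread gets */
    long rem  = n % nthreads;   /* leftover iterations to distribute */

    /* Threads before `tid` contributed `base` each, plus one extra for
       each of them that is among the first `rem` threads. */
    *lo = (long)tid * base + (tid < rem ? tid : rem);
    *hi = *lo + base + (tid < rem ? 1 : 0);
}
```

Each thread would call static_chunk with its own tid and run only the iterations in its range; a barrier afterwards is optional, as the snippet says.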
On Mon, 22 Jun 2015 16:24:56 +0200
Jakub Jelinek ja...@redhat.com wrote:
On Mon, Jun 22, 2015 at 02:55:49PM +0100, Julian Brown wrote:
One problem is that (at least on the GPU hardware we've considered
so far) we're somewhat constrained in how much control we have over
how the underlying
On 06/22/2015 04:24 PM, Jakub Jelinek wrote:
I don't understand why lowering the way you suggest helps here at all.
In the proposed scheme, you essentially have whole function
in e.g. worker-single or vector-single mode, which you need to be able to
handle properly in any case, because users can
On Mon, Jun 22, 2015 at 12:08:36PM -0400, Nathan Sidwell wrote:
On 06/22/15 11:18, Bernd Schmidt wrote:
You can have a hint that it is desirable, but not a hint that it is correct
(because passes in between may invalidate that). The OpenACC directives
guarantee to the compiler that the
On Mon, 22 Jun 2015 16:24:56 +0200
Jakub Jelinek ja...@redhat.com wrote:
On Mon, Jun 22, 2015 at 02:55:49PM +0100, Julian Brown wrote:
One problem is that (at least on the GPU hardware we've considered
so far) we're somewhat constrained in how much control we have over
how the underlying
On Mon, Jun 22, 2015 at 06:48:10PM +0100, Julian Brown wrote:
In vector-single or worker-single mode, divergence of threads within a
warp or a CTA is controlled by broadcasting the controlling expression
of conditional branches to the set of inactive threads, so each of
those follows along
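
The broadcasting idea in the snippet above can be sketched as a small host-side C simulation. This is only a model under stated assumptions: the warp width of 32, the array-of-lanes representation, and the function names lanes_taking_branch and warp_is_convergent are all hypothetical. On real hardware the broadcast would be done with a warp-level primitive (for example CUDA's __shfl_sync), not a loop.

```c
#include <assert.h>
#include <stdbool.h>

#define WARP_SIZE 32  /* assumed warp width for this simulation */

/* Each lane holds its own copy of a branch condition.  In vector-single
   mode only lane 0 is really executing, so only lane 0's copy is
   meaningful.  With `broadcast` set, every lane uses lane 0's value;
   without it, every lane uses its own (possibly stale) value.
   Returns how many lanes take the branch. */
static int lanes_taking_branch(const bool lane_cond[WARP_SIZE], bool broadcast)
{
    assert(lane_cond != 0);

    int taken = 0;
    for (int lane = 0; lane < WARP_SIZE; lane++) {
        bool c = broadcast ? lane_cond[0] : lane_cond[lane];
        if (c)
            taken++;
    }
    return taken;
}

/* The warp has not diverged if all lanes agree on the branch. */
static bool warp_is_convergent(int taken)
{
    return taken == 0 || taken == WARP_SIZE;
}
```

With lane 0's condition true and all others false, the non-broadcast case splits the warp (1 lane taken), while the broadcast case keeps all 32 lanes on the same path, which is the "follows along" behaviour the snippet describes.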
On 06/19/2015 03:45 PM, Jakub Jelinek wrote:
I actually believe having some optimization passes in between the ompexp
and the lowering of the IR into the form PTX wants is highly desirable,
the form with the worker-single or vector-single mode lowered will contain
too complex a CFG for many
On Mon, Jun 22, 2015 at 03:59:57PM +0200, Bernd Schmidt wrote:
On 06/19/2015 03:45 PM, Jakub Jelinek wrote:
I actually believe having some optimization passes in between the ompexp
and the lowering of the IR into the form PTX wants is highly desirable,
the form with the worker-single or
On Mon, Jun 22, 2015 at 02:55:49PM +0100, Julian Brown wrote:
One problem is that (at least on the GPU hardware we've considered so
far) we're somewhat constrained in how much control we have over how the
underlying hardware executes code: it's possible to draw up a scheme
where OpenACC
On Fri, 19 Jun 2015 14:25:57 +0200
Jakub Jelinek ja...@redhat.com wrote:
On Fri, Jun 19, 2015 at 11:53:14AM +0200, Bernd Schmidt wrote:
On 05/28/2015 05:08 PM, Jakub Jelinek wrote:
I understand it is more work, I'd just like to ask that when
designing stuff for the OpenACC offloading
On 06/19/2015 02:25 PM, Jakub Jelinek wrote:
Emitting PTX specific code from current ompexp is highly undesirable of
course, but I must say I'm not a big fan of keeping the GOMP_* gimple trees
around for too long either; they were never meant to be used in low gimple,
and even all the early
On Fri, Jun 19, 2015 at 03:03:38PM +0200, Bernd Schmidt wrote:
they are also very much OpenMP or OpenACC specific, rather than representing
language neutral behavior, so there is a problem that you'd need M x N
different expansions of those constructs, which is not really maintainable
(M being
On Fri, Jun 19, 2015 at 11:53:14AM +0200, Bernd Schmidt wrote:
On 05/28/2015 05:08 PM, Jakub Jelinek wrote:
I understand it is more work, I'd just like to ask that when designing stuff
for the OpenACC offloading you (plural) try to take the other offloading
devices and host fallback into
On 05/28/2015 05:08 PM, Jakub Jelinek wrote:
I understand it is more work, I'd just like to ask that when designing stuff
for the OpenACC offloading you (plural) try to take the other offloading
devices and host fallback into account.
The problem is that many of the transformations we need to
On Thu, 28 May 2015 16:37:04 +0200
Richard Biener richard.guent...@gmail.com wrote:
On Thu, May 28, 2015 at 4:06 PM, Julian Brown
jul...@codesourcery.com wrote:
For NVPTX, it is vitally important that the divergence of threads
within a warp can be controlled: in particular we must be able
On Thu, May 28, 2015 at 4:06 PM, Julian Brown jul...@codesourcery.com wrote:
For NVPTX, it is vitally important that the divergence of threads
within a warp can be controlled: in particular we must be able to
generate code that we know reconverges at a particular point.
Unfortunately GCC's
For NVPTX, it is vitally important that the divergence of threads
within a warp can be controlled: in particular we must be able to
generate code that we know reconverges at a particular point.
Unfortunately GCC's middle-end optimisers can cause this property to
be violated, which causes problems
On Thu, May 28, 2015 at 03:06:35PM +0100, Julian Brown wrote:
For NVPTX, it is vitally important that the divergence of threads
within a warp can be controlled: in particular we must be able to
generate code that we know reconverges at a particular point.
Unfortunately GCC's middle-end
Hi!
On Thu, 28 May 2015 16:20:11 +0200, Jakub Jelinek ja...@redhat.com wrote:
On Thu, May 28, 2015 at 03:06:35PM +0100, Julian Brown wrote:
[...]
I think the lowering of this already at ompexp time is premature
Yes, we're aware of this wart. :-|
I think much better would be to have a
On Thu, May 28, 2015 at 04:49:43PM +0200, Thomas Schwinge wrote:
I think much better would be to have a function attribute (or cgraph
flag) that would be set for functions you want to compile this way
(plus a targetm flag that the targets want to support it that way),
plus a flag in loop